Conversation
Provides a list of all fields extractable from a compound variable, along with the Operations required to extract them. The immediate use case is for extracting fields from AcceptChannel compound variables when generating programs for the open_channel -> accept_channel flow. Eventually we will also use this for other compound variables.
ProgramBuilder is the toolkit for building Programs and is intended to be used by generators and mutators. It maintains: - The instruction list being generated (append-only, SSA). - A type-indexed variable registry. - The pick_variable method for selecting type-correct variables.
The generator creates IR sequences that do the following: 1. Build an open_channel message with mostly arbitrary fields, except that the correct chain_hash is pulled from the context. 2. Send the open_channel message to the target. 3. Receive any accept_channel response from the target.
We have access to expected input and output types, so we can ensure the types match up during Program construction. Currently we panic on the first invalid Instruction since generators are expected to *always* generate type-correct programs, and we want to detect generator bugs early on in development. Once we add support for mutators that rewrite existing programs, we will need to handle the situation where a corpus input deserializes to an invalid Program, causing us to panic on rewrite. The simplest solution is to implement and use Program::validate() to check deserialized corpus inputs, refusing to mutate inputs that do not validate.
devvaansh
left a comment
There was a problem hiding this comment.
took a deep look at the IR Builder and Operation logic, have a few suggestions regarding determinism for reproducibility and some potential hot-path optimizations for fuzzing throughput.
| /// A generator that emits instructions into a `ProgramBuilder`. | ||
| pub trait Generator { | ||
| /// Emits instructions for this generator's protocol interaction. | ||
| fn generate(&self, builder: &mut ProgramBuilder, rng: &mut impl Rng); |
There was a problem hiding this comment.
nit: rng is only used inside ProgramBuilder and #5 mentions this, which sounds to me like a generator shouldn't be in control of what a builder will select:
Each generator [...] delegates value selection and variable reuse to
ProgramBuilder.
Should the generator then be in control of the RNG? Have you considered passing it in ProgramBuilder::new instead?
There was a problem hiding this comment.
I think it makes sense to have rng in the generate interface. Currently generators don't use it directly, but in the future they probably will. For example, an interactive-tx generator may randomly choose what transaction inputs and outputs to construct, or randomly arrange the tx_add_* and tx_remove_* messages to send.
|
Also LGTM👍 |
harsh04044
left a comment
There was a problem hiding this comment.
The separation of generate_fresh() for keys vs pick_variable() for scalar parameters is a clean design. For the interactive-tx generator, serial_ids have a similar constraint -- parity is protocol-mandated (even for initiator, odd for non-initiator), so they'd need to be generated fresh with explicit parity enforcement rather than picked from the candidate pool.
Is the intent that generators are responsible for enforcing protocol constraints like this, or is there a planned mechanism in ProgramBuilder to support constrained variable generation?
I would lean towards managing those kinds of constraints in the generators. If we have a recurring need for the same functionality we could consider moving it to |
| /// Selects or creates a variable of the given type using probabilistic | ||
| /// variable selection (75% most recent, 15% any existing, 10% fresh). | ||
| #[allow(clippy::missing_panics_doc)] // candidates is always non-empty | ||
| pub fn pick_variable(&mut self, var_type: VariableType, rng: &mut impl Rng) -> usize { |
There was a problem hiding this comment.
Nit: I think a TODO comment explaining that we'd like to replace these hardcoded probabilities with an adaptive mutation scheduler (like MOpt) in the future would make it easy to circle back here in the future.
Other than this, LGTM.
Adds the functionality needed to generate IR programs that exercise the
open_channel -> accept_channelflow.Key pieces:
ProgramBuilderis the toolkit for building type-correct programs.Generatoris the trait defining the generator interface.OpenChannelGeneratorgenerates IR programs that build and sendsopen_channelmessages, then wait for theaccept_channelreply.Ref: #5 (Milestone 1)