Skip to content

smite-ir: implement OpenChannelGenerator#18

Open
morehouse wants to merge 5 commits intomasterfrom
generators
Open

smite-ir: implement OpenChannelGenerator#18
morehouse wants to merge 5 commits intomasterfrom
generators

Conversation

@morehouse
Copy link
Copy Markdown
Owner

Adds the functionality needed to generate IR programs that exercise the open_channel -> accept_channel flow.

Key pieces:

  • ProgramBuilder is the toolkit for building type-correct programs.
  • Generator is the trait defining the generator interface.
  • OpenChannelGenerator generates IR programs that build and sends open_channel messages, then wait for the accept_channel reply.

Ref: #5 (Milestone 1)

Provides a list of all fields extractable from a compound variable,
along with the Operations required to extract them.

The immediate use case is for extracting fields from AcceptChannel
compound variables when generating programs for the open_channel ->
accept_channel flow.  Eventually we will also use this for other
compound variables.
ProgramBuilder is the toolkit for building Programs and is intended to
be used by generators and mutators.  It maintains:

- The instruction list being generated (append-only, SSA).
- A type-indexed variable registry.
- The pick_variable method for selecting type-correct variables.
The generator creates IR sequences that do the following:

1. Build an open_channel message with mostly arbitrary fields, except
   that the correct chain_hash is pulled from the context.
2. Send the open_channel message to the target.
3. Receive any accept_channel response from the target.
We have access to expected input and output types, so we can ensure the
types match up during Program construction.

Currently we panic on the first invalid Instruction since generators are
expected to *always* generate type-correct programs, and we want to
detect generator bugs early on in development.

Once we add support for mutators that rewrite existing programs, we will
need to handle the situation where a corpus input deserializes to an
invalid Program, causing us to panic on rewrite.  The simplest solution
is to implement and use Program::validate() to check deserialized corpus
inputs, refusing to mutate inputs that do not validate.
Copy link
Copy Markdown
Contributor

@devvaansh devvaansh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

took a deep look at the IR Builder and Operation logic, have a few suggestions regarding determinism for reproducibility and some potential hot-path optimizations for fuzzing throughput.

Copy link
Copy Markdown
Contributor

@ekzyis ekzyis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, just one question

/// A generator that emits instructions into a `ProgramBuilder`.
pub trait Generator {
/// Emits instructions for this generator's protocol interaction.
fn generate(&self, builder: &mut ProgramBuilder, rng: &mut impl Rng);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: rng is only used inside ProgramBuilder and #5 mentions this, which sounds to me like a generator shouldn't be in control of what a builder will select:

Each generator [...] delegates value selection and variable reuse to ProgramBuilder.

Should the generator then be in control of the RNG? Have you considered passing it in ProgramBuilder::new instead?

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to have rng in the generate interface. Currently generators don't use it directly, but in the future they probably will. For example, an interactive-tx generator may randomly choose what transaction inputs and outputs to construct, or randomly arrange the tx_add_* and tx_remove_* messages to send.

Copy link
Copy Markdown
Contributor

@erickcestari erickcestari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@devvaansh
Copy link
Copy Markdown
Contributor

Also LGTM👍

Copy link
Copy Markdown

@harsh04044 harsh04044 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The separation of generate_fresh() for keys vs pick_variable() for scalar parameters is a clean design. For the interactive-tx generator, serial_ids have a similar constraint -- parity is protocol-mandated (even for initiator, odd for non-initiator), so they'd need to be generated fresh with explicit parity enforcement rather than picked from the candidate pool.

Is the intent that generators are responsible for enforcing protocol constraints like this, or is there a planned mechanism in ProgramBuilder to support constrained variable generation?

@morehouse
Copy link
Copy Markdown
Owner Author

The separation of generate_fresh() for keys vs pick_variable() for scalar parameters is a clean design. For the interactive-tx generator, serial_ids have a similar constraint -- parity is protocol-mandated (even for initiator, odd for non-initiator), so they'd need to be generated fresh with explicit parity enforcement rather than picked from the candidate pool.

Is the intent that generators are responsible for enforcing protocol constraints like this, or is there a planned mechanism in ProgramBuilder to support constrained variable generation?

I would lean towards managing those kinds of constraints in the generators. If we have a recurring need for the same functionality we could consider moving it to ProgramBuilder.

/// Selects or creates a variable of the given type using probabilistic
/// variable selection (75% most recent, 15% any existing, 10% fresh).
#[allow(clippy::missing_panics_doc)] // candidates is always non-empty
pub fn pick_variable(&mut self, var_type: VariableType, rng: &mut impl Rng) -> usize {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I think a TODO comment explaining that we'd like to replace these hardcoded probabilities with an adaptive mutation scheduler (like MOpt) in the future would make it easy to circle back here in the future.

Other than this, LGTM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants