Skip to content

smite-scenarios: implement IR executor#16

Open
morehouse wants to merge 3 commits intomasterfrom
executor
Open

smite-scenarios: implement IR executor#16
morehouse wants to merge 3 commits intomasterfrom
executor

Conversation

@morehouse
Copy link
Copy Markdown
Owner

@morehouse morehouse commented Mar 31, 2026

Adds the functionality needed to execute IR programs.

The executor walks a Program one Instruction at a time, executing the specified actions saving the output of each instruction in an SSA store for use by later instructions. If any type mismatches or invalid variable references are detected, errors are returned.

Ref: #5 (Milestone 1)

Copy link
Copy Markdown
Contributor

@NishantBansal2003 NishantBansal2003 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

One thing I want to discuss: do we really need to respond to the target node's ping? AFAIK, LN nodes have a >30s window before declaring a node offline, and since fuzzing runs for ~1s, I'm not sure whether the ping timeout state is restored with the snapshot. If it is, do we need to send a pong at all(given that we already responded during the init phase), or are we doing it just to be extra safe?

@morehouse
Copy link
Copy Markdown
Owner Author

One thing I want to discuss: do we really need to respond to the target node's ping? AFAIK, LN nodes have a >30s window before declaring a node offline, and since fuzzing runs for ~1s, I'm not sure whether the ping timeout state is restored with the snapshot. If it is, do we need to send a pong at all(given that we already responded during the init phase), or are we doing it just to be extra safe?

It's a good question. I'm not sure how long typical programs will take to execute, but I can certainly imagine some pretty long Program sequences that may take ~30s to execute (e.g., interactive-tx flows).

At the very least we need to receive and discard pings, since we never know when a peer will send them. But if we are reading them anyways, why not just send the correct pong to ensure the connection stays open?

@NishantBansal2003
Copy link
Copy Markdown
Contributor

Fine with keeping it for extra safety, since it helps avoid worst-case scenarios

I'm not sure how long typical programs will take to execute, but I can certainly imagine some pretty long Program sequences that may take ~30s to execute (e.g., interactive-tx flows).

Really? So you mean for those scenarios we would use
~/AFLplusplus/afl-fuzz -X -i /tmp/smite-seeds -o /tmp/smite-out -t 30000 -- /tmp/smite-nyx
and end up with ~1/30 execs/sec?

At the very least we need to receive and discard pings, since we never know when a peer will send them. But if we are reading them anyways, why not just send the correct pong to ensure the connection stays open?

Yeah, as an extra safety mechanism I totally agree with using it, but logically I thought it wasn't required since my initial understanding was that fuzzing typically runs quite fast

@morehouse
Copy link
Copy Markdown
Owner Author

Really? So you mean for those scenarios we would use ~/AFLplusplus/afl-fuzz -X -i /tmp/smite-seeds -o /tmp/smite-out -t 30000 -- /tmp/smite-nyx and end up with ~1/30 execs/sec?

For certain inputs, maybe. It's hard to predict at this point, but things like mining blocks and waiting for the target to respond to messages can be slow. And it's conceivable the fuzzer could generate very large programs.

Eclair is currently one of the slowest targets, doing <100 execs/sec for the encrypted_bytes scenario, which mostly generates invalid messages. I imagine once we start sending and receiving multiple valid messages, execs/sec will plummet further.

Copy link
Copy Markdown

@harsh04044 harsh04044 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading through the executor code and had a question about recv_non_ping. It returns an error if the message type doesn't match, which causes the executor to stop. For interactive-tx, either peer can send tx_abort at any point during negotiation -- should that be treated as a graceful termination (ScenarioResult::Ok) rather than a crash indicator? Or is the intent that the generator always produces sequences where tx_abort isn't expected, so receiving one unexpectedly is genuinely a skip?

fn resolve_chain_hash(
variables: &[Option<Variable>],
index: usize,
) -> Result<[u8; 32], ExecuteError> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to why we're using [u8; 32] here, but &[u8; 32] in resolve_private_key?

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No good reason. I've updated it to use [u8; 32] both places.

Operation::LoadChannelId(id) => Some(Variable::ChannelId(ChannelId::new(*id))),
Operation::LoadTargetPubkeyFromContext => {
let pk = PublicKey::from_slice(&program.context.target_pubkey)
.map_err(|_| ExecuteError::InvalidPublicKey)?;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a test for this if you want to include it:

diff --git a/smite-scenarios/src/executor.rs b/smite-scenarios/src/executor.rs
index 8da3f53..0d3f845 100644
--- a/smite-scenarios/src/executor.rs
+++ b/smite-scenarios/src/executor.rs
@@ -935,6 +935,23 @@ mod tests {
         assert!(matches!(err, ExecuteError::InvalidPrivateKey));
     }

+    #[test]
+    fn execute_invalid_target_public_key() {
+        let instrs = vec![Instruction {
+            operation: Operation::LoadTargetPubkeyFromContext,
+            inputs: vec![],
+        }];
+        let mut context = sample_context();
+        context.target_pubkey = [0u8; 33];
+        let program = Program {
+            instructions: instrs,
+            context,
+        };
+        let mut conn = MockConnection::new();
+        let err = execute(&program, &mut conn).unwrap_err();
+        assert!(matches!(err, ExecuteError::InvalidPublicKey));
+    }
+
     // -- extract_field tests --

     // TODO: Once we can actually construct and send accept_channel messages, it

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added it, thanks.

Comment on lines +919 to +928
let instrs = vec![
Instruction {
operation: Operation::LoadPrivateKey([0; 32]),
inputs: vec![],
},
Instruction {
operation: Operation::DerivePoint,
inputs: vec![0],
},
];
Copy link
Copy Markdown
Contributor

@ekzyis ekzyis Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: The comment about LoadPrivateKey suggests it checks for invalid private keys, while it's actually DerivePoint that's doing that

/// Load a secp256k1 private key.  The executor validates that the bytes are
/// in range `[1, curve_order)` and skips if not.
LoadPrivateKey([u8; 32]),

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the operator comments in 000b3eb.

Copy link
Copy Markdown
Contributor

@erickcestari erickcestari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! The code is indeed very readable and easy to reason about. I have one observation and one small request for improving the error type. Overall, LGTM!

Comment on lines +149 to +152
Operation::RecvAcceptChannel => {
let ac = recv_accept_channel(conn)?;
Some(Variable::AcceptChannel(ac))
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

obs: The recv_ operations will cause the fuzzer to hang for five seconds until it times out and crashes if the expected message is not received by then. This could result in a very slow fuzzing process. However, we will only be able to tweak the TIMEOUT value once we start running and using our IR.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we'll have to wait and see how this works in actual fuzzing.

Comment on lines +166 to +174
fn resolve(variables: &[Option<Variable>], index: usize) -> Result<&Variable, ExecuteError> {
variables
.get(index)
.and_then(Option::as_ref)
.ok_or(ExecuteError::VariableIndexOutOfBounds {
index,
len: variables.len(),
})
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may should have an error for referencing None variables. Instead of resulting in an OOB error for an inbound index, the execute_void_variable_reference() test could use this new type.

Suggested change
fn resolve(variables: &[Option<Variable>], index: usize) -> Result<&Variable, ExecuteError> {
variables
.get(index)
.and_then(Option::as_ref)
.ok_or(ExecuteError::VariableIndexOutOfBounds {
index,
len: variables.len(),
})
}
fn resolve(variables: &[Option<Variable>], index: usize) -> Result<&Variable, ExecuteError> {
match variables.get(index) {
None => Err(ExecuteError::VariableIndexOutOfBounds {
index,
len: variables.len(),
}),
Some(None) => Err(ExecuteError::VoidVariableReference { index }),
Some(Some(v)) => Ok(v),
}
}

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a new VoidVariable error type.

@morehouse
Copy link
Copy Markdown
Owner Author

For interactive-tx, either peer can send tx_abort at any point during negotiation -- should that be treated as a graceful termination (ScenarioResult::Ok) rather than a crash indicator?

Yes, in general it's OK for the peer to send unexpected messages. We can't continue executing our program in that case, since we won't be able to do any subsequent Extract* operations that refer to the message we never received. But we shouldn't report an error.

Or is the intent that the generator always produces sequences where tx_abort isn't expected, so receiving one unexpectedly is genuinely a skip?

It might make sense to implement a RecvTxAbort operation so programs can start an interactive-tx flow, expect it to be aborted, and then do something else. But generally unexpected messages still should not cause an error to be reported, unless there is some specific situation where the unexpected message is a clear indicator that the spec is being violated or some other bug happened on the target.

Executes IR programs against a target, storing the resulting output of
each instruction in an SSA store.  Later variable references are
resolved against this store, and any type or argument mismatches are
reported as errors.

The current IR and executor are capable of building and sending
open_channel messages, receiving the accept_channel response, and
extracting fields from the received accept_channel.
Introduce a Connection trait over NoiseConnection, so we can test the
executor with a MockConnection.
The executor actually validates private keys when executing DerivePoint,
not LoadPrivateKey.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants