- Project Overview
- History and Genesis
- Design Motivation: The MEV Use Case
- System Architecture
- Core Components
- The Stack Machine to Register Machine Translation
- Compilation Pipeline
- Memory Model
- Control Flow and Jump Tables
- Runtime Function Architecture
- Symbol Management and Linking
- Code Organization
- File-by-File Summary
- Implementation Patterns
- Testing Strategy
- Known Limitations and Future Work
- Quick Reference
JET (JIT for EVM Transactions) is an LLVM-based JIT compiler for the Ethereum Virtual Machine. Instead of interpreting EVM bytecode instruction-by-instruction, JET compiles contracts to native machine code via LLVM, enabling significant performance improvements for compute-intensive operations.
The system compiles Ethereum Virtual Machine (EVM) bytecode into LLVM IR and then executes that IR using LLVM's JIT infrastructure (via inkwell). The compiler emits one LLVM function per contract. Execution runs that function with a pointer to a runtime Context, returning a ReturnCode.
- Performance: Native code execution vs. interpretation
- Optimization: LLVM's optimization passes (constant folding, dead code elimination, etc.)
- Portability: LLVM IR is architecture-independent; can target x86_64, ARM, etc.
- MEV Use Case: Ideal for scenarios where the same contract is executed thousands of times with warm data
- Language: Rust (edition 2024)
- LLVM Version: 21.0
- LLVM Bindings: inkwell crate
- Target: ORC (On-Request Compilation) JIT
The project originated in 2020 at Ava Labs, where the initial concept was to build a native-machine smart contract platform that went beyond EVM optimization to rethink the execution substrate entirely.
After the internal project was discontinued due to organizational changes, the concept was reimplemented from scratch in Rust. This clean-room rewrite served multiple purposes: learning Rust, ensuring complete IP provenance clarity, and signaling a fresh implementation with no connection to prior internal work. The Rust implementation also proved well-suited to the problem domain, with explicit ownership semantics for JIT lifetimes and intentional use of unsafe code around executable memory.
The project was renamed to "Jet," a name that naturally captures the "EVM in a JIT" concept while suggesting speed and providing short, composable naming for components like JetBuilder (IR construction) and JetEngine (ORC instantiation and execution).
A key insight driving the project came from observing MEV (Maximal Extractable Value) operations. MEV searchers commonly instantiate a local EVM to simulate contract executions—for example, calculating Uniswap trade outcomes by crafting transactions that call relevant pool functions and executing them locally.
The standard objection to EVM JIT compilation—that I/O bottlenecks in state management dominate execution time—doesn't apply in this context. MEV searchers load all relevant state data into memory once, then execute the same functions thousands of times over in-memory data. This scenario is the ideal use case for JIT compilation: amortizing compilation costs over many executions with warm data.
This extends to a broader architectural pattern: contracts could be lowered directly to shared libraries that any program could link against. For instance, Uniswap utility contracts could be compiled to native code libraries, allowing direct programmatic access to their functionality without EVM overhead.
┌─────────────────────────────────────────────────────────────────────────┐
│ JET Architecture │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────────┐ │
│ │ EVM Bytecode │───▶│ JetBuilder │───▶│ LLVM Module │ │
│ │ (contract) │ │ (compiler) │ │ (IR + declarations) │ │
│ └──────────────┘ └──────────────┘ └──────────┬───────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────────┐ │
│ │ Result │◀───│ JetEngine │◀───│ ORC JIT Engine │ │
│ │ (ReturnCode) │ │ (executor) │ │ (native compilation) │ │
│ └──────────────┘ └──────────────┘ └──────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────┐ │
│ │ jet_runtime │ │
│ │ (builtins) │ │
│ └──────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────┘
- Compiler (crates/jet): Parses bytecode, identifies basic blocks, and builds LLVM IR for each opcode.
- Runtime (crates/jet_runtime): Defines the execution context and provides builtin functions for stack, memory, and contract calls. Also generates runtime IR via RuntimeBuilder.
- Shared types (crates/jet_ir): Unified LLVM type registry (jet_ir::Types) and constants shared by both compiler and runtime to prevent layout drift.
- Push macros (crates/jet_push_macros): Proc-macro crate generating PUSH0..PUSH32 bytecode helper macros.
jet/
├── crates/
│ ├── jet/ # Main compiler crate
│ │ ├── src/
│ │ │ ├── lib.rs # Module exports
│ │ │ ├── instructions.rs # EVM opcode definitions
│ │ │ ├── builder/ # IR construction
│ │ │ │ ├── mod.rs # Error types
│ │ │ │ ├── contract.rs # Core compilation logic
│ │ │ │ ├── env.rs # LLVM environment setup
│ │ │ │ ├── manager.rs # Build orchestration
│ │ │ │ └── ops.rs # Opcode implementations
│ │ │ └── engine/ # JIT execution
│ │ │ └── mod.rs # Engine wrapper
│ │ ├── bin/
│ │ │ └── jetdbg.rs # Debug/testing utility
│ │ └── tests/ # Integration tests
│ │
│ ├── jet_ir/ # Shared IR types and constants
│ │ └── src/
│ │ ├── lib.rs # Re-exports
│ │ ├── constants.rs # EVM + Jet runtime constants
│ │ └── types.rs # Unified LLVM type registry
│ │
│ ├── jet_push_macros/ # Proc-macro crate for PUSH opcodes
│ │ └── src/
│ │ └── lib.rs # generate_push_macros! proc-macro
│ │
│ └── jet_runtime/ # Runtime support crate
│ └── src/
│ ├── lib.rs # Re-exports (including jet_ir::*)
│ ├── address.rs # Address newtype ([u8; 20])
│ ├── exec.rs # Execution context
│ ├── builtins.rs # Extern "C" runtime functions
│ ├── runtime_builder.rs # Programmatic IR generation
│ ├── symbols.rs # Symbol name constants
│ └── binding/ # Display implementations
Jet employs compilation at two levels:
EVM to LLVM IR Phase: A mixture of eager and lazy compilation. The system can analyze contract execution frequency to determine compilation priorities. Contracts can be identified as frequently-executed by examining their deployment code—Solidity's optimizer makes size-versus-execution-frequency tradeoffs that signal expected usage patterns. Popular contracts can be pre-compiled during initialization.
IR to Native Machine Code Phase: ORC not only performs initial compilation but can actively analyze executing code and recompile with different optimizations. The database stores LLVM IR rather than machine code, making it portable across architectures—the same compiled IR can be moved between systems and will lower to the appropriate machine code at runtime.
Purpose: Define EVM opcodes and provide bytecode iteration.
Key Types:
enum Instruction {
STOP = 0x00,
ADD = 0x01,
// ... all 150+ EVM opcodes
}
enum IteratorItem {
Instr(usize, Instruction), // (pc, opcode)
PushData(usize, [u8; 32]), // (pc, data in little-endian)
Invalid(usize), // Invalid opcode at pc
}

Critical Behavior: The iterator converts PUSH data from big-endian (EVM native) to little-endian (x86/ARM native) during parsing. This is important for subsequent LLVM operations.
Bytecode Decoding: instructions::Iterator walks raw bytecode and emits:
- Instr(pc, Instruction) for standard opcodes
- PushData(pc, [u8; 32]) for PUSH0..PUSH32, with bytes reversed to convert big-endian immediates into Jet's little-endian internal word
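As a rough illustration of the endianness flip, the decoding step amounts to copying the big-endian immediate into a zeroed 32-byte word and reversing it. The helper name below is hypothetical, not Jet's actual iterator API:

```rust
/// Hypothetical sketch of PUSH-immediate decoding: copy the
/// big-endian immediate into a 32-byte word, then reverse the
/// bytes so the word is little-endian (Jet's internal order).
fn decode_push_immediate(data: &[u8]) -> [u8; 32] {
    let mut word = [0u8; 32];
    // Big-endian: the immediate occupies the *last* bytes of the word.
    word[32 - data.len()..].copy_from_slice(data);
    // Reverse to little-endian.
    word.reverse();
    word
}

fn main() {
    // PUSH2 0x01 0x02 encodes the value 0x0102 = 258.
    let word = decode_push_immediate(&[0x01, 0x02]);
    // Little-endian: least-significant byte first.
    assert_eq!(word[0], 0x02);
    assert_eq!(word[1], 0x01);
    assert!(word[2..].iter().all(|&b| b == 0));
}
```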
Purpose: Set up the LLVM compilation environment with types and symbols.
Key Structures:
struct Types<'ctx> {
i8, i32, i64, i160, i256, // Integer types
ptr, // Pointer type
word_bytes: [32 x i8], // 32-byte array
stack: [1024 x i256], // EVM stack
exec_ctx: struct, // Execution context
block_info: struct, // Block metadata
contract_fn: fn(ptr, ptr) -> i8, // Contract signature
}
struct Symbols<'ctx> {
jit_engine: GlobalValue,
stack_push_word, stack_push_ptr, stack_pop, stack_peek, stack_swap,
mem_store, mem_store_byte, mem_load,
contract_call, contract_call_return_data_copy,
keccak256,
}

Purpose: The core compilation logic that transforms EVM bytecode into LLVM IR.
Key Types:
struct Registers<'ctx> {
exec_ctx: PointerValue, // Pointer to execution context
block_info: PointerValue, // Pointer to block info
jump_ptr: PointerValue, // Pointer to jump target
return_offset: PointerValue, // Return data offset
return_length: PointerValue, // Return data length
sub_call: PointerValue, // Sub-call context pointer
}
struct BuildCtx<'ctx, 'b> {
env: &Env,
builder: &Builder,
registers: Registers,
func: FunctionValue,
}
struct CodeBlock<'ctx, 'b> {
offset: usize, // Bytecode offset
rom: &[u8], // Bytecode slice
basic_block: BasicBlock, // LLVM basic block
is_jumpdest: bool, // Is a jump destination
terminates: bool, // Has terminator instruction
}

Purpose: Implement each EVM opcode as LLVM IR generation.
Pattern: Each opcode function follows:
- Pop operands from stack (as pointers)
- Load values from pointers into SSA values
- Perform LLVM operation
- Push result back to stack
Purpose: Wrap the compilation and execution pipeline.
Key Methods:
impl Engine {
fn new(context, opts) -> Self; // Create with options
fn build_contract(addr, rom) -> Result; // Compile bytecode
fn run_contract(addr, block_info) -> ContractRun; // Execute
}

Purpose: Runtime state for contract execution.
#[repr(C)]
struct Context {
stack_ptr: u32, // Stack depth (top-of-stack index)
jump_ptr: u32, // Dynamic jump target (temporary storage)
return_off: u32, // Return data offset (window in memory)
return_len: u32, // Return data length
sub_call: Option<Box<Context>>, // Nested call context (optional nested Context for CALL)
stack: [[u8; 32]; 1024], // The EVM stack (fixed array of 1024 EVM words)
    memory_ptr: *mut u8,            // Pointer to heap-allocated memory buffer (ADR-002)
    memory_len: u32,                // Used memory length
    memory_cap: u32,                // Allocated capacity
}

Important: This struct is passed by pointer into JIT-compiled contract functions. The compiler assumes a specific field order when performing struct GEPs (getelementptr operations).
Purpose: Carries chain data exposed to opcodes like BLOCKHASH.
Fields include:
- number, difficulty, gas_limit, timestamp
- base_fee, blob_base_fee, chain_id
- hash (current block hash), hash_history (last 256), coinbase
The compiler currently only uses block hash access; additional opcodes are stubbed.
Purpose: Encode execution outcomes.

Return codes partition into three ranges:
- Negative values: Jet-level failures (e.g., InvalidJumpBlock = -1)
- 0..63: EVM-level success (ImplicitReturn = 0, ExplicitReturn = 1, Stop = 2)
- 64+: EVM-level failure (Revert = 64, Invalid = 65, JumpFailure = 66)
Compiled functions always return one of these values.
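The three ranges can be expressed with simple predicates. This sketch uses an illustrative subset of variants and hypothetical helper names, not necessarily Jet's exact enum:

```rust
/// Sketch of the ReturnCode partitioning (illustrative subset).
#[derive(Debug, Clone, Copy, PartialEq)]
#[repr(i8)]
enum ReturnCode {
    InvalidJumpBlock = -1, // Jet-level failure
    ImplicitReturn = 0,    // EVM-level success
    ExplicitReturn = 1,
    Stop = 2,
    Revert = 64,           // EVM-level failure
    Invalid = 65,
    JumpFailure = 66,
}

impl ReturnCode {
    /// Negative codes are Jet (compiler/runtime) failures.
    fn is_jet_failure(self) -> bool { (self as i8) < 0 }
    /// 0..=63 is EVM-level success.
    fn is_evm_success(self) -> bool { (0..=63).contains(&(self as i8)) }
    /// 64 and above is EVM-level failure (REVERT, INVALID, ...).
    fn is_evm_failure(self) -> bool { (self as i8) >= 64 }
}

fn main() {
    assert!(ReturnCode::InvalidJumpBlock.is_jet_failure());
    assert!(ReturnCode::Stop.is_evm_success());
    assert!(ReturnCode::Revert.is_evm_failure());
}
```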
Purpose: Rust functions callable from compiled LLVM IR.
All functions use extern "C" ABI and are marked unsafe:
- stack_push_ptr, stack_pop, stack_peek, stack_swap
- mem_store, mem_store_byte, mem_load
- jet_contract_call, jet_contract_call_return_data_copy
- jet_ops_keccak256
These are declared in runtime IR and mapped at runtime using ExecutionEngine::add_global_mapping.
The EVM is a stack machine: operations implicitly pop operands from a stack and push results back. Example:
PUSH1 0x01 ; stack: [1]
PUSH1 0x02 ; stack: [1, 2]
ADD ; stack: [3]
LLVM IR is a register machine with SSA (Static Single Assignment): every value is assigned exactly once to a virtual register.
%a = i256 1
%b = i256 2
%c = add i256 %a, %b

JET uses a real stack in the Context struct as the single source of truth. Every stack operation is a runtime function call:
pub fn add(bctx: &BuildCtx) -> Result<(), Error> {
let (a, b) = stack_pop_2(bctx)?; // Calls runtime `stack_pop`
let a = load_i256(bctx, a)?; // LLVM load from pointer
let b = load_i256(bctx, b)?;
let result = bctx.builder.build_int_add(a, b, "add_result")?;
call_stack_push_i256(bctx, result)?; // Calls runtime `stack_push`
Ok(())
}

This preserves EVM stack semantics by keeping the canonical stack in runtime memory and operating on it via builtins. In LLVM IR:
- Stack values are handled as pointers to 32-byte words
- Arithmetic opcodes load i256 values from those pointers, compute in SSA, and then push the result back to the runtime stack
This avoids complex SSA stack simulation at the cost of runtime calls.
- Correctness First: The real stack ensures correct semantics even with complex control flow
Trade-offs:
- Pros: Simplifies opcode lowering; avoids complex SSA stack modeling
- Cons: Frequent runtime calls and memory traffic; more JIT overhead
// instructions.rs - Iterator yields parsed opcodes and data
for item in instructions::Iterator::new(bytecode) {
match item {
IteratorItem::Instr(pc, Instruction::ADD) => { /* handle ADD */ }
IteratorItem::PushData(pc, data) => { /* handle PUSH data */ }
IteratorItem::Invalid(pc) => { /* error */ }
}
}

// contract.rs - find_code_blocks()
fn find_code_blocks(env, func, bytecode) -> CodeBlocks {
// Creates LLVM basic blocks using a single linear scan:
    for item in instructions::Iterator::new(bytecode) {
        let IteratorItem::Instr(pc, instr) = item else { continue };
        match instr {
STOP | RETURN | REVERT | JUMP => {
// Terminates current block
current_block.set_terminates();
}
JUMPI => {
// Conditional terminator: ends block, creates new block for fall-through
current_block = blocks.add(pc + 1, create_bb());
}
JUMPDEST => {
                // Ends previous block, starts a new jump-target block
current_block = blocks.add(pc + 1, create_bb());
current_block.set_is_jumpdest();
}
}
}
}

Each block captures the slice of ROM it covers and whether it terminates.
// contract.rs - build_contract_body()
fn build_contract_body(bctx, code_blocks) {
// Iterates discovered blocks and emits instructions via builder::ops
for code_block in code_blocks.iter() {
if code_block.is_jumpdest() {
jump_cases.push((offset, basic_block));
}
build_code_block(bctx, code_block, jump_block, following_block)?;
if !code_block.terminates() {
            // Wire fallthrough to the next block
builder.build_unconditional_branch(next_block);
}
}
// Emits a shared jump-table block if any JUMPDEST exists
build_jump_table(bctx, jump_block, jump_cases);
}

// contract.rs - build_jump_table()
fn build_jump_table(bctx, jump_block, jump_cases) {
// Create failure block for invalid jumps
builder.position_at_end(jump_failure_block);
builder.build_return(ReturnCode::JumpFailure);
// Build switch statement
builder.position_at_end(jump_block);
let jump_value = builder.build_load(jump_ptr);
builder.build_switch(jump_value, jump_failure_block, jump_cases);
}

// engine/mod.rs
fn run_contract(&self, addr, block_info) -> ContractRun {
// Create JIT engine
let jit = module.create_jit_execution_engine(OptimizationLevel::None);
// Link runtime functions
self.link_in_runtime(&jit);
// Look up and call contract function
let contract_fn = jit.get_function(mangle_contract_fn(addr));
let ctx = Context::new();
let result = contract_fn.call(&ctx);
ContractRun::new(result, ctx)
}

Gas accounting is designed but not yet implemented. The intended approach exploits LLVM's basic-block structure: since a basic block either executes completely or not at all, gas costs can be amortized across the entire block. Many contracts jump infrequently, yielding large basic blocks where gas accounting reduces to a single addition at the block's end. Instructions with dynamic gas costs require additional logic only where needed.
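A sketch of the per-block amortization idea: statically sum the fixed costs of a block's instructions at compile time, then charge once at the block boundary. The cost table below is hypothetical and incomplete (loosely based on the EVM schedule), and the sketch ignores PUSH data bytes:

```rust
/// Hypothetical static gas cost for a few opcodes; real EVM gas
/// schedules are far richer. `None` marks dynamic-cost opcodes
/// that would need extra runtime logic.
fn static_gas(opcode: u8) -> Option<u64> {
    match opcode {
        0x01 | 0x03 => Some(3), // ADD, SUB
        0x02 => Some(5),        // MUL
        0x56 => Some(8),        // JUMP
        0x5b => Some(1),        // JUMPDEST
        0x60..=0x7f => Some(3), // PUSH1..PUSH32
        _ => None,              // dynamic cost
    }
}

/// Sum the static portion of a basic block's gas at compile time,
/// so the generated code charges it with a single subtraction.
fn block_static_gas(opcodes: &[u8]) -> u64 {
    opcodes.iter().filter_map(|&op| static_gas(op)).sum()
}

fn main() {
    // PUSH1, PUSH1, ADD: 3 + 3 + 3 = 9 gas, charged once per block.
    assert_eq!(block_static_gas(&[0x60, 0x60, 0x01]), 9);
}
```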
The Context struct uses pointer-based memory (ADR-002) — memory is heap-allocated and referenced by pointer, not stored inline. This matches EVM semantics (unbounded growth) and eliminates layout drift between Rust and generated IR.
┌─────────────────────────────────────────────────────────────────┐
│ Context (repr(C)) │
├─────────────────────────────────────────────────────────────────┤
│ stack_ptr: u32 │ Current stack depth (0-1023) │
│ jump_ptr: u32 │ Target offset for dynamic JUMP │
│ return_off: u32 │ Return data start offset in memory │
│ return_len: u32 │ Return data length in bytes │
│ sub_call: Option<Box<Context>> │ Nested call context │
├─────────────────────────────────────────────────────────────────┤
│ stack: [[u8; 32]; 1024] │
│ ┌─────────────────────────────────────────────────────────────┐│
│ │ Word 0 │ Word 1 │ Word 2 │ ... │ Word 1023 ││
│ │ [32 bytes each, little-endian] ││
│ └─────────────────────────────────────────────────────────────┘│
├─────────────────────────────────────────────────────────────────┤
│ memory_ptr: *mut u8 │ Pointer to heap-allocated memory buffer │
│ memory_len: u32 │ Used memory length │
│ memory_cap: u32 │ Allocated capacity │
└─────────────────────────────────────────────────────────────────┘
LLVM field indices (for GEP operations):
0: stack_ptr, 1: jump_ptr, 2: return_off, 3: return_len,
4: sub_call, 5: stack, 6: memory_ptr, 7: memory_len, 8: memory_cap
Memory is initially allocated as WORD_SIZE_BYTES * MEMORY_INITIAL_SIZE_WORDS bytes (32 KB) with 32-byte alignment, and freed in Context::drop. The jet_ir::Types struct defines an identical layout in LLVM IR so that generated code and Rust agree on every field offset.
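A minimal sketch of the size/alignment arithmetic above (the word-count constant is an assumption derived from the stated 32 KB; the real allocation and Context::drop logic live in jet_runtime):

```rust
use std::alloc::{alloc_zeroed, dealloc, Layout};

const WORD_SIZE_BYTES: usize = 32;
// Assumption: 32 KB initial memory / 32-byte words = 1024 words.
const MEMORY_INITIAL_SIZE_WORDS: usize = 1024;

fn main() {
    let size = WORD_SIZE_BYTES * MEMORY_INITIAL_SIZE_WORDS;
    assert_eq!(size, 32 * 1024); // 32 KB initial EVM memory

    // 32-byte-aligned, zero-initialized buffer, freed explicitly
    // (mirroring allocation at construction and release in Drop).
    let layout = Layout::from_size_align(size, 32).unwrap();
    unsafe {
        let ptr = alloc_zeroed(layout);
        assert!(!ptr.is_null());
        assert_eq!(ptr as usize % 32, 0); // alignment holds
        dealloc(ptr, layout);
    }
}
```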
- Size: 32 bytes (256 bits)
- Endianness: Little-endian storage (converted from EVM big-endian during parsing)
- LLVM Type: i256 for arithmetic, [32 x i8] for byte access
EVM immediates are big-endian, but Jet stores words as little-endian in memory. The byte iterator reverses PUSH data, and the BYTE opcode reverses its index (31 - idx) to match this internal representation.
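Concretely, a big-endian BYTE index i maps to physical index 31 - i in the little-endian word. The helper below is hypothetical, mirroring the rule just described:

```rust
/// BYTE(i, x) returns the i-th byte of x counting from the
/// big-endian (most significant) end. With a little-endian word
/// layout, that is physical index 31 - i.
fn evm_byte(index: usize, word_le: &[u8; 32]) -> u8 {
    if index > 31 {
        return 0; // EVM semantics: out-of-range index yields 0
    }
    word_le[31 - index]
}

fn main() {
    // The value 0x01 stored little-endian: byte 0 holds 0x01.
    let mut word = [0u8; 32];
    word[0] = 0x01;
    // Big-endian index 31 is the least significant byte.
    assert_eq!(evm_byte(31, &word), 0x01);
    assert_eq!(evm_byte(0, &word), 0x00);
}
```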
Stack Pointer (stack_ptr) points to next free slot:
stack_ptr = 3
┌─────┬─────┬─────┬─────┬─────┬─────┐
│ A │ B │ C │ │ │ ... │
└─────┴─────┴─────┴─────┴─────┴─────┘
[0] [1] [2] [3]
↑
stack_ptr (next write position)
PUSH D: stack[3] = D; stack_ptr = 4
POP: stack_ptr = 2; return stack[2] (C)
DUP2: push(stack[stack_ptr - 2]) // Copy B to top
SWAP1: swap(stack[stack_ptr-1], stack[stack_ptr-2])
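These semantics can be sketched as a plain Rust model (hypothetical and simplified: u64 words instead of 32-byte words, with the overflow guard the Limitations section notes Jet does not yet emit):

```rust
/// Simplified model of the runtime stack: stack_ptr indexes the
/// next free slot, exactly as in the diagram above.
struct EvmStack {
    words: Vec<u64>,
    stack_ptr: usize,
}

impl EvmStack {
    fn new() -> Self {
        Self { words: vec![0; 1024], stack_ptr: 0 }
    }
    fn push(&mut self, w: u64) -> Result<(), &'static str> {
        if self.stack_ptr >= 1024 {
            return Err("stack overflow"); // guard at depth 1024
        }
        self.words[self.stack_ptr] = w;
        self.stack_ptr += 1;
        Ok(())
    }
    fn pop(&mut self) -> Option<u64> {
        self.stack_ptr = self.stack_ptr.checked_sub(1)?;
        Some(self.words[self.stack_ptr])
    }
    /// DUPn: copy the n-th word from the top onto the top.
    fn dup(&mut self, n: usize) -> Result<(), &'static str> {
        let w = self.words[self.stack_ptr - n];
        self.push(w)
    }
    /// SWAPn: swap the top word with the word n slots below it.
    fn swap(&mut self, n: usize) {
        self.words.swap(self.stack_ptr - 1, self.stack_ptr - 1 - n);
    }
}

fn main() {
    let mut s = EvmStack::new();
    s.push(0xA).unwrap();
    s.push(0xB).unwrap();
    s.dup(2).unwrap();          // DUP2 copies 0xA to the top
    assert_eq!(s.pop(), Some(0xA));
    s.swap(1);                  // SWAP1 swaps 0xA and 0xB
    assert_eq!(s.pop(), Some(0xA));
    assert_eq!(s.pop(), Some(0xB));
    assert_eq!(s.pop(), None);  // underflow yields None (null in Jet)
}
```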
Static Jumps (JUMPI fall-through): The compiler knows both possible destinations at compile time.
Dynamic Jumps (JUMP, JUMPI taken branch): The target is a runtime value on the stack.
All dynamic jumps are routed through a single shared jump block emitted late in the function:

- JUMP/JUMPI store the target into exec_ctx.jump_ptr
- Control branches to the shared jump block
- The jump block switches on jump_ptr to the target block
- If no case matches, the function returns ReturnCode::JumpFailure
This keeps target validation centralized and avoids indirect branches.
┌─────────────────┐
│ JUMP opcode │
│ 1. Pop target │
│ 2. Store to │
│ jump_ptr │
│ 3. Branch to │
│ jump_block │
└────────┬────────┘
│
▼
┌────────────────────────────────────────────────────────────┐
│ jump_block │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ %target = load i32, ptr %jump_ptr │ │
│ │ switch i32 %target, label %jump_failure [ │ │
│ │ i32 0x05, label %block_at_0x05 │ │
│ │ i32 0x10, label %block_at_0x10 │ │
│ │ i32 0x2A, label %block_at_0x2A │ │
│ │ ] │ │
│ └──────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────┘
│ │ │
▼ ▼ ▼
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ JUMPDEST@5 │ │ JUMPDEST@16 │ │ JUMPDEST@42 │
└─────────────┘ └─────────────┘ └─────────────┘
fn jumpi(bctx, jump_block, jump_else_block) {
let (pc, cond) = __stack_pop_2(bctx)?;
// Store target for potential jump
builder.build_store(registers.jump_ptr, pc);
// Compare condition to zero
let cmp = builder.build_int_compare(EQ, cond, zero, "jumpi_cmp");
// Branch: if cond == 0, fall through; else jump
builder.build_conditional_branch(cmp, jump_else_block, jump_block);
}

- PC is baked as a constant using code_block.offset + pc to yield the absolute bytecode index
- JUMP/JUMPI use the shared jump block as described above
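The switch-based dispatch is equivalent to this runtime model (a hypothetical sketch; in Jet the cases are LLVM basic blocks reached by the switch, not map lookups):

```rust
use std::collections::HashMap;

/// Model of the shared jump block: jump_ptr holds the dynamic
/// target, and only offsets known to be JUMPDESTs at compile time
/// have a case in the switch.
fn dispatch(
    jump_ptr: u32,
    jumpdests: &HashMap<u32, &'static str>,
) -> Result<&'static str, &'static str> {
    match jumpdests.get(&jump_ptr) {
        Some(block) => Ok(block),   // switch case: branch to block
        None => Err("jump_failure"), // default: JumpFailure path
    }
}

fn main() {
    let mut dests = HashMap::new();
    dests.insert(0x05, "block_at_0x05");
    dests.insert(0x10, "block_at_0x10");
    assert_eq!(dispatch(0x05, &dests), Ok("block_at_0x05"));
    assert_eq!(dispatch(0x07, &dests), Err("jump_failure")); // not a JUMPDEST
}
```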
Some operations are too complex for inline IR generation:
- Memory bounds checking
- Dynamic memory allocation
- Hash computation (keccak256)
- Cross-contract calls
The static runtime-ir/jet.ll file has been replaced by jet_runtime::RuntimeBuilder, a Rust struct that generates the runtime LLVM module programmatically using inkwell. This eliminates the host target triple mismatch that the hand-written .ll file suffered from and lets the runtime IR evolve alongside Rust types without keeping two representations in sync.
RuntimeBuilder::build() generates the following IR functions:
| Function | Kind | Description |
|---|---|---|
| jet.stack.push.i256 | IR-defined | Push i256 value onto stack |
| jet.stack.push.ptr | IR-defined | Push word from pointer onto stack |
| jet.stack.pop | IR-defined | Pop word pointer from stack (null on underflow) |
| jet.stack.peek | IR-defined | Peek at word at index without popping |
| jet.stack.swap | IR-defined | Swap top word with word at index |
| jet.mem.load | IR-defined | Load i256 from memory (returns value, not pointer) |
| jet.mem.store.word | IR-defined | Store 32-byte word to memory |
| jet.mem.store.byte | IR-defined | Store single byte to memory |
| jet.contract.call | Declared | Cross-contract call (implemented in builtins.rs) |
| jet.ops.keccak256 | Declared | Keccak256 hash (implemented in builtins.rs) |
| jet.ops.exp | Declared | Modular exponentiation (implemented in builtins.rs) |
| jet.ops.addmod | Declared | 512-bit ADDMOD (implemented in builtins.rs) |
| jet.ops.mulmod | Declared | 512-bit MULMOD (implemented in builtins.rs) |
| jet.mem.expand | Declared | Dynamic memory expansion (implemented in builtins.rs) |
IR-defined functions are compiled by LLVM and benefit from standard optimization passes. Declared functions are extern "C" Rust functions linked via add_global_mapping at JIT startup.
In symbols.rs:
pub const FN_STACK_POP: &str = "jet.stack.pop";
pub const FN_MEM_STORE_WORD: &str = "jet.mem.store.word";
pub const FN_CONTRACT_CALL: &str = "jet.contract.call";

In builtins.rs:
pub unsafe extern "C" fn stack_pop(ctx: *mut Context) -> *const Word {
let ctx = unsafe { ctx.as_mut() }.unwrap();
ctx.stack_pop() as *const Word
}

Linking at JIT time:
fn link_in_runtime(&self, ee: &ExecutionEngine) {
ee.add_global_mapping(&sym.stack_pop(), builtins::stack_pop as usize);
}

CALL lowers to a runtime builtin that:
- Looks up the callee function pointer via the JIT engine
- Creates a new sub-context (Context::init_sub_call)
- Executes the callee JIT function
- Copies return data into the caller's memory if requested
The call returns a small status code pushed onto the stack.
- RETURN writes the return offset/length into the context, then returns ReturnCode::ExplicitReturn
- REVERT and INVALID return their respective codes
| Type | Pattern | Example |
|---|---|---|
| Runtime functions | jet.{category}.{operation} | jet.stack.push.i256 |
| Contract functions | jet.contracts.{address} | jet.contracts.0x1234 |
| Globals | jet.{name} | jet.jit_engine |
Runtime symbols are defined in jet_runtime::symbols.
pub fn mangle_contract_fn(address: &str) -> String {
format!("{}{}", FN_CONTRACT_PREFIX, address)
// "jet.contracts." + "0x1234" = "jet.contracts.0x1234"
}

Contract symbols are mangled with the jet.contracts. prefix plus the address string. At execution time, jet_contract_fn_lookup reverses the address bytes and reconstructs the mangled name to look up the function pointer.
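A sketch of the mangling round trip (the second helper is hypothetical and glosses over the byte-reversal detail of the real lookup; the prefix matches the naming table above):

```rust
const FN_CONTRACT_PREFIX: &str = "jet.contracts.";

/// Mangle a contract address string into its JIT symbol name.
fn mangle_contract_fn(address: &str) -> String {
    format!("{FN_CONTRACT_PREFIX}{address}")
}

/// Sketch of the lookup side: rebuild the symbol name from raw
/// address bytes (hex-encoded with a 0x prefix) before asking the
/// JIT engine for the function pointer.
fn symbol_for_address_bytes(addr: &[u8; 20]) -> String {
    let hex: String = addr.iter().map(|b| format!("{b:02x}")).collect();
    mangle_contract_fn(&format!("0x{hex}"))
}

fn main() {
    assert_eq!(mangle_contract_fn("0x1234"), "jet.contracts.0x1234");
    let addr = [0u8; 20];
    assert_eq!(
        symbol_for_address_bytes(&addr),
        format!("jet.contracts.0x{}", "00".repeat(20))
    );
}
```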
┌──────────────────────────────────────────────────────────────────────────┐
│ Cross-Contract Call │
├──────────────────────────────────────────────────────────────────────────┤
│ │
│ Contract A Contract B │
│ ┌─────────────────────┐ ┌─────────────────────┐ │
│ │ ... │ │ jet.contracts.0xB │ │
│ │ CALL to 0xB ──────────────┐ │ ┌─────────────────┐ │ │
│ │ │ │ │ │ Function body │ │ │
│ │ │ │ │ │ ... │ │ │
│ │ │ │ │ │ RETURN │ │ │
│ └─────────────────────┘ │ │ └─────────────────┘ │ │
│ │ └─────────────────────┘ │
│ │ │
│ ▼ │
│ ┌───────────────────────────────────────────────────────────────────┐ │
│ │ jet_contract_call() │ │
│ │ 1. Look up jet.contracts.0xB in JIT engine │ │
│ │ 2. Create sub-call Context │ │
│ │ 3. Execute contract B function │ │
│ │ 4. Copy return data to caller's memory │ │
│ │ 5. Return status code │ │
│ └───────────────────────────────────────────────────────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────────────┘
1. Define the opcode in instructions.rs (if not present):

       instructions! {
           // ...
           NEWOP = 0xNN,
       }

2. Implement the operation in ops.rs:

       pub(crate) fn newop(bctx: &BuildCtx<'_, '_>) -> Result<(), Error> {
           let a = __stack_pop_1(bctx)?;
           let a_val = load_i256(bctx, a)?;
           // ... perform operation ...
           __stack_push_int(bctx, result)?;
           Ok(())
       }

3. Add dispatch in contract.rs:

       Instruction::NEWOP => ops::newop(bctx),
See docs/process/new-runtime-function.md for the full checklist.
- Module structure declaration
- Enables the allocator_api feature
- Macro-based EVM opcode enum definition (instruction! macro)
- Implements TryFrom<u8>, Display, opcode() methods
- Custom Iterator that handles PUSH data bytes
- Converts PUSH data from big-endian to little-endian (bytes reversed)
- Error enum for build failures
- Module declarations
- Registers: Caches pointers into exec_ctx (jump_ptr, return_offset, return_length, sub_call)
- BuildCtx: Wraps Env, Builder, current function, and Registers
- CodeBlock: Represents a basic block with offset, ROM slice, flags
- CodeBlocks: Collection of CodeBlocks with helper methods
- build(): Main entry point - creates function, discovers blocks, generates IR
- find_code_blocks(): First pass - discovers basic block boundaries
- build_contract_body(): Second pass - generates IR for all blocks
- build_code_block(): Generates IR for a single block
- build_jump_table(): Creates the switch statement for dynamic jumps
- Options: Build configuration (mode Debug/Release, emit_llvm, assert)
- Types: All LLVM type definitions (i8/i32/i64/i160/i256, ptr, word_bytes, stack, mem, exec_ctx)
- Symbols: Runtime function lookups, mapped to jet_runtime::symbols
- Env: Wraps context, module, types, symbols
- Implementations for each EVM opcode
- Helper functions for stack operations (stack_pop_1/2/3/7, stack_push_int, call_stack_push_i256)
- Pattern: Pop inputs → Load values → LLVM operation → Push result
- Many opcodes return Error::UnimplementedInstruction
- Manager: Wraps Env and adds functions per contract address
- Builds the contract, optionally prints IR via syntect, and verifies

- Engine: Wraps Manager, handles compilation and execution
- Calls RuntimeBuilder::build() to generate the runtime LLVM module
- Creates the JIT execution engine
- Links extern "C" builtins at JIT time via add_global_mapping
- Executes contracts and returns ContractRun
- Canonical constants: WORD_SIZE_BYTES, STACK_SIZE_WORDS, ADDRESS_SIZE_BYTES, MEMORY_INITIAL_SIZE_WORDS, etc.
- Single source of truth for sizes shared by compiler and runtime
- Types<'ctx>: Unified LLVM type registry built from an inkwell Context
- Defines all primitive types (i8, i32, i64, i160, i256, ptr)
- Defines the exec_ctx struct layout (9 fields, packed): the single authoritative definition
- Defines the block_info struct layout
- Re-used by both RuntimeBuilder and the compiler's env.rs to guarantee layout consistency
- generate_push_macros!(0..=32) proc-macro
- Generates PUSH0!, PUSH1!(b), ..., PUSH32!(b0, b1, ...) bytecode helper macros
- Each macro takes exactly N byte arguments and emits the correct opcode + data bytes
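Roughly, the generated macros behave like these hand-written macro_rules! equivalents (illustrative only; the real crate generates all 33 via the proc-macro):

```rust
// Hand-written equivalents of three generated helpers; the
// proc-macro emits PUSH0! through PUSH32! in the same shape.
macro_rules! PUSH0 {
    () => { vec![0x5f_u8] };
}
macro_rules! PUSH1 {
    ($b0:expr) => { vec![0x60_u8, $b0] };
}
macro_rules! PUSH2 {
    ($b0:expr, $b1:expr) => { vec![0x61_u8, $b0, $b1] };
}

fn main() {
    assert_eq!(PUSH0!(), vec![0x5f]);              // PUSH0 opcode only
    assert_eq!(PUSH1!(0x2a), vec![0x60, 0x2a]);    // opcode + 1 data byte
    assert_eq!(PUSH2!(0x01, 0x02), vec![0x61, 0x01, 0x02]);
}
```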
- Module declarations; re-exports jet_ir::* (constants flow from jet_ir)
- Public surface: Address, Result, RuntimeError, RuntimeBuilder
- Address([u8; 20]) newtype with #[repr(transparent)]
- Derives Clone, Copy, PartialEq, Eq, Hash, Default
- Display/Debug emit lowercase 0x-prefixed hex
- FromStr/TryFrom<&str> parse hex strings with an optional 0x prefix
- From<[u8; 20]>, Into<[u8; 20]>, AsRef<[u8]> for zero-cost interop
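A condensed sketch of such a newtype, covering Display and FromStr only (the error type here is a placeholder String; the real crate uses a dedicated error enum):

```rust
use std::fmt;
use std::str::FromStr;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Default, Debug)]
#[repr(transparent)]
struct Address([u8; 20]);

impl fmt::Display for Address {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "0x")?;
        for b in self.0 {
            write!(f, "{b:02x}")? // lowercase hex
        }
        Ok(())
    }
}

impl FromStr for Address {
    type Err = String; // placeholder error type for this sketch

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let hex = s.strip_prefix("0x").unwrap_or(s); // optional 0x
        if hex.len() != 40 {
            return Err(format!("expected 40 hex chars, got {}", hex.len()));
        }
        let mut bytes = [0u8; 20];
        for (i, chunk) in hex.as_bytes().chunks(2).enumerate() {
            let pair = std::str::from_utf8(chunk).map_err(|e| e.to_string())?;
            bytes[i] = u8::from_str_radix(pair, 16).map_err(|e| e.to_string())?;
        }
        Ok(Address(bytes))
    }
}

fn main() {
    let a: Address = "0x00000000000000000000000000000000000000ff".parse().unwrap();
    assert_eq!(a.0[19], 0xff);
    assert_eq!(a.to_string(), "0x00000000000000000000000000000000000000ff");
    assert!("0x1234".parse::<Address>().is_err()); // wrong length
}
```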
- Word: 32-byte array type alias ([u8; 32])
- Context: Execution context with pointer-based memory (ADR-002)
- memory_ptr: *mut u8: heap-allocated buffer, freed in Drop
- memory_len/memory_cap track usage and allocated capacity
- BlockInfo: EVM block metadata struct
- ReturnCode: Enum for execution results (EVM and Jet-level success/failure)
- ContractRun: Wraps result and context
- ContractFunc: Function pointer type for compiled contracts
- Unsafe extern "C" functions for complex operations that need the Rust stdlib/deps
- Contract calls, keccak256, EXP, ADDMOD, MULMOD, memory expansion
- These are declared in the runtime IR module and linked via add_global_mapping
- RuntimeBuilder: Generates the runtime LLVM module programmatically
- Replaces the old static runtime-ir/jet.ll file
- build() returns a Module<'ctx> containing all IR-defined runtime functions
- Uses jet_ir::Types for consistent struct layouts
- IR-defined functions: all stack and basic memory operations
- Declared-only functions: contract calls, crypto, arithmetic ops
- String constants for all symbol names
- Used for consistent linking between Rust and LLVM
- Contract symbols prefixed with "jet.contracts."
- Display implementations for debugging
// Binary operation pattern
pub(crate) fn binop(bctx: &BuildCtx<'_, '_>) -> Result<(), Error> {
// 1. Pop operands (returns pointers)
let (a, b) = __stack_pop_2(bctx)?;
// 2. Load values from pointers
let a = load_i256(bctx, a)?;
let b = load_i256(bctx, b)?;
// 3. Perform LLVM operation
let result = bctx.builder.build_int_xxx(a, b, "binop_result")?;
// 4. Push result
__stack_push_int(bctx, result)?;
Ok(())
}

pub(crate) fn runtime_op(bctx: &BuildCtx<'_, '_>) -> Result<(), Error> {
let arg = __stack_pop_1(bctx)?;
bctx.builder.build_call(
bctx.env.symbols().runtime_function(),
&[bctx.registers.exec_ctx.into(), arg.into()],
"runtime_op_result",
)?;
Ok(())
}

pub(crate) fn control_op(
bctx: &BuildCtx<'_, '_>,
target_block: BasicBlock,
) -> Result<(), Error> {
// Build branch
bctx.builder.build_unconditional_branch(target_block)?;
Ok(())
}

The test framework uses a declarative macro. Tests under crates/jet/tests compile synthetic ROMs and assert on:
- Stack contents and pointer depth
- Jump pointer values
- Return offset/length
- Memory contents after MSTORE/MLOAD/RETURNDATACOPY
rom_tests! {
test_name: Test {
roms: vec![vec![
Instruction::PUSH1.opcode(), 0x01,
Instruction::PUSH1.opcode(), 0x02,
Instruction::ADD.opcode(),
]],
expected: TestContractRun {
stack_ptr: 1,
stack: vec![stack_word(&[0x03])],
..Default::default()
},
},
}

These tests serve as executable specs for the subset of opcodes currently implemented.
- Arithmetic: ADD, MUL, SUB, DIV, MOD
- Control Flow: JUMP, JUMPI, PC
- Memory: MLOAD, MSTORE, MSTORE8
- Contract Calls: CALL, RETURNDATASIZE, RETURNDATACOPY
- Cryptographic: KECCAK256
cargo test -p jet

Several opcode families are stubbed:
- Storage: SLOAD, SSTORE, TLOAD, TSTORE
- Environment: ADDRESS, BALANCE, CALLER, CALLVALUE, ORIGIN
- Call data: CALLDATALOAD, CALLDATASIZE, CALLDATACOPY
- Block Info: COINBASE, TIMESTAMP, NUMBER, DIFFICULTY, etc.
- Logging: LOG0-LOG4
- Creation: CREATE, CREATE2
- Delegate Calls: DELEGATECALL, STATICCALL, CALLCODE
- Gas accounting: Not implemented. The intended approach amortizes cost per basic block.
- Code eviction: No memory management for compiled contracts; the JIT cache grows unbounded.
- Stack overflow checking: stack_pop returns null on underflow (handled); stack_push does not yet check for overflow at depth 1024.
- Memory bounds checking: jet.mem.expand is declared, but bounds validation in memory read/write paths may be incomplete.
Previously resolved limitations (no longer issues):
- Runtime IR target triple mismatch: eliminated when runtime-ir/jet.ll was replaced by RuntimeBuilder
- Struct layout mismatches between Rust and LLVM IR: resolved by jet_ir::Types as the single source of truth (ADR-002)
- Symbol naming inconsistency (jet.stack.push.word vs. jet.stack.push.i256): resolved in RuntimeBuilder
- ADDRESS_SIZE_BYTES = 2 in tests: corrected to 20 in jet_ir::constants
- Runtime stack as source of truth
- Pros: Simplifies opcode lowering; avoids complex SSA stack modeling
- Cons: Frequent runtime calls and memory traffic; more JIT overhead
- Jump table for dynamic jumps
- Pros: Validates jump targets centrally; uses LLVM switch for clarity
- Cons: Adds an extra block and indirect branch on every JUMP/JUMPI
- IR stub module for runtime symbols
  - Pros: Keeps symbol discovery centralized; allows IR helpers like jet.stack.push.i256 to be optimized by LLVM
  - Cons: Requires careful alignment between Rust structs and LLVM types
- Minimal opcode subset
- Pros: Enables rapid iteration on compiler correctness
- Cons: Many opcodes are currently unimplemented
- Inline more builtins: Convert remaining Rust builtins (e.g., ADDMOD/MULMOD) to IR-defined functions in RuntimeBuilder for better LLVM optimization.
- Gas amortization: Compute gas per basic block, not per instruction.
- Stack overflow checking: Add a stack_ptr >= 1024 guard to the stack_push functions.
- Profile-guided optimization: Use ORC's profiling for hot-path optimization.
- Shared library extraction: Compile contracts to standalone .so/.dll files.
- Expand opcode coverage: Storage, environment, call data, and logs, with a test-first approach.
- Add a stack overflow guard in RuntimeBuilder::build_stack_push_*
- Implement full memory bounds checking in memory read/write paths
- Expand opcode coverage with a test-first approach
| Task | Primary Files |
|---|---|
| Add new opcode | instructions.rs, ops.rs, contract.rs |
| Add IR-defined runtime function | runtime_builder.rs, symbols.rs, env.rs |
| Add extern "C" runtime function | builtins.rs, runtime_builder.rs (declare), symbols.rs, engine/mod.rs (link) |
| Modify execution context layout | exec.rs, jet_ir/types.rs (must stay in sync) |
| Modify shared constants | jet_ir/constants.rs |
| Debug compilation | jetdbg.rs, enable emit_llvm option |
| Add tests | tests/test_roms.rs, tests/roms/mod.rs |
| Type | Size | Purpose |
|---|---|---|
| Word | 32 bytes | EVM stack word |
| i256 | 256 bits | LLVM integer for arithmetic |
| Context | ~33KB | Execution state |
| ReturnCode | 1 byte | Execution result |
# Build everything
make build
# Run debug tool
cargo run --bin jetdbg
# Run tests
cargo test -p jet
# Build with LLVM output
cargo run --bin jetdbg -- build --emit-llvm

| Term | Definition |
|---|---|
| Basic Block | A sequence of instructions with one entry point and one exit point |
| SSA | Static Single Assignment - each variable assigned exactly once |
| ORC | On-Request Compilation - LLVM's modern JIT framework |
| JUMPDEST | EVM opcode marking valid jump destinations |
| Word | 256-bit (32-byte) value, the fundamental unit in EVM |
| ROM | Read-only memory containing bytecode |
| GEP | GetElementPtr - LLVM instruction for computing addresses |