Quick Reference — All Topics

Single-page cheat sheet. Print this or keep it open in a terminal.


Source to CPU — The Big Picture

┌─────────────────────────────────────────────────────────────────────────────┐
│                        COMPILATION PIPELINE                                  │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  SOURCE        FRONTEND              MIDDLE            BACKEND     RUNTIME  │
│  ──────        ────────              ──────            ───────     ───────  │
│                                                                             │
│  hello.rs  →  Lexer → Parser → AST → HIR → MIR → LLVM IR → ASM → ELF → CPU │
│                                                                             │
│  "fn main"    tokens   tree    tree  typed  CFG    SSA      x86   binary    │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Essential Commands

# Compile Rust
rustc hello.rs                          # Basic compile
rustc --emit=llvm-ir hello.rs           # Output LLVM IR (.ll file)
rustc --emit=asm hello.rs               # Output assembly (.s file)

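# Static binary (no runtime deps)
rustup target add x86_64-unknown-linux-musl   # One-time setup, if the toolchain is managed by rustup
rustc --target x86_64-unknown-linux-musl hello.rs -o hello-static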

# Check dependencies
ldd ./hello                             # Dynamic deps
file ./hello                            # Binary info

# Inspect binary
objdump -d ./hello | head -100          # Disassemble
readelf -h ./hello                       # ELF header
xxd ./hello | head -20                   # Raw hex
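
The commands above assume a source file named hello.rs. A minimal sketch of what it might contain follows; the `greeting` helper is a hypothetical addition, included only so there is a second function to look for in the IR and assembly output.

```rust
// hello.rs: a minimal program to feed through the commands above.
// `greeting` is a hypothetical helper; the commands do not require it.
fn greeting() -> &'static str {
    "Hello, world!"
}

fn main() {
    println!("{}", greeting());
}
```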

Key Concepts

Static vs Dynamic Linking

┌─────────────────────────────────────────────────────────────────┐
│ DYNAMIC (glibc)                 │ STATIC (musl)                 │
├─────────────────────────────────┼───────────────────────────────┤
│ Binary: ~500KB                  │ Binary: ~2MB                  │
│ Needs: libc.so at runtime       │ Needs: Nothing                │
│ Deploy: Need matching libs      │ Deploy: Just copy binary      │
│ ldd output: shows deps          │ ldd output: "statically linked"│
└─────────────────────────────────┴───────────────────────────────┘

CPU Registers (x86-64, System V calling convention)

┌────────────────────────────────────────────────────────────────┐
│ General Purpose:                                                │
│   rax - Return value / accumulator                              │
│   rdi - 1st argument                                            │
│   rsi - 2nd argument                                            │
│   rdx - 3rd argument                                            │
│   rcx - 4th argument                                            │
│   r8  - 5th argument                                            │
│   r9  - 6th argument                                            │
│                                                                 │
│ Special:                                                        │
│   rsp - Stack pointer                                           │
│   rbp - Base pointer (frame)                                    │
│   rip - Instruction pointer                                     │
└────────────────────────────────────────────────────────────────┘

Assembly Syntax (Intel)

mov  rax, rbx        # rax = rbx (copy register)
mov  rax, [rbx]      # rax = *rbx (load from memory) ← DEREFERENCE
mov  [rax], rbx      # *rax = rbx (store to memory)
mov  rax, 42         # rax = 42 (immediate value)

add  rax, rbx        # rax = rax + rbx
sub  rax, rbx        # rax = rax - rbx
call function        # push return address (next rip); jump to function
ret                  # pop return address into rip

Key insight: [brackets] = dereference = follow the pointer
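
To tie the bracket notation back to Rust, here is a sketch of two functions whose bodies are a single load and a single store. The function names are illustrative, and the instructions in the comments are what one would typically expect from an optimized x86-64 build, not a guarantee: the actual output depends on target, compiler version, and optimization level.

```rust
// Reading through a reference is a load: typically `mov rax, [rdi]`.
pub fn load(p: &u64) -> u64 {
    *p
}

// Writing through a mutable reference is a store: typically `mov [rdi], rsi`.
pub fn store(p: &mut u64, value: u64) {
    *p = value;
}

fn main() {
    let mut x = 1;
    store(&mut x, 42);
    println!("{}", load(&x)); // prints 42
}
```

In a binary the optimizer may inline both functions away, so compiling with `rustc -O --crate-type=lib --emit=asm` is one way that should keep them visible in the output.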


LLVM IR Essentials

; Function that adds two numbers
define i32 @add(i32 %a, i32 %b) {
entry:
    %result = add i32 %a, %b     ; result = a + b
    ret i32 %result              ; return result
}

; Pointer dereference (*ptr in Rust)
%value = load i32, ptr %pointer  ; value = *pointer

SSA (Static Single Assignment): every variable is assigned exactly once, so a new value gets a new name (like %result above) instead of overwriting an old one.
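
For orientation, a Rust function that could lower to IR shaped like the `@add` example above. This is a sketch under assumptions: `wrapping_add` is used because a plain `+` carries overflow checks in debug builds, and the emitted symbol name will be mangled rather than a bare `@add`.

```rust
// Each `let` binds a value once and never reassigns it,
// which mirrors the single-assignment form of the IR above.
pub fn add(a: i32, b: i32) -> i32 {
    let result = a.wrapping_add(b); // roughly: %result = add i32 %a, %b
    result                          // roughly: ret i32 %result
}

fn main() {
    println!("{}", add(2, 3)); // prints 5
}
```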


ELF Binary Structure

┌────────────────────────────────────────────┐
│              ELF Header                     │
│  (Magic: 7F 45 4C 46 = .ELF)               │
├────────────────────────────────────────────┤
│           Program Headers                   │
│  (How to load into memory)                  │
├────────────────────────────────────────────┤
│              .text                          │
│  (Executable code)                          │
├────────────────────────────────────────────┤
│              .rodata                        │
│  (Read-only data: strings, constants)       │
├────────────────────────────────────────────┤
│              .data                          │
│  (Initialized global variables)             │
├────────────────────────────────────────────┤
│              .bss                           │
│  (Uninitialized globals, zeroed at start)   │
├────────────────────────────────────────────┤
│           Section Headers                   │
│  (For linker/debugger)                      │
└────────────────────────────────────────────┘
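
The magic bytes in the header are easy to check by hand. The sketch below tests a byte slice for the ELF magic and reads the class byte that follows it; the function names are illustrative, and `main` inspects the running executable, which is only an ELF file on platforms that use ELF (Linux, for example).

```rust
// First four bytes of every ELF file: 0x7F 'E' 'L' 'F'.
const ELF_MAGIC: [u8; 4] = [0x7F, 0x45, 0x4C, 0x46];

/// Returns true if `bytes` starts with the ELF magic number.
fn is_elf(bytes: &[u8]) -> bool {
    bytes.starts_with(&ELF_MAGIC)
}

/// Byte 4 of the header (EI_CLASS): 1 = 32-bit, 2 = 64-bit.
fn elf_class(bytes: &[u8]) -> Option<&'static str> {
    if !is_elf(bytes) {
        return None;
    }
    match bytes.get(4).copied() {
        Some(1) => Some("ELF32"),
        Some(2) => Some("ELF64"),
        _ => None,
    }
}

fn main() {
    // Inspect the running executable itself; prints None on non-ELF platforms.
    let path = std::env::current_exe().expect("cannot locate current executable");
    let bytes = std::fs::read(&path).expect("cannot read executable");
    println!("{}: {:?}", path.display(), elf_class(&bytes));
}
```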

CPU Execution Cycle

┌─────────────────────────────────────────────────────────────────┐
│                    FETCH - DECODE - EXECUTE                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  1. FETCH                                                        │
│     ├─ Read instruction at address in RIP                        │
│     └─ Increment RIP                                             │
│                                                                  │
│  2. DECODE                                                       │
│     ├─ Parse opcode (48 8B 07 → mov rax, [rdi])                 │
│     └─ Identify operands                                         │
│                                                                  │
│  3. EXECUTE                                                      │
│     ├─ Perform operation                                         │
│     └─ Write result to register/memory                           │
│                                                                  │
│  4. REPEAT                                                       │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
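
The cycle above can be sketched as a loop over a byte array. The three-instruction ISA below is hypothetical and much simpler than x86 (fixed two-byte instructions, one accumulator); it is only meant to make the fetch, decode, execute, repeat steps concrete.

```rust
// A toy machine with a hypothetical 3-instruction ISA (not x86).
// Each instruction is two bytes: opcode, operand.
const LOAD: u8 = 0x01; // acc = operand
const ADD: u8 = 0x02;  // acc = acc + operand
const HALT: u8 = 0xFF; // stop and return acc

// Runs until HALT. Panics on an unknown opcode or if the program has no HALT.
fn run(program: &[u8]) -> u8 {
    let mut ip = 0usize; // instruction pointer, plays the role of RIP
    let mut acc = 0u8;   // accumulator, plays the role of RAX
    loop {
        // 1. FETCH: read the bytes at ip, then advance ip
        let opcode = program[ip];
        let operand = program[ip + 1];
        ip += 2;
        // 2. DECODE and 3. EXECUTE
        match opcode {
            LOAD => acc = operand,
            ADD => acc = acc.wrapping_add(operand),
            HALT => return acc,
            other => panic!("unknown opcode {other:#04x}"),
        }
        // 4. REPEAT
    }
}

fn main() {
    let program = [LOAD, 40, ADD, 2, HALT, 0];
    println!("{}", run(&program)); // prints 42
}
```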

Decoding Machine Code

48 8B 07 = mov rax, [rdi]

48    = REX.W prefix (64-bit operand size)
8B    = MOV opcode (r64, r/m64)
07    = ModR/M byte
        ├─ Mod=00 (memory, no displacement)
        ├─ Reg=000 (rax)
        └─ R/M=111 (rdi)
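
The same field split can be done with shifts and masks. This sketch handles only the two pieces shown above (the REX.W check and the ModR/M fields); the function names are illustrative, and it does not cover SIB bytes, displacements, or the REX extension bits.

```rust
/// Fields of a ModR/M byte: bits 7-6 = mod, bits 5-3 = reg, bits 2-0 = r/m.
fn decode_modrm(byte: u8) -> (u8, u8, u8) {
    let md = byte >> 6;
    let reg = (byte >> 3) & 0b111;
    let rm = byte & 0b111;
    (md, reg, rm)
}

/// True if `byte` is a REX prefix (0100WRXB) with the W bit set.
fn is_rex_w(byte: u8) -> bool {
    (byte & 0xF0) == 0x40 && (byte & 0x08) != 0
}

fn main() {
    let code = [0x48u8, 0x8B, 0x07]; // mov rax, [rdi]
    let (md, reg, rm) = decode_modrm(code[2]);
    // prints: REX.W=true mod=00 reg=000 rm=111
    println!("REX.W={} mod={:02b} reg={:03b} rm={:03b}", is_rex_w(code[0]), md, reg, rm);
}
```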

Rust vs Go vs C

┌────────────────────────────────────────────────────────────────┐
│ Language │ Compiler      │ Backend  │ Default Linking          │
├──────────┼───────────────┼──────────┼──────────────────────────┤
│ Rust     │ rustc + LLVM  │ LLVM     │ Dynamic (glibc)          │
│ Go       │ gc (custom)   │ Own      │ Static (built-in)        │
│ C        │ gcc/clang     │ Various  │ Dynamic (glibc)          │
└────────────────────────────────────────────────────────────────┘

Blockchain Fundamentals

Hash Properties

┌────────────────────────────────────────────────────────────────┐
│ Property              │ Meaning                                │
├───────────────────────┼────────────────────────────────────────┤
│ Deterministic         │ Same input → same output always        │
│ Avalanche effect      │ 1 bit change → completely new hash     │
│ One-way               │ Can't reverse hash → input             │
│ Collision resistant   │ Can't find two inputs with same hash   │
│ Fixed output          │ SHA-256: 256 bits (64 hex chars)       │
└────────────────────────────────────────────────────────────────┘
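
A few of these properties can be observed directly. Note the assumption: the Rust standard library has no SHA-256, so this sketch uses `DefaultHasher` as a stand-in. It is not cryptographic and produces 64 bits rather than 256, so it says nothing about the one-way or collision-resistance rows; it only illustrates determinism, bit diffusion, and fixed output width.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Stand-in hash: NOT cryptographic and only 64 bits wide.
// Used so the sketch runs without external crates.
fn toy_hash(data: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    h.write(data);
    h.finish()
}

fn main() {
    let a = toy_hash(b"block 0");
    let b = toy_hash(b"block 0");
    let c = toy_hash(b"block 1"); // '0' and '1' differ in exactly one bit

    println!("deterministic: {}", a == b);
    println!("bits changed:  {} of 64", (a ^ c).count_ones());
    // Output width does not depend on input length.
    println!("short input:   {:016x}", a);
    println!("long input:    {:016x}", toy_hash(b"a much longer input than the others"));
}
```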

Block Structure

┌──────────────────────────────┐
│ index:      block number     │
│ timestamp:  creation time    │
│ data:       transactions     │
│ prev_hash:  link to previous │
│ merkle:     tx commitment    │
│ nonce:      PoW solution     │
│ hash:       SHA-256 of above │
└──────────────────────────────┘
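
As a rough illustration of how prev_hash links blocks, the sketch below models the fields from the diagram and checks a chain. It rests on assumptions: the 64-bit `DefaultHasher` stands in for SHA-256 (it is not cryptographic), the merkle field is left out for brevity, and `compute_hash` and `chain_is_valid` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Field names follow the diagram above; the hash itself is computed on demand.
#[derive(Hash)]
struct Block {
    index: u64,
    timestamp: u64,
    data: String,
    prev_hash: u64,
    nonce: u64,
}

impl Block {
    // Stand-in for "SHA-256 of above": hashes every field of the block.
    fn compute_hash(&self) -> u64 {
        let mut h = DefaultHasher::new();
        Hash::hash(self, &mut h);
        h.finish()
    }
}

// A chain is valid if each block's prev_hash equals the hash of its predecessor.
fn chain_is_valid(chain: &[Block]) -> bool {
    chain.windows(2).all(|w| w[1].prev_hash == w[0].compute_hash())
}

fn main() {
    let genesis = Block { index: 0, timestamp: 0, data: "genesis".into(), prev_hash: 0, nonce: 0 };
    let next = Block {
        index: 1,
        timestamp: 1,
        data: "Alice pays Bob 7".into(),
        prev_hash: genesis.compute_hash(),
        nonce: 0,
    };
    let mut chain = vec![genesis, next];
    println!("valid: {}", chain_is_valid(&chain));

    // Editing an old block changes its hash, which breaks the link from the next block.
    chain[0].data = "tampered".into();
    println!("valid after tampering: {}", chain_is_valid(&chain));
}
```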

Merkle Tree

        Root = H(H(AB) + H(CD))
       /                        \
  H(AB) = H(H(A)+H(B))    H(CD) = H(H(C)+H(D))
   /        \                /        \
 H(A)      H(B)           H(C)      H(D)
  │          │              │          │
 Tx A      Tx B           Tx C      Tx D

Proof for Tx B: need H(A), H(CD) → 2 hashes instead of 4 transactions
Proof size scales as O(log N) for N transactions
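
The four-leaf tree and the proof for Tx B can be sketched as follows. Assumptions: `DefaultHasher` is a non-cryptographic 64-bit stand-in for the real hash function, the leaf count is a non-zero power of two as in the diagram, and all function names are illustrative.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Non-cryptographic 64-bit stand-in for SHA-256 (assumption for this sketch).
fn h(data: &[u8]) -> u64 {
    let mut s = DefaultHasher::new();
    s.write(data);
    s.finish()
}

// Parent node: hash of the two child hashes concatenated, i.e. H(left + right).
fn parent(left: u64, right: u64) -> u64 {
    let mut buf = [0u8; 16];
    buf[..8].copy_from_slice(&left.to_be_bytes());
    buf[8..].copy_from_slice(&right.to_be_bytes());
    h(&buf)
}

// Root of a tree whose leaf count is a non-zero power of two.
fn merkle_root(txs: &[&[u8]]) -> u64 {
    let mut level: Vec<u64> = txs.iter().map(|tx| h(tx)).collect();
    while level.len() > 1 {
        level = level.chunks(2).map(|p| parent(p[0], p[1])).collect();
    }
    level[0]
}

// Walk from a leaf to the root. Each proof step is (sibling hash, sibling_is_left).
fn verify(tx: &[u8], proof: &[(u64, bool)], root: u64) -> bool {
    let mut acc = h(tx);
    for &(sibling, sibling_is_left) in proof {
        acc = if sibling_is_left { parent(sibling, acc) } else { parent(acc, sibling) };
    }
    acc == root
}

fn main() {
    let txs: [&[u8]; 4] = [b"Tx A", b"Tx B", b"Tx C", b"Tx D"];
    let root = merkle_root(&txs);
    // Proof for Tx B: H(A) on the left, then H(CD) on the right.
    let proof = [(h(b"Tx A"), true), (parent(h(b"Tx C"), h(b"Tx D")), false)];
    println!("Tx B in tree: {}", verify(b"Tx B", &proof, root));
    println!("Tx X in tree: {}", verify(b"Tx X", &proof, root));
}
```

The verifier never sees Tx A, Tx C, or Tx D, only the two sibling hashes on the path to the root.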

PoW vs PoS

┌────────────────────────────────────────────────────────────────┐
│                │ Proof of Work      │ Proof of Stake           │
├────────────────┼────────────────────┼──────────────────────────┤
│ Security from  │ Computation cost   │ Economic stake           │
│ Energy use     │ Very high          │ Very low                 │
│ Attack cost    │ 51% hash power     │ 51% staked value         │
│ Used by        │ Bitcoin            │ Ethereum                 │
└────────────────────────────────────────────────────────────────┘
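
The "computation cost" behind Proof of Work comes from a brute-force search for a nonce. A minimal sketch, assuming a non-cryptographic 64-bit stand-in hash (`DefaultHasher`) and a difficulty expressed as a required number of leading zero bits; real chains use a cryptographic hash and a numeric target, and the names here are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Non-cryptographic 64-bit stand-in for SHA-256 (assumption for this sketch).
fn pow_hash(data: &[u8], nonce: u64) -> u64 {
    let mut s = DefaultHasher::new();
    s.write(data);
    s.write_u64(nonce);
    s.finish()
}

// A nonce solves the block if the hash starts with at least `difficulty` zero bits.
fn meets_target(hash: u64, difficulty: u32) -> bool {
    hash.leading_zeros() >= difficulty
}

// Mining: try nonces in order until one meets the target or the budget runs out.
fn mine(data: &[u8], difficulty: u32, max_tries: u64) -> Option<u64> {
    (0..max_tries).find(|&nonce| meets_target(pow_hash(data, nonce), difficulty))
}

fn main() {
    let data = b"block 1: Alice pays Bob 7";
    match mine(data, 12, 10_000_000) {
        Some(nonce) => println!("nonce {} -> {:016x}", nonce, pow_hash(data, nonce)),
        None => println!("no nonce found within the budget"),
    }
}
```

Each extra bit of difficulty doubles the expected number of tries, while checking a claimed nonce stays a single hash.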

UTXO Model

No balances — only unspent transaction outputs:

  Alice has: [10-coin UTXO]
  Alice pays Bob 7:
    Input:  10-coin UTXO (consumed)
    Output: 7 coins → Bob (new UTXO)
    Output: 3 coins → Alice (change, new UTXO)

Like paying with a $10 bill → get $3 change
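
The Alice and Bob example can be sketched in code. The `Utxo` struct and `spend` function are hypothetical and heavily simplified: one input per transaction, no signatures, and no fees.

```rust
#[derive(Debug, PartialEq)]
struct Utxo {
    owner: String,
    amount: u64,
}

// Spend one UTXO entirely: the payment and the change become new outputs.
// Returns None if the input cannot cover the payment.
fn spend(input: Utxo, to: &str, pay: u64) -> Option<Vec<Utxo>> {
    let change = input.amount.checked_sub(pay)?;
    let mut outputs = vec![Utxo { owner: to.to_string(), amount: pay }];
    if change > 0 {
        // Change goes back to the original owner as a new UTXO.
        outputs.push(Utxo { owner: input.owner, amount: change });
    }
    Some(outputs)
}

fn main() {
    let alice = Utxo { owner: "Alice".into(), amount: 10 };
    let outputs = spend(alice, "Bob", 7).expect("insufficient funds");
    // prints: [Utxo { owner: "Bob", amount: 7 }, Utxo { owner: "Alice", amount: 3 }]
    println!("{:?}", outputs);
}
```

Because `spend` takes the input by value, the consumed UTXO cannot be used again, which loosely mirrors the rule that an output is spent only once.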

Common Gotchas

  1. "statically linked" but still has deps? → Check for dlopen() calls that load libs at runtime

  2. Binary works on your machine, fails elsewhere? → Likely a glibc version mismatch; use musl for portability

  3. LLVM IR looks different each time? → Variable names are auto-generated; focus on structure

  4. Assembly uses AT&T syntax? → Use objdump -d -M intel for Intel syntax

  5. echo "hello" | sha256sum gives wrong hash? → Use echo -n to avoid hashing the newline character


Quick Debugging

# What's this binary?
file ./program

# What does it need?
ldd ./program

# What functions does it have?
nm ./program | grep ' T '

# Disassemble main
objdump -d ./program | grep -A 50 '<main>'

# System calls it makes
strace ./program 2>&1 | head

Mental Models

Source Code = Recipe          Transaction = Check
Compiler    = Chef            Block       = Ledger page
Assembly    = Detailed steps  Chain       = Bound ledger
Binary      = Pre-made meal   Mining      = Notarization
CPU         = The mouth       Consensus   = Auditor agreement