# Comparative Artifact Performance Evaluation for UPLC programs
A framework for measuring and comparing UPLC programs generated by different Cardano smart contract compilers.
- Overview
- Quick Start
- Live Performance Reports
- Available benchmark scenarios
- Usage (CLI)
- Creating a Submission
- Metrics Explained
- Project Structure
- Resources
- Version and Tooling Requirements
- Development
- Documentation (ADRs)
- Contributing
- License
- Acknowledgments
## Overview

UPLC-CAPE provides a structured, reproducible way for Cardano UPLC compiler authors and users to:
- Benchmark compiler UPLC output against standardized scenarios
- Compare results across compilers and versions
- Track optimization progress over time
- Share results with the community
Key properties:
- Consistent benchmarks and metrics (CPU units, memory units, script size, term size)
- Reproducible results with versioned scenarios and metadata
- Automation-ready structure for future tooling
## Quick Start

Prerequisites:

- Nix with flakes enabled
- Git
```bash
# Clone and enter repository
git clone https://github.com/IntersectMBO/UPLC-CAPE.git
cd UPLC-CAPE

# Enter development environment
nix develop
# Or, if using direnv (recommended)
direnv allow

# Verify CLI
scripts/cape.sh --help
# Or use the cape shim if available in PATH
cape --help
```

```bash
# List available benchmarks
cape benchmark list

# View a specific benchmark
cape benchmark fibonacci
cape benchmark two_party_escrow

# Generate JSON statistics for all benchmarks
cape benchmark stats

# Create a submission for your compiler
cape submission new fibonacci MyCompiler 1.0.0 myhandle
cape submission new two_party_escrow MyCompiler 1.0.0 myhandle
```

## Live Performance Reports

Latest benchmark reports: UPLC-CAPE Reports
Pull requests that modify submission data automatically get isolated preview sites for review:
- Preview URL pattern: https://intersectmbo.github.io/UPLC-CAPE/pr-<number>/
- Example: PR #42 → https://intersectmbo.github.io/UPLC-CAPE/pr-42/
- Trigger conditions: previews are generated only when `.uplc` or `metadata.json` files change in the `submissions/` directory
- Automatic updates: the preview refreshes on every push to the PR branch
- Automatic cleanup: the preview is removed when the PR is closed or merged
- Comment notification: a sticky comment appears on the PR with the direct preview link
Note: PRs that only modify documentation, README files, or code outside submissions/ will not trigger preview generation.
For implementation details, see ADR: PR Preview Deployment.
## Available benchmark scenarios

| Benchmark | Type | Description | Status |
|---|---|---|---|
| Fibonacci | Synthetic | Recursive algorithm performance | Ready |
| Fibonacci (Naive Recursion) | Synthetic | Prescribed naive recursive algorithm for compiler optimization comparison | Ready |
| Factorial | Synthetic | Recursive algorithm performance | Ready |
| Factorial (Naive Recursion) | Synthetic | Prescribed naive recursive algorithm for compiler optimization comparison | Ready |
| Two-Party Escrow | Real-world | Smart contract escrow validator | Ready |
| Linear Vesting | Real-world | Time-based token vesting validator | Ready |
| HTLC | Real-world | Hashed time-locked contract validator | Ready |
| Streaming Payments | Real-world | Payment channel implementation | Planned |
| Simple DAO Voting | Real-world | Governance mechanism | Planned |
| Time-locked Staking | Real-world | Staking protocol | Planned |
## Usage (CLI)

For the full and up-to-date command reference, see USAGE.md.
```bash
# Benchmarks
cape benchmark list              # List all benchmarks
cape benchmark <name>            # Show benchmark details
cape benchmark stats             # Generate JSON statistics for all benchmarks
cape benchmark new <name>        # Create a new benchmark from template

# Submissions
cape submission list             # List all submissions
cape submission list <name>      # List submissions for a benchmark
cape submission new <benchmark> <compiler> <version> <handle>
cape submission verify           # Verify correctness and validate schemas
cape submission measure          # Measure UPLC performance
cape submission aggregate        # Generate CSV performance report
cape submission report <name>    # Generate HTML report for a benchmark
cape submission report --all     # Generate HTML reports for all benchmarks
```
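As an illustration, a typical end-to-end flow for one submission chains these commands (all taken from the reference above; the benchmark, compiler name, and paths are placeholders):

```bash
# Scaffold, verify, measure, and report a single submission
cape submission new fibonacci MyCompiler 1.0.0 myhandle
# ... add your fibonacci.uplc and metadata.json ...
cape submission verify submissions/fibonacci/MyCompiler_1.0.0_myhandle
cape submission measure submissions/fibonacci/MyCompiler_1.0.0_myhandle
cape submission aggregate         # CSV performance report
cape submission report fibonacci  # HTML report for the benchmark
```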
The `cape benchmark stats` command generates comprehensive JSON data for all benchmarks:

```bash
# Output JSON statistics to console
cape benchmark stats

# Save to file
cape benchmark stats > stats.json

# Use with jq for filtering
cape benchmark stats | jq '.benchmarks[] | select(.submission_count > 0)'
```

The output includes formatted metrics, best value indicators, and submission metadata, making it ideal for generating custom reports or integrating with external tools.
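Building on the filter above, a downstream script might reduce the statistics further. Only the `benchmarks` array and the `submission_count` field appear in this document; any other field name would be an assumption about the JSON shape:

```bash
# Total number of submissions across all benchmarks
cape benchmark stats | jq '[.benchmarks[].submission_count] | add'

# Number of benchmarks with no submissions yet
cape benchmark stats | jq '[.benchmarks[] | select(.submission_count == 0)] | length'
```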
## Creating a Submission

1. Choose a benchmark

   ```bash
   cape benchmark list
   cape benchmark fibonacci
   ```
2. Create submission structure

   ```bash
   cape submission new fibonacci MyCompiler 1.0.0 myhandle
   # → submissions/fibonacci/MyCompiler_1.0.0_myhandle/
   ```
3. Add your UPLC program

   - Replace the placeholder UPLC with your fully-applied program (no parameters).
   - Path: submissions/fibonacci/MyCompiler_1.0.0_myhandle/fibonacci.uplc
   - The program should compute the scenario's required result deterministically within budget; a sketch of the expected shape follows.
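   For a concrete picture of that shape, here is a hypothetical stand-in (not a real implementation): a single fully-applied `program` term at Plutus Core version 1.1.0 that simply returns fibonacci(25) = 75025 as a constant. Validator-style scenarios instead reduce to the unit constant:

   ```bash
   # Hypothetical placeholder showing the expected file shape; a real
   # submission contains the compiler's actual fully-applied output.
   cat > fibonacci.uplc <<'EOF'
   (program 1.1.0 [ (lam n (con integer 75025)) (con integer 25) ])
   EOF
   ```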
4. Provide metadata

   Create `metadata.json` according to `submissions/TEMPLATE/metadata.schema.json` (see also `metadata-template.json`):

   ```json
   {
     "compiler": {
       "name": "MyCompiler",
       "version": "1.0.0",
       "commit_hash": "a1b2c3d4e5f6789012345678901234567890abcd"
     },
     "compilation_config": {
       "target": "uplc",
       "optimization_level": "O2",
       "flags": ["--inline-functions", "--optimize-recursion"]
     },
     "contributors": [
       {
         "name": "myhandle",
         "organization": "MyOrganization",
         "contact": "myhandle@example.com"
       }
     ],
     "submission": {
       "date": "2025-01-15T00:00:00Z",
       "source_available": true,
       "source_repository": "https://github.com/myorg/mycompiler-submissions",
       "source_commit_hash": "9876543210fedcba9876543210fedcba98765432",
       "implementation_notes": "Optimized recursive implementation using memoization. See source/ directory for full code and build instructions."
     }
   }
   ```

   For reproducibility, include:

   - `compiler.commit_hash`: exact compiler version used
   - `submission.source_repository` and `submission.source_commit_hash`: link to the source code at the exact commit
5. Verify and measure

   Use the unified verification command to ensure your submission is correct and schema-compliant, then measure performance.

   - Verify correctness and JSON schemas (all submissions or a path):

     ```bash
     cape submission verify submissions/fibonacci/MyCompiler_1.0.0_myhandle
     # or, verify everything
     cape submission verify --all
     ```

   - Measure and write metrics.json automatically:

     - Measure all .uplc files under a path (e.g., your submission directory):

       ```bash
       cape submission measure submissions/fibonacci/MyCompiler_1.0.0_myhandle
       # or, from inside the submission directory
       cape submission measure .
       ```

     - Measure every submission under submissions/:

       ```bash
       cape submission measure --all
       ```
   - What verification does:

     - Evaluates your UPLC program; if it reduces to `BuiltinUnit`, correctness passes
     - Otherwise, runs the comprehensive test suite defined in `scenarios/{benchmark}/cape-tests.json`
     - Validates your `metrics.json` and `metadata.json` against the schemas
   - What measure does automatically:

     - Measures CPU units, memory units, script size, and term size for your .uplc file(s)
     - Generates or updates a `metrics.json` with scenario, measurements, evaluator, and timestamp
     - Keeps your existing `notes` and `version` if present; otherwise fills sensible defaults
     - Works for a single file, a directory, or all submissions with `--all`
     - Produces output that validates against `submissions/TEMPLATE/metrics.schema.json`
   - Aggregation strategies: the measure tool runs multiple test cases per program and provides several aggregation methods for CPU and memory metrics:

     - `maximum`: peak resource usage across all test cases (useful for identifying worst-case performance)
     - `sum`: total computational work across all test cases (useful for overall efficiency comparison)
     - `minimum`: best-case resource usage (useful for identifying optimal performance)
     - `median`: typical resource usage (useful for understanding normal performance)
     - `sum_positive`: total resources for successful test cases only (valid execution cost)
     - `sum_negative`: total resources for failed test cases only (error-handling cost)

     Higher-level tooling can extract the most relevant aggregation for specific analysis needs.
   - Resulting file example:

     ```json
     {
       "scenario": "fibonacci",
       "version": "1.0.0",
       "measurements": {
         "cpu_units": {
           "maximum": 185916,
           "sum": 185916,
           "minimum": 185916,
           "median": 185916,
           "sum_positive": 185916,
           "sum_negative": 0
         },
         "memory_units": {
           "maximum": 592,
           "sum": 592,
           "minimum": 592,
           "median": 592,
           "sum_positive": 592,
           "sum_negative": 0
         },
         "script_size_bytes": 1234,
         "term_size": 45
       },
       "evaluations": [
         {
           "name": "fibonacci_25_computation",
           "description": "Pre-applied fibonacci(25) should return 75025",
           "cpu_units": 185916,
           "memory_units": 592,
           "execution_result": "success"
         }
       ],
       "execution_environment": {
         "evaluator": "plutus-core-executable-1.52.0.0"
       },
       "timestamp": "2025-01-15T00:00:00Z",
       "notes": "Optional notes."
     }
     ```
6. Document

   - Add notes to README.md inside your submission folder (implementation choices, optimizations, caveats).
## Metrics Explained

UPLC-CAPE collects both raw measurements (CPU, memory, script size, term size) and derived metrics (fees, budget utilization, capacity).
Quick Reference:
| Metric | Description | Type |
|---|---|---|
| CPU Units | Computational cost (CEK machine steps) | Raw measurement |
| Memory Units | Memory consumption (CEK machine memory) | Raw measurement |
| Script Size | Serialized UPLC size (bytes) | Raw measurement |
| Term Size | AST complexity (node count) | Raw measurement |
| Execution Fee | Runtime cost in lovelace | Derived (Conway) |
| Reference Script Fee | Storage cost in lovelace (tiered) | Derived (Conway) |
| Total Fee | Combined execution + storage cost | Derived (Conway) |
| Budget Utilization | % of tx/block budgets consumed | Derived (Conway) |
| Capacity (tx/block) | Max script executions per tx/block | Derived (Conway) |
📖 For comprehensive metrics documentation, see doc/metrics.md
This includes detailed formulas, protocol parameters, aggregation strategies, and interpretation guidelines.
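To make the derived metrics concrete, here is a worked sketch of the execution fee for the fibonacci measurements above. The prices are assumptions (recent Conway-era mainnet values, roughly 0.0000721 lovelace per CPU unit and 0.0577 lovelace per memory unit); doc/metrics.md has the authoritative formulas and parameters:

```bash
python3 - <<'EOF'
# Assumed protocol parameters -- check the current ones before relying on this.
PRICE_STEP = 0.0000721  # lovelace per CPU unit
PRICE_MEM = 0.0577      # lovelace per memory unit

cpu_units, mem_units = 185916, 592  # from the fibonacci metrics example
fee = cpu_units * PRICE_STEP + mem_units * PRICE_MEM
print(f"execution fee ~= {fee:.1f} lovelace")  # ~= 47.6 lovelace
EOF
```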
## Project Structure

```
UPLC-CAPE/
├── scenarios/ # Benchmark specifications
│ ├── TEMPLATE/ # Template for new scenarios
│ ├── fibonacci.md
│ ├── factorial.md
│ └── two_party_escrow.md
├── submissions/ # Compiler submissions (per scenario)
│ ├── TEMPLATE/ # Templates and schemas
│ │ ├── metadata.schema.json
│ │ ├── metadata-template.json
│ │ ├── metrics.schema.json
│ │ └── metrics-template.json
│ ├── fibonacci/
│ │ └── MyCompiler_1.0.0_handle/
│ └── two_party_escrow/
│ └── MyCompiler_1.0.0_handle/
├── scripts/ # Project CLI tooling
│ ├── cape.sh # Main CLI
│ └── cape-subcommands/ # Command implementations
├── lib/ # Haskell library code (validators, fixtures, utilities)
├── measure-app/ # UPLC program measurement tool
├── plinth-submissions-app/ # Plinth submission generator
├── test/ # Test suites
├── report/ # Generated HTML reports and assets
├── doc/ # Documentation
│ ├── domain-model.md
│ └── adr/
└── README.md
```
## Version and Tooling Requirements

- Development environment: Nix shell (`nix develop`) with optional direnv (`direnv allow`).
- GHC: 9.6.7 (provided in the Nix shell).
- Plutus Core target: 1.1.0.
  - Use `plcVersion110` (for Haskell/PlutusTx code).
- Package baselines (CHaP):
  - plutus-core >= 1.45.0.0
  - plutus-tx >= 1.45.0.0
  - plutus-ledger-api >= 1.45.0.0
  - plutus-tx-plugin >= 1.45.0.0
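If you build PlutusTx code outside the Nix shell, the baselines above translate to cabal constraints along these lines (a hypothetical sketch only; the repository's own cabal project files are authoritative):

```bash
# Hypothetical sketch: pinning the CHaP baselines in a consumer project.
cat >> cabal.project <<'EOF'
constraints:
  plutus-core >= 1.45.0.0,
  plutus-tx >= 1.45.0.0,
  plutus-ledger-api >= 1.45.0.0,
  plutus-tx-plugin >= 1.45.0.0
EOF
```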
## Development

Enter the environment:

```bash
nix develop
# or
direnv allow
```

Common tools:
- cape … (project CLI)
- cabal build (builds all Haskell components: library, executables, tests)
- treefmt (format all files, including UPLC)
- fourmolu (Haskell formatting)
- pretty-uplc (UPLC pretty-printing)
- adr (Architecture Decision Records)
- mmdc -i file.mmd (diagram generation, if available)
UPLC files can be pretty-printed for improved readability:
```bash
# Format a single UPLC file in place
pretty-uplc submissions/fibonacci/MyCompiler_1.0.0_handle/fibonacci.uplc

# Format all UPLC files (and other files) via treefmt
treefmt
```

The `treefmt` command automatically formats all file types, including UPLC files (`.uplc`). The `pretty-uplc` executable is built from the in-repo cape cabal project (see pretty-uplc-app/Main.hs); it parses each file with UntypedPlutusCore.Parser and re-renders it with PlutusCore.Pretty.prettyPlcClassic, matching the canonical format produced by Cape.WritePlc. It is shipped on PATH by the Nix development shell.
## Documentation (ADRs)

ADRs document important design decisions (managed with Log4brains).

Helpful commands:

```bash
adr new "Decision Title"
adr preview
adr build
adr help
```

## Contributing

We welcome contributions from compiler authors, benchmark designers, and researchers.
- Add a new benchmark:

  ```bash
  cape benchmark new my-new-benchmark
  # edit scenarios/my-new-benchmark.md
  ```

- Add a submission:

  ```bash
  cape submission new existing-benchmark MyCompiler 1.0.0 myhandle
  # fill in the .uplc and .json files, then open a PR
  ```
Please read CONTRIBUTING.md before opening a PR.
## License

Licensed under the Apache License 2.0. See LICENSE.
## Acknowledgments

- Plutus Core team for infrastructure and reference implementations
- Compiler authors and community contributors
