# Comparative Artifact Performance Evaluation for UPLC programs
A framework for measuring and comparing UPLC programs generated by different Cardano smart contract compilers.
- Overview
- Quick Start
- Live Performance Reports
- Available benchmark scenarios
- Usage (CLI)
- Creating a Submission
- Metrics Explained
- Project Structure
- Resources
- Version and Tooling Requirements
- Development
- Documentation (ADRs)
- Contributing
- License
- Acknowledgments
## Overview

UPLC-CAPE provides a structured, reproducible way for Cardano UPLC compiler authors and users to:
- Benchmark compiler UPLC output against standardized scenarios
- Compare results across compilers and versions
- Track optimization progress over time
- Share results with the community
Key properties:
- Consistent benchmarks and metrics (CPU units, memory units, script size, term size)
- Reproducible results with versioned scenarios and metadata
- Automation-ready structure for future tooling
## Quick Start

Prerequisites:

- Nix with flakes enabled
- Git
```bash
# Clone and enter repository
git clone https://github.com/IntersectMBO/UPLC-CAPE.git
cd UPLC-CAPE

# Enter development environment
nix develop
# Or, if using direnv (recommended)
direnv allow

# Verify CLI
scripts/cape.sh --help
# Or use the cape shim if available in PATH
cape --help
```

```bash
# List available benchmarks
cape benchmark list

# View a specific benchmark
cape benchmark fibonacci
cape benchmark two_party_escrow

# Generate JSON statistics for all benchmarks
cape benchmark stats

# Create a submission for your compiler
cape submission new fibonacci MyCompiler 1.0.0 myhandle
cape submission new two_party_escrow MyCompiler 1.0.0 myhandle
```

## Live Performance Reports

Latest benchmark reports: UPLC-CAPE Reports
Pull requests that modify submission data automatically get isolated preview sites for review:
- Preview URL pattern: https://intersectmbo.github.io/UPLC-CAPE/pr-<number>/
- Example: PR #42 → https://intersectmbo.github.io/UPLC-CAPE/pr-42/
- Trigger conditions: previews are generated only when `.uplc` or `metadata.json` files change in the `submissions/` directory
- Automatic updates: the preview refreshes on every push to the PR branch
- Automatic cleanup: the preview is removed when the PR is closed or merged
- Comment notification: a sticky comment appears on the PR with the direct preview link
Note: PRs that only modify documentation, README files, or code outside submissions/ will not trigger preview generation.
For implementation details, see ADR: PR Preview Deployment.
## Available benchmark scenarios

| Benchmark | Type | Description | Status |
|---|---|---|---|
| Fibonacci | Synthetic | Recursive algorithm performance | Ready |
| Fibonacci (Naive Recursion) | Synthetic | Prescribed naive recursive algorithm for compiler optimization comparison | Ready |
| Factorial | Synthetic | Recursive algorithm performance | Ready |
| Factorial (Naive Recursion) | Synthetic | Prescribed naive recursive algorithm for compiler optimization comparison | Ready |
| Two-Party Escrow | Real-world | Smart contract escrow validator | Ready |
| Linear Vesting | Real-world | Time-based token vesting validator | Ready |
| HTLC | Real-world | Hashed time-locked contract validator | Ready |
| Streaming Payments | Real-world | Payment channel implementation | Planned |
| Simple DAO Voting | Real-world | Governance mechanism | Planned |
| Time-locked Staking | Real-world | Staking protocol | Planned |
## Usage (CLI)

For the full and up-to-date command reference, see USAGE.md.
```bash
# Benchmarks
cape benchmark list              # List all benchmarks
cape benchmark <name>            # Show benchmark details
cape benchmark stats             # Generate JSON statistics for all benchmarks
cape benchmark new <name>        # Create a new benchmark from template

# Submissions
cape submission list             # List all submissions
cape submission list <name>      # List submissions for a benchmark
cape submission new <benchmark> <compiler> <version> <handle>
cape submission verify           # Verify correctness and validate schemas
cape submission measure          # Measure UPLC performance
cape submission aggregate        # Generate CSV performance report
cape submission report <name>    # Generate HTML report for a benchmark
cape submission report --all     # Generate HTML reports for all benchmarks
```
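As an illustration, a typical end-to-end flow for one submission chains these commands (all taken from the reference above; the benchmark, compiler name, and paths are placeholders):

```bash
# Scaffold, verify, measure, and report a single submission
cape submission new fibonacci MyCompiler 1.0.0 myhandle
# ... add your fibonacci.uplc and metadata.json ...
cape submission verify submissions/fibonacci/MyCompiler_1.0.0_myhandle
cape submission measure submissions/fibonacci/MyCompiler_1.0.0_myhandle
cape submission aggregate         # CSV performance report
cape submission report fibonacci  # HTML report for the benchmark
```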
The `cape benchmark stats` command generates comprehensive JSON data for all benchmarks:

```bash
# Output JSON statistics to console
cape benchmark stats

# Save to file
cape benchmark stats > stats.json

# Use with jq for filtering
cape benchmark stats | jq '.benchmarks[] | select(.submission_count > 0)'
```

The output includes formatted metrics, best value indicators, and submission metadata, making it ideal for generating custom reports or integrating with external tools.
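Building on the filter above, a downstream script might reduce the statistics further. Only the `benchmarks` array and the `submission_count` field appear in this document; any other field name would be an assumption about the JSON shape:

```bash
# Total number of submissions across all benchmarks
cape benchmark stats | jq '[.benchmarks[].submission_count] | add'

# Number of benchmarks with no submissions yet
cape benchmark stats | jq '[.benchmarks[] | select(.submission_count == 0)] | length'
```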
## Creating a Submission

1. Choose a benchmark

   ```bash
   cape benchmark list
   cape benchmark fibonacci
   ```
2. Create submission structure

   ```bash
   cape submission new fibonacci MyCompiler 1.0.0 myhandle
   # → submissions/fibonacci/MyCompiler_1.0.0_myhandle/
   ```
3. Add your UPLC program

   - Replace the placeholder UPLC with your fully-applied program (no parameters).
   - Path: submissions/fibonacci/MyCompiler_1.0.0_myhandle/fibonacci.uplc
   - The program should compute the scenario's required result deterministically within budget; a sketch of the expected shape follows.
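   For a concrete picture of that shape, here is a hypothetical stand-in (not a real implementation): a single fully-applied `program` term at Plutus Core version 1.1.0 that simply returns fibonacci(25) = 75025 as a constant. Validator-style scenarios instead reduce to the unit constant:

   ```bash
   # Hypothetical placeholder showing the expected file shape; a real
   # submission contains the compiler's actual fully-applied output.
   cat > fibonacci.uplc <<'EOF'
   (program 1.1.0 [ (lam n (con integer 75025)) (con integer 25) ])
   EOF
   ```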
4. Provide metadata

   Create `metadata.json` according to `submissions/TEMPLATE/metadata.schema.json` (see also `metadata-template.json`):

   ```json
   {
     "compiler": {
       "name": "MyCompiler",
       "version": "1.0.0",
       "commit_hash": "a1b2c3d4e5f6789012345678901234567890abcd"
     },
     "compilation_config": {
       "target": "uplc",
       "optimization_level": "O2",
       "flags": ["--inline-functions", "--optimize-recursion"]
     },
     "contributors": [
       {
         "name": "myhandle",
         "organization": "MyOrganization",
         "contact": "myhandle@example.com"
       }
     ],
     "submission": {
       "date": "2025-01-15T00:00:00Z",
       "source_available": true,
       "source_repository": "https://github.com/myorg/mycompiler-submissions",
       "source_commit_hash": "9876543210fedcba9876543210fedcba98765432",
       "implementation_notes": "Optimized recursive implementation using memoization. See source/ directory for full code and build instructions."
     }
   }
   ```

   For reproducibility, include:

   - `compiler.commit_hash`: exact compiler version used
   - `submission.source_repository` and `submission.source_commit_hash`: link to the source code at the exact commit
5. Verify and measure

   Use the unified verification command to ensure your submission is correct and schema-compliant, then measure performance.

   - Verify correctness and JSON schemas (all submissions or a path):

     ```bash
     cape submission verify submissions/fibonacci/MyCompiler_1.0.0_myhandle
     # or, verify everything
     cape submission verify --all
     ```

   - Measure and write metrics.json automatically:

     - Measure all .uplc files under a path (e.g., your submission directory):

       ```bash
       cape submission measure submissions/fibonacci/MyCompiler_1.0.0_myhandle
       # or, from inside the submission directory
       cape submission measure .
       ```

     - Measure every submission under submissions/:

       ```bash
       cape submission measure --all
       ```
   - What verification does:

     - Evaluates your UPLC program; if it reduces to `BuiltinUnit`, correctness passes
     - Otherwise, runs the comprehensive test suite defined in `scenarios/{benchmark}/cape-tests.json`
     - Validates your `metrics.json` and `metadata.json` against the schemas
   - What measure does automatically:

     - Measures CPU units, memory units, script size, and term size for your .uplc file(s)
     - Generates or updates a `metrics.json` with scenario, measurements, evaluator, and timestamp
     - Keeps your existing `notes` and `version` if present; otherwise fills sensible defaults
     - Works for a single file, a directory, or all submissions with `--all`
     - Produces output that validates against `submissions/TEMPLATE/metrics.schema.json`
   - Aggregation strategies: the measure tool runs multiple test cases per program and provides several aggregation methods for CPU and memory metrics:

     - `maximum`: peak resource usage across all test cases (useful for identifying worst-case performance)
     - `sum`: total computational work across all test cases (useful for overall efficiency comparison)
     - `minimum`: best-case resource usage (useful for identifying optimal performance)
     - `median`: typical resource usage (useful for understanding normal performance)
     - `sum_positive`: total resources for successful test cases only (valid execution cost)
     - `sum_negative`: total resources for failed test cases only (error-handling cost)

     Higher-level tooling can extract the most relevant aggregation for specific analysis needs.
   - Resulting file example:

     ```json
     {
       "scenario": "fibonacci",
       "version": "1.0.0",
       "measurements": {
         "cpu_units": {
           "maximum": 185916,
           "sum": 185916,
           "minimum": 185916,
           "median": 185916,
           "sum_positive": 185916,
           "sum_negative": 0
         },
         "memory_units": {
           "maximum": 592,
           "sum": 592,
           "minimum": 592,
           "median": 592,
           "sum_positive": 592,
           "sum_negative": 0
         },
         "script_size_bytes": 1234,
         "term_size": 45
       },
       "evaluations": [
         {
           "name": "fibonacci_25_computation",
           "description": "Pre-applied fibonacci(25) should return 75025",
           "cpu_units": 185916,
           "memory_units": 592,
           "execution_result": "success"
         }
       ],
       "execution_environment": {
         "evaluator": "plutus-core-executable-1.52.0.0"
       },
       "timestamp": "2025-01-15T00:00:00Z",
       "notes": "Optional notes."
     }
     ```
6. Document

   - Add notes to README.md inside your submission folder (implementation choices, optimizations, caveats).
## Metrics Explained

UPLC-CAPE collects both raw measurements (CPU, memory, script size, term size) and derived metrics (fees, budget utilization, capacity).
Quick Reference:
| Metric | Description | Type |
|---|---|---|
| CPU Units | Computational cost (CEK machine steps) | Raw measurement |
| Memory Units | Memory consumption (CEK machine memory) | Raw measurement |
| Script Size | Serialized UPLC size (bytes) | Raw measurement |
| Term Size | AST complexity (node count) | Raw measurement |
| Execution Fee | Runtime cost in lovelace | Derived (Conway) |
| Reference Script Fee | Storage cost in lovelace (tiered) | Derived (Conway) |
| Total Fee | Combined execution + storage cost | Derived (Conway) |
| Budget Utilization | % of tx/block budgets consumed | Derived (Conway) |
| Capacity (tx/block) | Max script executions per tx/block | Derived (Conway) |
📖 For comprehensive metrics documentation, see doc/metrics.md
This includes detailed formulas, protocol parameters, aggregation strategies, and interpretation guidelines.
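To make the derived metrics concrete, here is a worked sketch of the execution fee for the fibonacci measurements above. The prices are assumptions (recent Conway-era mainnet values, roughly 0.0000721 lovelace per CPU unit and 0.0577 lovelace per memory unit); doc/metrics.md has the authoritative formulas and parameters:

```bash
python3 - <<'EOF'
# Assumed protocol parameters -- check the current ones before relying on this.
PRICE_STEP = 0.0000721  # lovelace per CPU unit
PRICE_MEM = 0.0577      # lovelace per memory unit

cpu_units, mem_units = 185916, 592  # from the fibonacci metrics example
fee = cpu_units * PRICE_STEP + mem_units * PRICE_MEM
print(f"execution fee ~= {fee:.1f} lovelace")  # ~= 47.6 lovelace
EOF
```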
## Project Structure

```
UPLC-CAPE/
├── scenarios/ # Benchmark specifications
│ ├── TEMPLATE/ # Template for new scenarios
│ ├── fibonacci.md
│ ├── factorial.md
│ └── two_party_escrow.md
├── submissions/ # Compiler submissions (per scenario)
│ ├── TEMPLATE/ # Templates and schemas
│ │ ├── metadata.schema.json
│ │ ├── metadata-template.json
│ │ ├── metrics.schema.json
│ │ └── metrics-template.json
│ ├── fibonacci/
│ │ └── MyCompiler_1.0.0_handle/
│ └── two_party_escrow/
│ └── MyCompiler_1.0.0_handle/
├── scripts/ # Project CLI tooling
│ ├── cape.sh # Main CLI
│ └── cape-subcommands/ # Command implementations
├── lib/ # Haskell library code (validators, fixtures, utilities)
├── measure-app/ # UPLC program measurement tool
├── plinth-submissions-app/ # Plinth submission generator
├── test/ # Test suites
├── report/ # Generated HTML reports and assets
├── doc/ # Documentation
│ ├── domain-model.md
│ └── adr/
└── README.md
```
## Version and Tooling Requirements

- Development environment: Nix shell (`nix develop`) with optional direnv (`direnv allow`).
- GHC: 9.6.7 (provided in the Nix shell).
- Plutus Core target: 1.1.0.
  - Use `plcVersion110` (for Haskell/PlutusTx code).
- Package baselines (CHaP):
  - plutus-core >= 1.45.0.0
  - plutus-tx >= 1.45.0.0
  - plutus-ledger-api >= 1.45.0.0
  - plutus-tx-plugin >= 1.45.0.0
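If you build PlutusTx code outside the Nix shell, the baselines above translate to cabal constraints along these lines (a hypothetical sketch only; the repository's own cabal project files are authoritative):

```bash
# Hypothetical sketch: pinning the CHaP baselines in a consumer project.
cat >> cabal.project <<'EOF'
constraints:
  plutus-core >= 1.45.0.0,
  plutus-tx >= 1.45.0.0,
  plutus-ledger-api >= 1.45.0.0,
  plutus-tx-plugin >= 1.45.0.0
EOF
```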
## Development

Enter the environment:

```bash
nix develop
# or
direnv allow
```

Common tools:
- cape … (project CLI)
- cabal build (builds all Haskell components: library, executables, tests)
- treefmt (format all files, including UPLC)
- fourmolu (Haskell formatting)
- pretty-uplc (UPLC pretty-printing)
- adr (Architecture Decision Records)
- mmdc -i file.mmd (diagram generation, if available)
UPLC files can be pretty-printed for improved readability:
```bash
# Format a single UPLC file in place
pretty-uplc submissions/fibonacci/MyCompiler_1.0.0_handle/fibonacci.uplc

# Format all UPLC files (and other files) via treefmt
treefmt
```

The `treefmt` command automatically formats all file types, including UPLC files (`.uplc`). The `pretty-uplc` executable is built from the in-repo cape cabal project (see pretty-uplc-app/Main.hs); it parses each file with UntypedPlutusCore.Parser and re-renders it with PlutusCore.Pretty.prettyPlcClassic, matching the canonical format produced by Cape.WritePlc. It is shipped on PATH by the Nix development shell.
## Documentation (ADRs)

ADRs document important design decisions (managed with Log4brains).

Helpful commands:

```bash
adr new "Decision Title"
adr preview
adr build
adr help
```

## Contributing

We welcome contributions from compiler authors, benchmark designers, and researchers.
- Add a new benchmark:

  ```bash
  cape benchmark new my-new-benchmark
  # edit scenarios/my-new-benchmark.md
  ```

- Add a submission:

  ```bash
  cape submission new existing-benchmark MyCompiler 1.0.0 myhandle
  # fill in the .uplc and .json files, then open a PR
  ```
Please read CONTRIBUTING.md before opening a PR.
## License

Licensed under the Apache License 2.0. See LICENSE.
## Acknowledgments

- Plutus Core team for infrastructure and reference implementations
- Compiler authors and community contributors
