This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
The project uses Pixi for package management and task automation. Key commands:
- `pixi run ci` - Run complete CI pipeline (format, lint, test with coverage)
- `pixi run test` - Run pytest test suite
- `pixi run format` - Format code with ruff
- `pixi run lint` - Run ruff linting and pylint
- `pixi run coverage` - Run tests with coverage reporting
- `pixi run generate-docs` - Generate documentation from examples
- `pixi run demo` - Run demo example (example_image.py)
- `pixi run agent-iterate` - Full CI cycle for AI agents (includes docs, tests, commits, and fixes)
To test a specific example: `pixi run python bencher/example/example_simple_float.py`
- ALWAYS use the pixi environment for every command. Never run raw `python`, `pytest`, `ruff`, or any other tool directly — always prefix with `pixi run` (e.g. `pixi run python ...`, `pixi run pytest ...`). This ensures the correct dependencies and environment are used.
See `docs/how_to_use_bencher.md` for the complete guide on using bencher — sweep types,
result types, the `benchmark()` pattern, plot callbacks, and common mistakes. Read this
before writing any benchmark or example.
- Add or modify the example implementation under `bencher/example/`.
- Register the example in `bencher/example/meta/generate_examples.py` so the documentation generator emits a notebook (pick an appropriate gallery subdirectory).
- Run `pixi run generate-docs` to regenerate gallery notebooks.
- Update relevant user docs (for instance `docs/intro.md` or gallery sections) to mention the new example.
- Make sure `conf.py` includes the docs that are added.
- Execute `pixi run ci` before committing to ensure formatting, linting, and tests all pass.
Bencher is a benchmarking framework built around these core concepts:
- Bench: Main benchmarking class that orchestrates parameter sweeps and result collection
- BenchRunner: Higher-level interface for managing multiple benchmark runs
- BenchCfg/BenchRunCfg: Configuration classes for benchmark setup and execution
- ParametrizedSweep: Base class for defining parameter sweep configurations
- Uses the `param` library for parameter definitions with metadata
- Sweep Classes: `IntSweep`, `FloatSweep`, `StringSweep`, `EnumSweep`, `BoolSweep`
- Parameters define search spaces with bounds and sampling strategies
- Results stored in N-dimensional xarray structures
- BenchResult: Container for benchmark results and visualizations
- HoloviewResult: Base class for interactive plots (scatter, line, heatmap, etc.)
- ComposableContainer: Framework for combining multiple result types
- Video/Image Results: Support for multimedia outputs
- Results automatically cached using diskcache based on parameter hashes
- Result type selection: Use `ResultBool` for binary outcomes (success/failure), `ResultFloat` for continuous metrics, `ResultString` for text, `ResultImage`/`ResultVideo`/`ResultPath` for files
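Results are cached based on parameter hashes. The sketch below shows the general idea with a hand-rolled key function; `param_cache_key` is illustrative only and not part of Bencher's API — Bencher's real hashing lives inside its diskcache-backed caching layer.

```python
import hashlib
import json

def param_cache_key(params: dict) -> str:
    """Build a deterministic cache key from a parameter dict.

    Keys are sorted so that dict insertion order does not affect the hash.
    (Illustrative only; not Bencher's actual implementation.)
    """
    canonical = json.dumps(params, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

key_a = param_cache_key({"x": 1.0, "samples": 10})
key_b = param_cache_key({"samples": 10, "x": 1.0})  # same params, different order
key_c = param_cache_key({"samples": 10, "x": 2.0})  # different value

assert key_a == key_b
assert key_a != key_c
```

The same key is produced for equal parameter sets regardless of dict order, so a cached result can be reused on any re-run with identical inputs.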
- Define parameter sweep configuration class inheriting from ParametrizedSweep
- Implement benchmark function that takes config instance, returns metrics dict
- Bench calculates the Cartesian product of all parameter combinations
- Each combination executed (with caching), results stored in N-D tensor
- Automatic plot type deduction based on parameter/result types
- Results cached persistently for reuse
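The sweep-execution steps above can be sketched in plain Python. The sweep definition, `benchmark` function, and cost metric here are hypothetical stand-ins, not Bencher API; the point is the Cartesian expansion of the parameter space that `Bench` performs internally.

```python
import itertools

# Hypothetical sweep: each parameter maps to its candidate values.
sweep = {
    "threads": [1, 2, 4],
    "algo": ["fast", "accurate"],
}

def benchmark(cfg: dict) -> dict:
    # Stand-in metric; a real benchmark function would measure something.
    return {"cost": cfg["threads"] * (2 if cfg["algo"] == "accurate" else 1)}

# Cartesian product of all parameter combinations.
names = list(sweep)
results = {}
for combo in itertools.product(*sweep.values()):
    cfg = dict(zip(names, combo))
    results[combo] = benchmark(cfg)

print(len(results))  # 3 floats x 2 categories = 6 combinations
```

Bencher stores these results in an N-dimensional xarray structure (one axis per swept parameter) rather than a flat dict, which is what enables automatic plot-type deduction.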
- `bencher/` - Main package source
- `bencher/example/` - Comprehensive examples organized by input dimensions
- `bencher/variables/` - Parameter sweep and result variable definitions
- `bencher/results/` - Result containers and visualization classes
- `test/` - Test suite
- `pyproject.toml` - Project dependencies and Pixi task definitions
- `ruff.toml` - Code formatting/linting configuration
- Line length limit: 100 characters (configured in ruff.toml)
- Uses pytest framework
- Coverage reporting with coverage.py
- Examples serve as integration tests
- Meta-generated examples in `bencher/example/meta/`
All auto-generated examples live under bencher/example/generated/. Each filename must
be globally unique across the entire generated tree — no two files may share a basename
even if they are in different subdirectories. This is required because the documentation
build uses filenames as RST page stems and thumbnail identifiers.
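The uniqueness rule above is easy to verify mechanically. The helper below is a hypothetical checker (not part of the repository's tooling) that reports any `.py` basename appearing more than once under a directory tree:

```python
from collections import Counter
from pathlib import Path

def duplicate_basenames(root: str) -> list[str]:
    """Return .py basenames that occur more than once anywhere under root."""
    counts = Counter(p.name for p in Path(root).rglob("*.py"))
    return sorted(name for name, n in counts.items() if n > 1)

# Demo on a throwaway tree with a deliberate clash:
import tempfile
with tempfile.TemporaryDirectory() as root:
    for sub in ("plot_types", "bool_plot_types"):
        d = Path(root, sub)
        d.mkdir()
        (d / "example_plot_line.py").touch()  # same basename in two subdirs
    print(duplicate_basenames(root))  # ['example_plot_line.py']
```

Running it over `bencher/example/generated/` before committing would catch a clash before the documentation build turns it into two RST pages with the same stem.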
Pattern: `{section_prefix}{descriptive_dimensions}.py`

Every filename uses a section prefix from the table below (each already includes
the `example_` prefix), followed by the varying dimensions encoded in the name:
| Section Prefix | Output Directory | Varying Dimensions |
|---|---|---|
| `example_sweep_` | `{N}_float/{variant}/` | float count, cat count, variant |
| `example_plot_` | `plot_types/` | plot type |
| `example_bool_plot_` | `bool_plot_types/` | plot type |
| `example_result_` | `result_types/result_{type}/` | result type, input dims |
| `example_composable_` | `composable_containers/` | backend, compose type |
| `example_sampling_` | `sampling/` | strategy |
| `example_stats_` | `statistics/` | variant |
| `example_const_vars_` | `const_vars/` | example |
| `example_optim_` | `optimization*/` | objectives, dims, over_time |
| `example_advanced_` | `advanced/` | example |
| `example_workflow_` | `workflows/` | example |
| `example_perf_` | `performance/` | variant |
| `example_regression_` | `regression/` | variant |
| `example_yaml_` | `yaml/` | format |
| `example_publish_` | `publishing/` | example |
| `example_rerun_` | `rerun/` | example |
| `example_agg_` | `aggregation/` | aggregation form, agg_fn |
| `example_cartesian_` | `cartesian_animation/` | (single example) |
| `example_container_tab_` | `container_tabs/` | layout mode |
| `example_levels_` | `levels/` | variant |
Rules for adding new generators:

- Every filename must start with `example_` followed by a unique section prefix.
- Encode every varying dimension in the filename — never rely on the folder path alone.
- The Python function inside the generated file must also start with `example_` (the test harness and doc builder use this prefix for discovery).
- Register the generator in `generate_examples.py:generate_python_files()` and add corresponding entries to `SECTION_GROUPS` for gallery placement.
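The naming rules above lend themselves to a simple lint pass. The checker below is a hypothetical sketch (not part of `generate_examples.py`) that flags a generated file whose filename or function name violates the `example_` prefix convention:

```python
import re

# Lowercase snake_case filename starting with example_, ending in .py.
FILENAME_RE = re.compile(r"^example_[a-z0-9_]+\.py$")

def check_generated_example(filename: str, function_name: str) -> list[str]:
    """Return a list of naming violations (empty list means the names pass)."""
    problems = []
    if not FILENAME_RE.match(filename):
        problems.append(f"filename {filename!r} must match 'example_*.py'")
    if not function_name.startswith("example_"):
        problems.append(f"function {function_name!r} must start with 'example_'")
    return problems

assert check_generated_example("example_plot_heatmap.py", "example_plot_heatmap") == []
assert check_generated_example("plot_heatmap.py", "run_demo") != []
```

A check like this could run inside the test harness so a mis-named generator fails CI rather than silently producing an undiscoverable example.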