chuck is a computational utility that solves 10 high-value data and algorithm tasks.
Benchmarks and regression checks are included to measure how fast each capability runs and where its bottlenecks occur.
For a quick setup, see docs/INSTALLATION.md.
- built as a toolkit for 10 important compute/data tasks
- main focus is solving workloads fast
- prioritizes throughput and latency, and uses probabilistic methods where the speed gain is worth the cost in reliability
- keeps generators deterministic so outputs stay testable and snapshot/regression comparisons remain meaningful
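
The determinism point above can be illustrated with a seeded generator. This is a minimal sketch, not chuck's actual generator code: the function name `generate_workload` and its parameters are hypothetical, and it only assumes that a generator takes an explicit seed.

```python
import random


def generate_workload(seed: int, size: int) -> list[int]:
    """Return a reproducible list of pseudo-random integers (hypothetical helper)."""
    # A private Random instance keeps this independent of the global RNG state,
    # so unrelated calls to random.* cannot change the generated data.
    rng = random.Random(seed)
    return [rng.randrange(1_000_000) for _ in range(size)]


first = generate_workload(seed=42, size=5)
second = generate_workload(seed=42, size=5)
print(first == second)  # True: the same seed yields the same data
```

Because the input data is identical across runs, any difference between two snapshots can be attributed to the solver or backend rather than to the workload.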
| Capability | Type | Notes |
|---|---|---|
| `io_pipeline` | deterministic | streaming ingest/transform/aggregate |
| `ordering_core` | deterministic | large-scale sorting and merge paths |
| `retrieval_core` | probabilistic | sampled indexing for faster query estimates |
| `data_encoding` | deterministic | compression and round-trip integrity |
| `graph_analytics` | deterministic | iterative graph scoring workload |
| `prime_analytics` | probabilistic | Miller–Rabin based prime discovery |
| `memory_tier` | deterministic | cache/tier behavior simulation |
| `memory_index` | probabilistic | Bloom-filter style membership checks |
| `compute_core` | deterministic | dense numeric compute kernels |
| `relational_fusion` | deterministic | join + aggregation fusion |
Each task has a short design note in docs/capabilities/ with simple algorithm details.
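
As an illustration of what a probabilistic capability trades away, here is a minimal sketch of the Miller–Rabin test that `prime_analytics` is described as using. It is a textbook version, not the project's solver: the function name, the `rounds` default, and the `seed` parameter are assumptions made for this example.

```python
import random


def is_probable_prime(n: int, rounds: int = 20, seed: int = 0) -> bool:
    """Miller-Rabin: False means composite, True means probably prime."""
    if n < 2:
        return False
    # Cheap trial division also guarantees n >= 17 below.
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    rng = random.Random(seed)  # seeded so repeated runs give the same answer
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)  # random base in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True
```

A prime always passes. A composite passes a single round with probability at most 1/4 for a randomly chosen base, so the error bound shrinks geometrically with `rounds`. That bound is the kind of reliability figure a probabilistic task can report.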
Python is the default runtime. Native acceleration is organized per task with a C++-first structure:
- each task owns its native entry at `chuck/tasks/<task>/native_cpp/binding.cpp`
- Python fallback dispatch: `chuck/native_bindings.py`
Runtime behavior:
- if per-task C++ modules in `native/cpp/build/` (for example `chuck_cpp_prime_analytics`) are present, `chuck` can use them
- otherwise it automatically falls back to Python solvers
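
The fallback behavior described above follows a common try-import pattern. The sketch below shows that pattern only and is not the contents of `chuck/native_bindings.py`: the helper names `load_native` and `pick_backend` are hypothetical, and the `chuck_cpp_<task>` naming scheme is an assumption extrapolated from the single example `chuck_cpp_prime_analytics`.

```python
import importlib
from types import ModuleType
from typing import Optional


def load_native(module_name: str) -> Optional[ModuleType]:
    """Return the compiled module if it can be imported, else None."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # Module not built or not on sys.path: signal that a fallback is needed.
        return None


def pick_backend(task: str) -> str:
    """Choose "cpp" when a native module is importable, otherwise "python"."""
    # Assumed naming scheme, inferred from chuck_cpp_prime_analytics.
    native = load_native(f"chuck_cpp_{task}")
    return "cpp" if native is not None else "python"


print(pick_backend("prime_analytics"))
```

On a machine where the native modules have not been built, or where `native/cpp/build/` is not on the import path, this prints `python`.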
Build the native modules with the helper script shown in docs/INSTALLATION.md: `python scripts/setup_native.py`.
If you use Windows, use WSL2 and follow the Linux helper path.
For native build details and local verification, see docs/NATIVE_BINDINGS.md.
    python -m chuck generate-baselines
    python -m chuck regress
    python -m chuck bench
    python -m chuck bench --task prime_analytics

Detailed hands-on usage (task-by-task run, Python vs C++ compare, snapshots):
Tasks marked probabilistic expose a confidence value alongside their results.
This allows faster average-case execution while still reporting the expected quality of the output.
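
To make "reporting expected quality" concrete, here is a minimal sketch of a Bloom filter that can state its own expected false-positive rate, in the spirit of the `memory_index` capability. The class and method names are hypothetical, the hashing scheme (salted SHA-256) is chosen for clarity rather than speed, and nothing here is taken from chuck's implementation.

```python
import hashlib
import math


class BloomFilter:
    """Set-membership sketch: no false negatives, tunable false positives."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability
        self.count = 0  # number of add() calls

    def _positions(self, item: str):
        # Derive num_hashes positions by salting the item with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1
        self.count += 1

    def might_contain(self, item: str) -> bool:
        return all(self.bits[pos] for pos in self._positions(item))

    def expected_false_positive_rate(self) -> float:
        # Standard approximation: (1 - e^(-k*n/m))^k
        k, m, n = self.num_hashes, self.num_bits, self.count
        return (1.0 - math.exp(-k * n / m)) ** k


bf = BloomFilter(num_bits=1024, num_hashes=3)
for word in ("alpha", "beta", "gamma"):
    bf.add(word)
print(bf.might_contain("alpha"), bf.expected_false_positive_rate())
```

A lookup touches only `num_hashes` positions regardless of how many items were inserted. The price is a nonzero chance of a false positive, which the filter can estimate from its own parameters.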
Snapshot comparisons continue to report both:
- speed changes
- reliability score changes
So trade-offs stay explicit and measurable.
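
A comparison that reports both quantities might look like the sketch below. The snapshot layout used here (a mapping from task name to `seconds` and `reliability`) is a hypothetical schema invented for illustration, not the format written by `python -m chuck snapshot`, and the numbers are made-up example values.

```python
def compare_snapshots(old: dict, new: dict) -> dict:
    """Report speedup and reliability change for tasks present in both snapshots."""
    report = {}
    for task in sorted(old.keys() & new.keys()):
        report[task] = {
            # > 1.0 means the new run is faster.
            "speedup": old[task]["seconds"] / new[task]["seconds"],
            # < 0.0 means the new run is less reliable.
            "reliability_delta": new[task]["reliability"] - old[task]["reliability"],
        }
    return report


# Illustrative values only, not measured results.
old = {"prime_analytics": {"seconds": 2.0, "reliability": 1.0}}
new = {"prime_analytics": {"seconds": 1.0, "reliability": 0.75}}
print(compare_snapshots(old, new))
```

Keeping the two numbers side by side means a speedup bought with lower reliability shows up as such instead of looking like a pure win.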
    python -m chuck snapshot --label old --backend python
    python -m chuck snapshot --label current --backend cpp
    python -m chuck compare --old data/reports/snapshots/old.json --new data/reports/snapshots/current.json
    python -m chuck verify-snapshot --snapshot data/reports/snapshots/current.json

`snapshot` and `compare --run-current` support `--backend {auto,python,cpp}`.
| Path | Role |
|---|---|
| `chuck/tasks/` | task generators + solvers |
| `chuck/benchmarks/` | per-task benchmark entrypoints |
| `chuck/native_bindings.py` | Python/native backend dispatch |
| `chuck/comparison.py` | snapshot + A/B comparison + verification |
| `data/<capability>/regression.json` | per-task baseline snapshots |
| `docs/capabilities/` | brief docs for each task |
| `chuck/tasks/<task>/native_cpp/binding.cpp` | task-owned C++ native entrypoint |
If you want to change code or docs, see CONTRIBUTING.md.
- https://staff.science.uva.nl/r.dehaan/complexity2021/files/lecture10.pdf
- https://www.cs.upc.edu/~mjserna/docencia/gm-aic/2021/12-AiC-Proba.pdf
- https://medium.com/pythoneers/randomized-algorithms-and-probabilistic-data-structures-f78691e2991d
- http://www.cs.man.ac.uk/~david/courses/advalgorithms/probabilistic.pdf
