Add runnable V1/V1.5 dependency-tree generation pipeline #1

Open
Vic1025 wants to merge 1 commit into main from add-v1-implementation

Conversation

Vic1025 commented Jun 17, 2026
Owner

This PR fills in the previously code-less repo with a full, runnable reference implementation of the documented V1 / V1.5 Math Word Problem Generator pipeline.

What's included

Pure standard-library Python (no external dependencies, no API key, runs fully offline and deterministically with --seed):

| File | Pipeline stage |
| --- | --- |
| `src/mwp_v1/variable_tree.py` | Stages 1-3: random variables → dependency tree → operation assignment → ground-truth answer + `answer_list` |
| `src/mwp_v1/themes.py` | Stage 4: theme library (behavior / unit / noun categories) |
| `src/mwp_v1/entity_library.py` | Stage 5 (V1.5): pre-built entity library (themes → entity lists + questioning templates) |
| `src/mwp_v1/renderer.py` | Stage 6: `NarrativeRenderer` (V1, optional LLM hook) + `ListingRenderer` (compact V1.5) |
| `src/mwp_v1/validator.py` | Stage 7: classify → independent recompute → correct |
| `src/mwp_v1/pipeline.py` | End-to-end batch generation + dataset stats |
| `generate.py` | CLI entry point |
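
For readers who want a feel for stages 1-3 without opening the source, here is a minimal sketch of the idea: seeded random leaf values, a dependency tree built over them, an operation assigned to each inner node, and a post-order evaluation that yields the ground-truth answer plus a list of intermediate results. Every name here (`build_tree`, `evaluate`, `generate`, the dict layout, the operator set) is hypothetical and chosen for illustration; it is not the code in `src/mwp_v1/variable_tree.py`.

```python
import operator
import random

# Hypothetical operator set; the real pipeline may use a different one.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def build_tree(rng, depth, width):
    """Build a nested dict: leaves hold a random value, inner nodes hold an op and children."""
    if depth == 0:
        return {"value": rng.randint(1, 9)}
    n_children = rng.randint(2, max(2, width))
    return {
        "op": rng.choice(sorted(OPS)),  # sorted so the choice order is stable
        "children": [build_tree(rng, depth - 1, width) for _ in range(n_children)],
    }


def evaluate(node, answer_list):
    """Post-order evaluation; appends each inner node's result to answer_list."""
    if "value" in node:
        return node["value"]
    values = [evaluate(child, answer_list) for child in node["children"]]
    result = values[0]
    for v in values[1:]:
        result = OPS[node["op"]](result, v)  # left-to-right fold over the children
    answer_list.append(result)
    return result


def generate(seed, depth=2, width=3):
    """Return (tree, answer, answer_list); the same seed always gives the same triple."""
    rng = random.Random(seed)  # private RNG, so global random state is untouched
    tree = build_tree(rng, depth, width)
    answer_list = []
    answer = evaluate(tree, answer_list)
    return tree, answer, answer_list
```

Calling `generate(1)` twice returns identical results, which is the property the `--seed` flag relies on; the last entry of the intermediate list is the final answer because the root is evaluated last.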

Verification

PYTHONPATH=src python3 generate.py --op-level 10 --depth 7 --width 3 --n 20 --seed 1 --out data.jsonl
PYTHONPATH=src python3 -m unittest tests.test_variable_tree tests.test_pipeline

28 tests passing. Every accepted record's independently-recomputed answer equals the tree ground truth. Real samples in examples/, architecture notes in USAGE.md.
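
The recompute check described above can be sketched roughly as follows. This assumes a hypothetical record layout (`givens`, `steps`, `target`, `answer`) that is not taken from the repo, and it covers only the recompute-and-compare part of stage 7, not classification or correction.

```python
from fractions import Fraction


def recompute(givens, steps):
    """Re-derive every variable from the givens using only the flat step list."""
    env = {name: Fraction(value) for name, value in givens.items()}
    for target, op, left, right in steps:
        a, b = env[left], env[right]
        if op == "+":
            env[target] = a + b
        elif op == "-":
            env[target] = a - b
        elif op == "*":
            env[target] = a * b
        elif op == "/":
            env[target] = a / b  # exact rational division, no float rounding
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return env


def matches_ground_truth(record):
    """True when the independently recomputed target equals the stored answer."""
    env = recompute(record["givens"], record["steps"])
    return env[record["target"]] == Fraction(record["answer"])


# Hypothetical record: d = a + b, e = d * c, so e = (4 + 6) * 3 = 30.
record = {
    "givens": {"a": 4, "b": 6, "c": 3},
    "steps": [("d", "+", "a", "b"), ("e", "*", "d", "c")],
    "target": "e",
    "answer": 30,
}
```

Walking a flat step list, rather than reusing the tree evaluator, keeps the check independent of the code that produced the answer; `Fraction` is used so that any division stays exact.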

🤖 Generated with Claude Code

Implements the documented Math Word Problem Generator pipeline in pure
standard-library Python (no external deps, runs fully offline):

- variable_tree.py  : stages 1-3 (random variables -> dependency tree ->
                      operation assignment -> ground-truth answer + answer_list)
- themes.py         : stage 4 theme library (behavior/unit/noun categories)
- entity_library.py : stage 5 (V1.5) pre-built entity library
- renderer.py       : stage 6 narrative (V1) + compact listing (V1.5) renderers
- validator.py      : stage 7 classify -> recompute -> correct
- pipeline.py       : end-to-end batch generation + dataset stats
- generate.py CLI, tests (28 passing), examples, USAGE.md

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>