v0.2.0 is a significant quality release. The core context generation pipeline has been rearchitected, ranking is smarter, output is more token-efficient, and several silent failure modes have been eliminated. Existing usage is fully backward compatible — no flags changed, no output format broken.
Structured summaries replace raw source dumps
This is the most impactful change in the release. Before v0.2.0, Tier 1 files were included as raw source concatenated into CONTEXT.md. A single large file could consume tens of thousands of tokens while conveying no more than an agent could get from a 150-token structured summary.
Tier 1 non-entry-point files now emit an AST-driven structured summary covering:
- Purpose — first line of the module docstring, or inferred from filename
- Depends on — internal imports only, stdlib and third-party filtered out
- Types — class names, base classes, public method names
- Functions — full signatures with one-line docstrings
- Notes — async-heavy modules, decorator-heavy files, partial parse warnings, large file flags
Entry point files (cli.py, main.py, app.py, etc.) retain full source — agents need to see the execution wiring directly.
Token impact on the codectx repo itself: ~40,000 tokens (v0.1.x) → ~6,600 tokens (v0.2.0). Same architectural information, 83% fewer tokens.
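To make the summary fields above concrete, here is a minimal sketch of how such a summary could be derived with Python's standard `ast` module. It is an illustration under stated assumptions, not the codectx implementation: the function name `summarize_module`, the `package` parameter used to decide which imports count as internal, the returned dict layout, and the sample module are all hypothetical. The Notes field is omitted for brevity.

```python
import ast


def _first_line(doc):
    """Return the first line of a docstring, or an empty string."""
    return doc.strip().splitlines()[0] if doc and doc.strip() else ""


def summarize_module(source, filename, package):
    """Build a structured summary of one module from its AST.

    `package` is an assumed top-level package name; imports that are
    relative or start with it are treated as internal.
    """
    tree = ast.parse(source)
    purpose = _first_line(ast.get_docstring(tree))
    summary = {
        "purpose": purpose or f"(inferred from {filename})",
        "depends_on": [],
        "types": [],
        "functions": [],
    }

    def is_internal(name):
        return (
            name.startswith(".")
            or name == package
            or name.startswith(package + ".")
        )

    func_types = (ast.FunctionDef, ast.AsyncFunctionDef)
    for node in tree.body:  # top-level statements only
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
            summary["depends_on"] += [n for n in names if is_internal(n)]
        elif isinstance(node, ast.ImportFrom):
            name = "." * node.level + (node.module or "")
            if is_internal(name):
                summary["depends_on"].append(name)
        elif isinstance(node, ast.ClassDef):
            summary["types"].append({
                "name": node.name,
                "bases": [ast.unparse(b) for b in node.bases],
                # Public methods only: skip names with a leading underscore.
                "methods": [
                    n.name for n in node.body
                    if isinstance(n, func_types) and not n.name.startswith("_")
                ],
            })
        elif isinstance(node, func_types):
            prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            returns = f" -> {ast.unparse(node.returns)}" if node.returns else ""
            summary["functions"].append({
                "signature": f"{prefix} {node.name}({ast.unparse(node.args)}){returns}",
                "doc": _first_line(ast.get_docstring(node)),
            })
    return summary


# Hypothetical sample module; the imported names are made up for illustration.
SAMPLE = '''"""Rank files by importance."""
import os
from codectx.graph import build_graph
from .utils import helper

class Ranker(Base):
    def score(self, path): ...
    def _normalize(self, value): ...

async def run(path: str) -> list:
    """Run the ranking pass."""
    return []
'''

summary = summarize_module(SAMPLE, "ranker.py", "codectx")
```

The sketch needs Python 3.9 or later for `ast.unparse`. It only walks top-level statements, so nested classes and functions are ignored, and `import os` is dropped from the dependency list because it is neither relative nor under the assumed package name.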
Task-aware ranking profiles
Five ranking profiles via --task:
| Profile | Signal emphasis |
|---|---|
| default | fan-in 40%, git frequency 40%, recency 10%, proximity 10% |
| debug | recency 50% — surfaces recently changed files first |
| feature | fan-in 50% — surfaces heavily depended-upon modules |
| architecture | fan-in 60%, proximity 15% — pure structural view |
| refactor | fan-in and symbol density — stable, complex, heavily imported files |
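One way to read the table is as a weighted sum over per-file signals. The sketch below uses only the default profile, since it is the only row with a complete weight breakdown. The function names, the signal keys, the assumption that each signal is normalized to the range 0 to 1, and the example files and values are all hypothetical; codectx's actual scoring may differ.

```python
# Weights for the default profile, taken from the table above.
DEFAULT_WEIGHTS = {
    "fan_in": 0.40,
    "git_frequency": 0.40,
    "recency": 0.10,
    "proximity": 0.10,
}


def score(signals, weights=DEFAULT_WEIGHTS):
    """Weighted sum of signals; assumes each signal is normalized to [0, 1]."""
    return sum(w * signals.get(name, 0.0) for name, w in weights.items())


def rank(files, weights=DEFAULT_WEIGHTS):
    """Return file paths sorted from highest to lowest score."""
    return sorted(files, key=lambda path: score(files[path], weights), reverse=True)


# Hypothetical signal values for two files.
files = {
    "scripts/tmp.py": {"fan_in": 0.1, "git_frequency": 0.2, "recency": 1.0, "proximity": 0.1},
    "core/graph.py": {"fan_in": 0.9, "git_frequency": 0.5, "recency": 0.2, "proximity": 0.5},
}

ordered = rank(files)
```

Under this reading, another profile would simply pass a different weights mapping. The table gives only the emphasized signals for the non-default profiles, so their full weight sets are left out here rather than guessed.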
Average token reduction: 76%. Every repo fits within a 128k context window, whereas the naive baseline exceeds that limit for large codebases.