Commit 6487ca5

deftio and claude committed
Release v1.0.6: fix CI flaky test, add CLI commands, Python binding
Bug fixes:
- Encoder: deep-copy string/blob value data to prevent use-after-return when JSON encoder passes stack-allocated buffers (root cause of CI flaky test on ubuntu-latest/clang)
- JSON decoder: fix heap-buffer-overflow in segment dedup
- Bitstream: fix UB in zigzag encode (left-shift of negative int64_t)

New features:
- CLI tool: trp encode/decode/validate commands with stdin and -o support
- Native Python binding with 70 tests
- Complex JSON example (json_complex.c)
- 3 new C test files, expanded edge-case coverage (~400 total tests)
- README: CI/coverage/license badges, roadmap, updated bindings table

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 39550df commit 6487ca5

44 files changed

Lines changed: 23055 additions & 176 deletions

CHANGELOG.md

Lines changed: 70 additions & 0 deletions
@@ -5,6 +5,76 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [1.0.6] - 2026-03-02
+
+### Fixed
+- Encoder: deep-copy string/blob value data to prevent use-after-return when
+  JSON encoder passes stack-allocated buffers (root cause of CI flaky test)
+- JSON decoder: fix heap-buffer-overflow in segment dedup that used entry index
+  as byte offset into key string
+- Bitstream: fix undefined behavior in zigzag encode (left-shift of negative
+  value); cast to uint64_t before shifting
+
+### Added
+- README: CI build, coverage, and BSD-2-Clause license badges
+- README: roadmap section with planned milestones for language bindings,
+  format enhancements, and tooling
+- CLI tool: `trp encode`, `trp decode`, `trp validate` commands with stdin
+  support and `-o`/`--pretty` flags
+
+### Changed
+- README: updated language bindings table (Python, JavaScript now implemented)
+- README: updated project status to v1.0.5 test counts
+- CLI tool: removed `trp json` command (replaced by `trp decode`)
+
+## [1.0.5] - 2026-03-02
+
+### Added
+- Native Python binding: pure-Python `.trp` encoder/decoder with byte-for-byte compatibility
+- Python test suite: 70 tests across 5 files (crc32, bitstream, varint, roundtrip, fixtures)
+- Complex JSON example (`json_complex.c`): nested objects, arrays, DOM lookups, pretty-print
+- Test data files: `common_words_10k.txt` (10K words) and `benchmark_100k.json` (202 KB)
+- Generator script: `tools/generate_benchmark_json.py`
+- 3 new C test files: `test_json_decode.c`, `test_core_internal.c`, `test_bitstream_errors.c`
+- Expanded existing test files with error-path and edge-case coverage
+
+### Changed
+- C test suite: 16 test programs -> 20 test programs, ~330 individual tests
+- Total tests across all languages: ~400 (C/C++ + Python)
+- Updated testing documentation (`docs/guide/testing.md`)
+- Updated bindings README: Python and JavaScript marked as implemented
+
+## [1.0.4] - 2026-03-02
+
+### Fixed
+- Bitstream guide: clarify signed bit-field extraction with worked example
+
+## [1.0.3] - 2026-03-02
+
+### Fixed
+- CSS: move stylesheet to `assets/main.scss` so minima theme loads correctly
+- Remove hardcoded top bar above nav, add whitespace around section dividers
+- Adjust version label size and color for readability
+
+## [1.0.2] - 2026-03-01
+
+### Added
+- Native JavaScript `.trp` implementation: pure-JS encoder/decoder
+- Cross-language fixture files (7 `.trp` files) for interop testing
+- Cross-language test (`test_cross_language.c`) validating fixture files
+
+### Fixed
+- 32-bit CI: enable Unity 64-bit type support, switch Pages to workflow build
+- Site margins: use 75% viewport width, fix `!important` overrides
+- clang-tidy: suppress `misc-no-recursion` false positive, fix dead store bug
+- GitHub Pages: fix lcov report overwriting docs site, fix broken links
+
+## [1.0.1] - 2026-03-01
+
+### Fixed
+- GitHub Pages deployment: docs site was showing lcov coverage report instead of Jekyll site
+- Fix broken navigation links on docs site
+
 ## [1.0.0] - 2026-02-28
 
 ### Added
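
The zigzag fix listed under 1.0.6 concerns the mapping of signed integers to unsigned ones (0, -1, 1, -2 become 0, 1, 2, 3). In C, left-shifting a negative `int64_t` is undefined behavior, which is why the changelog mentions casting to `uint64_t` before the shift. The sketch below shows the same mapping in Python, with masking standing in for 64-bit unsigned wraparound. The names `zigzag_encode`, `zigzag_decode`, and `MASK64` are hypothetical and are not taken from the triepack sources.

```python
MASK64 = (1 << 64) - 1  # emulates uint64_t wraparound (hypothetical helper constant)


def zigzag_encode(n: int) -> int:
    """Map a signed 64-bit integer to an unsigned 64-bit integer."""
    if not -(1 << 63) <= n < (1 << 63):
        raise ValueError("value does not fit in int64")
    # Treat the two's-complement bit pattern as unsigned before shifting,
    # which mirrors the "cast to uint64_t before shifting" fix.
    shifted = ((n & MASK64) << 1) & MASK64
    # All-ones when n is negative, zero otherwise (the C idiom is n >> 63).
    sign_mask = MASK64 if n < 0 else 0
    return shifted ^ sign_mask


def zigzag_decode(u: int) -> int:
    """Inverse of zigzag_encode."""
    return (u >> 1) ^ -(u & 1)
```

Small magnitudes of either sign end up as small unsigned values, which is what makes the mapping useful ahead of a variable-length integer encoding.
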

CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 cmake_minimum_required(VERSION 3.16)
-project(triepack VERSION 1.0.0 LANGUAGES C CXX)
+project(triepack VERSION 1.0.6 LANGUAGES C CXX)
 
 # ── Options ──────────────────────────────────────────────────────────────
 option(BUILD_TESTS "Build test suite" ON)

README.md

Lines changed: 32 additions & 9 deletions
@@ -1,5 +1,9 @@
 # triepack v1.0.0
 
+[![CI Build & Test](https://github.com/deftio/triepack/actions/workflows/ci.yml/badge.svg)](https://github.com/deftio/triepack/actions/workflows/ci.yml)
+[![Coverage Report](https://github.com/deftio/triepack/actions/workflows/coverage.yml/badge.svg)](https://github.com/deftio/triepack/actions/workflows/coverage.yml)
+[![License: BSD-2-Clause](https://img.shields.io/badge/License-BSD_2--Clause-blue.svg)](LICENSE.txt)
+
 A compressed trie-based dictionary format for fast, compact key-value storage.
 
 TriePack encodes dictionaries into a compact binary format (`.trp`) optimized for fast lookups, prefix search, and ROM-safe deployment. It uses prefix sharing and bit-level packing with configurable symbol encoding and full value type support.
@@ -94,16 +98,16 @@ Each layer can be used independently. `triepack_wrapper` provides C++11 RAII wra
 
 ## Language Bindings
 
-Native implementations (not FFI) are planned for:
+All bindings are native implementations that read/write the `.trp` format directly (no FFI).
 
 | Language | Status | Directory |
 |----------|--------|-----------|
-| Python | Scaffolded | `bindings/python/` |
-| TypeScript | Scaffolded | `bindings/typescript/` |
-| JavaScript | Scaffolded | `bindings/javascript/` |
-| Go | Scaffolded | `bindings/go/` |
-| Swift | Scaffolded | `bindings/swift/` |
-| Rust | Scaffolded | `bindings/rust/` |
+| Python | Implemented | `bindings/python/` |
+| JavaScript | Implemented | `bindings/javascript/` |
+| TypeScript | Not yet implemented | `bindings/typescript/` |
+| Go | Not yet implemented | `bindings/go/` |
+| Swift | Not yet implemented | `bindings/swift/` |
+| Rust | Not yet implemented | `bindings/rust/` |
 
 ## File Format
 
@@ -126,9 +130,28 @@ See `docs/internals/` for format details.
 
 ## Project Status
 
-**v1.0.0 released.** The bitstream library, trie codec, and JSON library are fully implemented. All 22 tests pass (16 unit test suites + 6 example integration tests). Run `compaction_benchmark` to see compression ratios on ~10k generated words.
+**v1.0.5 released.** Core C library (bitstream, trie codec, JSON), C++ wrappers, Python binding, and JavaScript binding are implemented. 27 test programs with ~400 individual tests across C, C++, and Python.
+
+## Roadmap
+
+### v1.1 — Client Libraries
+- [ ] TypeScript binding (native `.trp` reader/writer)
+- [ ] Go binding
+- [ ] Swift binding (with SPM package)
+- [ ] Rust binding (with crate on crates.io)
+- [ ] npm package for JavaScript/TypeScript
+- [ ] PyPI package for Python
+
+### v1.2 — Format Enhancements
+- [ ] Suffix table (shared ending compression)
+- [ ] Huffman symbol encoding (for large dictionaries)
+- [ ] Nested dict values (embed sub-dictionaries inline)
 
-Future work: suffix table, Huffman symbols, language bindings.
+### v1.3 — Tooling & Ecosystem
+- [ ] `trp` CLI: encode/decode/validate/inspect (in progress)
+- [ ] Fuzzy search (edit distance d<=2)
+- [ ] Performance benchmarks across languages
+- [ ] Language binding conformance test suite
 
 ## License

bindings/README.md

Lines changed: 2 additions & 2 deletions
@@ -5,9 +5,9 @@ They do not use FFI or call into the C library.
 
 | Language | Directory | Status |
 |------------|----------------------------|---------------------|
-| Python | [python/](./python/) | Not yet implemented |
+| Python | [python/](./python/) | Implemented |
 | TypeScript | [typescript/](./typescript/)| Not yet implemented |
-| JavaScript | [javascript/](./javascript/)| Not yet implemented |
+| JavaScript | [javascript/](./javascript/)| Implemented |
 | Go | [go/](./go/) | Not yet implemented |
 | Swift | [swift/](./swift/) | Not yet implemented |
 | Rust | [rust/](./rust/) | Not yet implemented |

Lines changed: 120 additions & 0 deletions
@@ -0,0 +1,120 @@
+# Copyright (c) 2026 M. A. Chatterjee, BSD-2-Clause.
+
+import pytest
+
+from triepack.bitstream import BitReader, BitWriter
+
+
+def test_write_read_bits():
+    w = BitWriter()
+    w.write_bits(0x07, 3)
+    w.write_bits(0xAB, 8)
+    w.write_bits(0x05, 5)
+
+    r = BitReader(w.to_bytes())
+    assert r.read_bits(3) == 0x07
+    assert r.read_bits(8) == 0xAB
+    assert r.read_bits(5) == 0x05
+
+
+def test_write_read_single_bit():
+    w = BitWriter()
+    w.write_bit(1)
+    w.write_bit(0)
+    w.write_bit(1)
+
+    r = BitReader(w.to_bytes())
+    assert r.read_bit() == 1
+    assert r.read_bit() == 0
+    assert r.read_bit() == 1
+
+
+def test_u8_roundtrip():
+    w = BitWriter()
+    w.write_u8(0x00)
+    w.write_u8(0xFF)
+    w.write_u8(0x42)
+
+    r = BitReader(w.to_bytes())
+    assert r.read_u8() == 0x00
+    assert r.read_u8() == 0xFF
+    assert r.read_u8() == 0x42
+
+
+def test_u16_roundtrip():
+    w = BitWriter()
+    w.write_u16(0xABCD)
+
+    r = BitReader(w.to_bytes())
+    assert r.read_u16() == 0xABCD
+
+
+def test_u32_roundtrip():
+    w = BitWriter()
+    w.write_u32(0xDEADBEEF)
+
+    r = BitReader(w.to_bytes())
+    assert r.read_u32() == 0xDEADBEEF
+
+
+def test_align_to_byte():
+    w = BitWriter()
+    w.write_bits(0x07, 3)
+    w.align_to_byte()
+    assert w.position == 8
+
+    w.write_u8(0x42)
+    r = BitReader(w.to_bytes())
+    r.read_bits(3)
+    r.align_to_byte()
+    assert r.read_u8() == 0x42
+
+
+def test_peek_bits():
+    w = BitWriter()
+    w.write_bits(0xAB, 8)
+
+    r = BitReader(w.to_bytes())
+    assert r.peek_bits(8) == 0xAB
+    assert r.position == 0  # didn't advance
+    assert r.read_bits(8) == 0xAB
+    assert r.position == 8
+
+
+def test_read_bytes():
+    w = BitWriter()
+    w.write_bytes(b"\xDE\xAD\xBE\xEF")
+
+    r = BitReader(w.to_bytes())
+    assert r.read_bytes(4) == b"\xDE\xAD\xBE\xEF"
+
+
+def test_eof_raises():
+    r = BitReader(b"\x00")
+    r.read_bits(8)
+    with pytest.raises(EOFError):
+        r.read_bits(1)
+
+
+def test_remaining():
+    r = BitReader(b"\x00\x00")
+    assert r.remaining == 16
+    r.read_bits(4)
+    assert r.remaining == 12
+
+
+def test_seek():
+    w = BitWriter()
+    w.write_u8(0xAA)
+    w.write_u8(0xBB)
+
+    r = BitReader(w.to_bytes())
+    r.seek(8)
+    assert r.read_u8() == 0xBB
+
+
+def test_is_aligned():
+    r = BitReader(b"\xFF")
+    assert r.is_aligned()
+    r.read_bit()
+    assert not r.is_aligned()
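
The tests above exercise `triepack.bitstream` only through its public surface; the module itself is not part of the visible diff. For orientation, here is a minimal sketch of a writer and reader that would satisfy a subset of those tests (`write_bits`, `write_u8`, `align_to_byte`, `position`, `peek_bits`, `read_bits`, `remaining`, `is_aligned`, and the `EOFError` case). It assumes MSB-first bit order within each byte and stores one list entry per bit for clarity; both choices are assumptions of this sketch, not statements about the actual binding.

```python
class BitWriter:
    """Accumulates bits MSB-first (bit order is an assumption of this sketch)."""

    def __init__(self):
        self._bits = []  # one int (0 or 1) per bit; simple rather than fast

    @property
    def position(self):
        return len(self._bits)

    def write_bits(self, value, count):
        # Emit the low `count` bits of `value`, most significant first.
        for shift in range(count - 1, -1, -1):
            self._bits.append((value >> shift) & 1)

    def write_u8(self, value):
        self.write_bits(value, 8)

    def align_to_byte(self):
        # Pad with zero bits up to the next byte boundary.
        self._bits.extend([0] * (-len(self._bits) % 8))

    def to_bytes(self):
        padded = self._bits + [0] * (-len(self._bits) % 8)
        out = bytearray()
        for i in range(0, len(padded), 8):
            byte = 0
            for bit in padded[i:i + 8]:
                byte = (byte << 1) | bit
            out.append(byte)
        return bytes(out)


class BitReader:
    """Reads bits MSB-first from an immutable byte string."""

    def __init__(self, data):
        self._data = bytes(data)
        self.position = 0  # current offset in bits

    @property
    def remaining(self):
        return len(self._data) * 8 - self.position

    def peek_bits(self, count):
        # Return the next `count` bits without advancing.
        if count > self.remaining:
            raise EOFError("not enough bits left")
        value = 0
        for pos in range(self.position, self.position + count):
            bit = (self._data[pos // 8] >> (7 - pos % 8)) & 1
            value = (value << 1) | bit
        return value

    def read_bits(self, count):
        value = self.peek_bits(count)
        self.position += count
        return value

    def read_u8(self):
        return self.read_bits(8)

    def align_to_byte(self):
        # Skip forward to the next byte boundary.
        self.position += -self.position % 8

    def is_aligned(self):
        return self.position % 8 == 0
```

Implementing `read_bits` on top of `peek_bits` keeps the bounds check in one place, so the end-of-data condition behaves the same for both calls.
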
Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+# Copyright (c) 2026 M. A. Chatterjee, BSD-2-Clause.
+
+from triepack.crc32 import crc32
+
+
+def test_known_answer_123456789():
+    """Standard CRC-32 known-answer test."""
+    data = b"123456789"
+    assert crc32(data) == 0xCBF43926
+
+
+def test_empty():
+    assert crc32(b"") == 0x00000000
+
+
+def test_single_byte():
+    result = crc32(b"\x00")
+    assert isinstance(result, int)
+    assert 0 <= result <= 0xFFFFFFFF
+
+
+def test_all_zeros():
+    result = crc32(b"\x00" * 256)
+    assert isinstance(result, int)
+
+
+def test_all_ff():
+    result = crc32(b"\xff" * 4)
+    assert isinstance(result, int)
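
The known-answer value `0xCBF43926` for `b"123456789"` and the zero result for empty input both match the standard CRC-32 used by zlib and PNG. The `triepack.crc32` module is not shown in this diff, so the following is only a sketch of a bitwise CRC-32 that is consistent with those two tests; whether the binding is bitwise or table-driven is not known from the source. The sketch is cross-checked against `zlib.crc32` from the Python standard library.

```python
import zlib


def crc32(data: bytes) -> int:
    """Bitwise CRC-32: reflected polynomial 0xEDB88320, init and final XOR 0xFFFFFFFF."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right one bit; apply the polynomial when the low bit was set.
            crc = (crc >> 1) ^ (0xEDB88320 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


# Cross-check against the standard library on a few inputs.
for sample in (b"", b"\x00", b"123456789", b"\xff" * 4):
    assert crc32(sample) == zlib.crc32(sample)
```

Several of the tests above check only the type and range of the result; comparing against `zlib.crc32` in this way would pin down the exact values for those inputs as well.
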
