Releases: coregx/ahocorasick
v0.2.1 — DFA backend with SIMD prefilter
Major performance release: DFA compilation with SIMD-accelerated prefilter.
Highlights
- Up to 7.4 GB/s throughput on
IsMatch(14x improvement over v0.1.0) - Up to 4.0 GB/s on
Find(13x improvement) - Zero heap allocations on
IsMatchhot path - 71% fewer allocations in
FindAll
What changed
The search backend is now a fully compiled DFA with a flat []uint32 transition table and premultiplied state IDs. All failure transitions are pre-computed at build time, so search is a single table lookup per byte.
A SIMD-accelerated prefilter using bytes.IndexByte (SSE2/AVX2 on amd64) scans for pattern start bytes before and during DFA traversal, skipping entire non-matching regions in bulk.
Benchmarks
Intel i7-1255U, 64KB haystack, 4-7 patterns:
| Method | v0.1.0 | v0.2.1 | Improvement |
|---|---|---|---|
Find |
300 MB/s | 4.0 GB/s | 13x |
IsMatch (no match) |
260 MB/s | 6.5 GB/s | 25x |
IsMatch (match@32KB) |
545 MB/s | 7.4 GB/s | 14x |
FindAll (77B, 10 matches) |
40 MB/s, 14 alloc | 125 MB/s, 4 alloc | 3x |
Breaking changes
None. Public API is fully backward-compatible with v0.1.0.
Full changelog
https://github.com/coregx/ahocorasick/blob/v0.2.1/CHANGELOG.md
v0.1.0: Initial Release
ahocorasick v0.1.0
High-performance Aho-Corasick multi-pattern string matching for Go.
Highlights
- 1+ GB/s throughput — comparable to Rust's aho-corasick
- Zero dependencies — pure Go, no cgo
- Zero allocations —
IsMatch()hot path allocates nothing
Performance
Benchmarks on Intel i7-1255U (64KB haystack, 4 patterns):
| Method | Throughput | Allocations |
|---|---|---|
IsMatch (with match) |
1.6 GB/s | 0 |
Find |
1.1 GB/s | 1 |
IsMatch (no match) |
780 MB/s | 0 |
Installation
go get github.com/coregx/ahocorasick@v0.1.0Quick Example
ac, _ := ahocorasick.NewBuilder().
AddStrings([]string{"error", "warning", "fatal"}).
Build()
if ac.IsMatch([]byte("an error occurred")) {
fmt.Println("Found match!")
}What's Included
- Builder pattern — fluent API for configuration
- Multiple match methods —
Find,FindAll,FindAllOverlapping,IsMatch,Count - Match semantics —
LeftmostFirst(Perl) andLeftmostLongest(POSIX) - Optimizations — dense transitions, byte classes, precomputed root
Testing
- 27 unit tests + 2 fuzz tests
- 93% code coverage
- CI on Linux, Windows, macOS with race detector
See CHANGELOG.md for full details.