VERSION: v1.0.0-BETA-4
FileTrove walks a directory tree, identifies every file, computes metadata, and writes all results into a SQLite database with TSV export support.
| Category | Details |
|---|---|
| File type | MIME type, PRONOM identifier, format version, identification proof/note, extension — via siegfried |
| File & directory timestamps | Creation, modification, and access times |
| Hashes | MD5, SHA1, SHA256, SHA512, BLAKE2B-512 |
| Entropy | Shannon entropy (files up to 1 GB) |
| Extended attributes | xattr from ext3/ext4, btrfs, APFS, and others |
| EXIF metadata | Extracted from image files |
| YARA-X | Match results from your own rule files |
| NSRL | Flags known software files via the National Software Reference Library |
| Dublin Core | Optional session-level descriptive metadata |
Each file and directory gets a UUIDv4 as a unique identifier. All results land in a SQLite database and can be exported to TSV.
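To make the hash and entropy rows in the table concrete, here is a minimal Python sketch that computes the same kinds of values with the standard library. It is an illustration only, not FileTrove's implementation: the helper names `file_digests` and `shannon_entropy` are hypothetical, the sketch holds the whole input in memory, and it assumes entropy is reported in bits per byte.

```python
import hashlib
import math
from collections import Counter


def file_digests(data: bytes) -> dict:
    """Return hex digests of `data` for the five algorithms listed in the table."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
        "sha512": hashlib.sha512(data).hexdigest(),
        # digest_size=64 bytes gives the 512-bit BLAKE2b variant.
        "blake2b-512": hashlib.blake2b(data, digest_size=64).hexdigest(),
    }


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (assumed unit): 0.0 for constant data, 8.0 for uniform bytes."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Values near 8.0 typically indicate compressed or encrypted content, while plain text usually scores much lower.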
- Get a distribution bundle: download one from the releases page, or build one from source (see BUILDING.md):

      task dist:bundle    # builds binaries + bundles siegfried.sig

  The bundle at `build/<os>_<arch>/` contains everything you need.
- Run the installer from the bundle directory:

      cd build/darwin_arm64    # or linux_amd64, etc.
      ./ftrove --install .

  This creates the scan database (`db/filetrove.db`) and the `logs/` directory. The siegfried signature file is included in the bundle. The NSRL bloom filter (~150–240 MB depending on variant) is downloaded automatically during install. Use `--nsrl-variant` to select which subset to download (default: `all`).
- You're ready.

Building from source without `task dist`? You can build the NSRL bloom filter locally. See BUILDING.md for details on `task nsrl:build-all` and disk space requirements.
YARA-X scanning requires a C library that is not bundled with FileTrove. It is built automatically during `task build` if not already present. See BUILDING.md for setup instructions.
- Example rule files: `testdata/yara/`
- When a rule matches, the rule name, session UUID, and file UUID are recorded in the `yara` table. The rule file itself is not stored.
The NSRL bloom filter is not bundled in the repository. It is downloaded automatically from the GitHub Releases page during `ftrove --install`. Three variants are available:
| Variant | Subsets | Size |
|---|---|---|
| `modern` | Modern OS software | ~150 MB |
| `mobile` | Modern + Android + iOS | ~200 MB |
| `all` | Modern + Android + iOS + Legacy | ~240 MB |

    ./ftrove --install . --nsrl-variant all       # default
    ./ftrove --install . --nsrl-variant modern    # smallest

NSRL checks are skipped gracefully if no bloom filter is present; scanning still works.
When NIST publishes a new RDS version, rebuild the bloom filter by updating `NSRL_VERSION` in `Taskfile.nsrl.yml` and running one of the build targets. See BUILDING.md for details.
You can also build a custom Bloom filter from any newline-delimited list of SHA1 hashes:

    admftrove --creatensrl hashes.txt --nsrlversion "my-hashset-v1"

Optional flags: `--nsrl-estimate` (expected hash count; auto-counted from the file if omitted) and `--nsrl-fpr` (false positive rate, default 0.01). Copy the resulting `nsrl.bloom` into `db/`.
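
For intuition about what `--nsrl-estimate` and `--nsrl-fpr` trade off, the Python sketch below sizes a textbook Bloom filter from an expected item count n and a target false positive rate p, using the standard formulas m = -n ln p / (ln 2)^2 bits and k = (m / n) ln 2 hash functions. The names `optimal_params` and `TinyBloom` and the double-hashing scheme are hypothetical; the sketch says nothing about the actual layout of `nsrl.bloom`.

```python
import hashlib
import math


def optimal_params(n: int, p: float) -> tuple:
    """Return (bit_count, hash_count) for n expected items and false positive rate p."""
    m = math.ceil(-n * math.log(p) / (math.log(2) ** 2))
    k = max(1, round(m / n * math.log(2)))
    return m, k


class TinyBloom:
    """Textbook Bloom filter; hypothetical, not FileTrove's on-disk format."""

    def __init__(self, n: int, p: float = 0.01):
        self.m, self.k = optimal_params(n, p)
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item: str) -> list:
        # Double hashing: derive k bit positions from two 64-bit slices of one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force a non-zero step
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))
```

A lookup that returns `False` is definitive, while `True` may be a false positive at roughly the configured rate. Lowering the rate therefore costs more bits per stored hash.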

    ./ftrove -i $DIRECTORY

FileTrove walks `$DIRECTORY` recursively. Run `./ftrove -h` for all available flags.
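
As a rough picture of what a recursive walk with per-entry identifiers looks like, the Python sketch below visits every directory and file under a root and attaches a fresh UUIDv4 to each, matching the identifier scheme described earlier. The generator name `walk_entries` is hypothetical, and details such as traversal order, symlink handling, and whether the root itself is recorded are assumptions, not FileTrove's documented behavior.

```python
import os
import uuid


def walk_entries(root: str):
    """Yield (uuid4, kind, path) for every directory and file under root, root included."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sorting in place makes the traversal order deterministic
        yield str(uuid.uuid4()), "dir", dirpath
        for name in sorted(filenames):
            yield str(uuid.uuid4()), "file", os.path.join(dirpath, name)
```

Calling `list(walk_entries("some/dir"))` returns one tuple per entry, which could then be passed to hashing or identification steps.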
List all sessions and export one to TSV:

    ./ftrove -l
    ./ftrove -t 926be141-ab75-4106-8236-34edfcf102f2

You can also query the SQLite database directly:
- CLI: `sqlite3 db/filetrove.db`
- GUI: sqlitebrowser
- Visualisation: Sqliteviz
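
The database can also be inspected from a script. The Python sketch below opens it read-only and lists its tables, which is a safe first step because it assumes nothing about the schema. The helper name `list_tables` is hypothetical; `db/filetrove.db` is the path created by the installer, and `yara` is the only table name this document mentions.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path: str) -> list:
    """Return the names of all tables in the SQLite database at db_path."""
    # Open read-only via a file: URI so a script cannot modify scan results by accident.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        return [name for (name,) in rows]
    finally:
        conn.close()
```

Running `list_tables("db/filetrove.db")` from the bundle directory shows which tables exist before you write any queries against them.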
FileTrove is the successor of filedriller, based on the iPres 2021 paper Marrying siegfried and the National Software Reference Library.
