Simple ViT and Evolving Harness for Explainable Text Forgery Detection
🏆 3rd Place Solution for the ACM MM 2026 GenText-Forensics Challenge — detecting, localizing, and explaining text-centric document forgeries.
SEED is a modular forgery analysis pipeline with three stages:
| Stage | Component | Description |
|---|---|---|
| 1️⃣ | Synthetic Data | Similarity-guided forgery generation across 5 manipulation types, with paired (clean, forged) sampling |
| 2️⃣ | ViT Detector | DINOv3 ViT-L/16 + LoRA adaptation + EoMT mask head — unified detection & localization |
| 3️⃣ | Meta-Harness | Evolving MLLM harness that converts detector outputs into structured forensic reports |
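
The detector row above mentions LoRA adaptation, and the released checkpoint is described as rank-1. As a rough illustration of what a rank-1 low-rank update does to a frozen linear layer, here is a minimal, dependency-free sketch. It is not the code in `model/lora.py`: the function names, the `alpha` scale, and the toy weights are all assumptions made for this example, and the real module presumably operates on PyTorch tensors inside the ViT blocks.

```python
# Illustrative rank-1 LoRA update on a single linear layer (pure Python).
# All names here are hypothetical and not taken from model/lora.py.

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]


def lora_forward(W, a, b, x, alpha=1.0):
    """Compute y = W x + alpha * b * (a . x) for a rank-1 adapter.

    W is the frozen weight (out_dim x in_dim), a has length in_dim and
    b has length out_dim. Only a and b would receive gradients.
    """
    base = matvec(W, x)
    proj = sum(a_j * x_j for a_j, x_j in zip(a, x))  # a scalar when rank is 1
    return [y_i + alpha * b_i * proj for y_i, b_i in zip(base, b)]


W = [[1.0, 0.0], [0.0, 1.0]]  # frozen weight (identity, for readability)
x = [2.0, 3.0]
a = [1.0, 1.0]                # down-projection vector
b_zero = [0.0, 0.0]           # zero up-projection: adapter contributes nothing
b = [0.5, -0.5]               # a non-zero up-projection after some training

print(lora_forward(W, a, b_zero, x))  # [2.0, 3.0]
print(lora_forward(W, a, b, x))       # [4.5, 0.5]
```

With rank 1, the adapter adds only `in_dim + out_dim` trainable values per adapted layer, which is why the update can be stored as two vectors rather than a full matrix.
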
.
├── base_trainer.py # Training utilities & metrics
├── cfg.py # Runtime configuration
├── ds.py # Datasets & dataloaders
├── main.py # Train / validation / inference entry point
├── model/
│ ├── eomt_sep_query.py # Main detector (DINOv3 + LoRA + EoMT)
│ ├── lora.py # Single-expert LoRA modules
│ ├── mask_classification_loss.py # Mask2Former-style loss
│ └── scale_block.py # ConvTranspose upscaling block
├── meta_harness/
│ ├── test_submission.py # Generate challenge-format reports
│ ├── precompute_submission_artifacts.py
│ ├── harness.py # Report generation base class
│ ├── llm_clients.py # OpenAI-compatible LLM client
│ ├── overlay.py # Mask visualization helpers
│ ├── report_utils.py # Report formatting utilities
│ ├── template_report_boxreasons_coordspanrepair.py
│ └── config.yaml # LLM API configuration
└── TDOC/ # Auxiliary training & generation modules
# Create a Python 3.10 environment, then install dependencies
pip install -r requirements.txt

| Dataset | Description | Link |
|---|---|---|
| RealText-V2 | Original challenge dataset | vankey/RealText-V2 |
| RealText-V2-Syn25k | Our synthetic data | Jason37437/RealText-V2-Syn25k |
| Cross-domain test sets | T-SROIE, OSTF, TPIC-13, RTM | Google Drive |
| Model Checkpoint | SEED (LoRA rank-1, DINOv3 ViT-L) | Jason37437/SEED / Google Drive |
# 1. Edit cfg.py → mode='train'
# 2. Set your data paths and GPU count
python main.py

# 1. Edit cfg.py → mode='val', eval_mode=['loc','det']
python main.py

The meta_harness/ pipeline converts detector outputs → structured Markdown forensic reports.
# 🔑 Set your OpenAI-compatible API key
export LINKAPI_API_KEY="your-api-key-here"
# 🖼️ Step 1: Precompute overlays, bounding boxes, data URIs
python meta_harness/precompute_submission_artifacts.py
# 📝 Step 2: Generate reports via MLLM
python meta_harness/test_submission.py
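
Step 1 above precomputes bounding boxes and data URIs from the detector output. The sketch below shows one plausible way to derive both using only the standard library. It is an assumption-laden illustration, not the logic of `precompute_submission_artifacts.py`: the helper names `mask_to_bbox` and `to_data_uri` are hypothetical, the box is returned as inclusive pixel indices, and a single box is computed for the whole mask, whereas the actual script may well produce one box per connected region.

```python
import base64


def mask_to_bbox(mask):
    """Return (x_min, y_min, x_max, y_max) over nonzero pixels, or None if empty.

    `mask` is a list of rows of 0/1 values; coordinates are inclusive indices.
    """
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not ys:
        return None
    xs = [x for row in mask for x, v in enumerate(row) if v]
    return (min(xs), ys[0], max(xs), ys[-1])


def to_data_uri(data, mime="image/png"):
    """Encode raw bytes as a base64 data URI."""
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{payload}"


mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
print(mask_to_bbox(mask))                      # (1, 1, 2, 2)
print(to_data_uri(b"abc", mime="text/plain"))  # data:text/plain;base64,YWJj
```

Embedding an overlay as a data URI lets an image travel inline in a request to an OpenAI-compatible endpoint without hosting the file anywhere, which is presumably why the artifacts are computed ahead of report generation.
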