TOSA-HMD

This directory contains the core implementation of the paper's method on the Hateful Memes Challenge (HMC) dataset with Qwen2-VL.

It includes the CLIP-aligned encoder, consensus-distinctive decomposition, complementary fusion, soft-prompt injection into Qwen2-VL, training, and evaluation. Scripts for reproducing all of the paper's tables are not included.

This release is intended as a compact core-method implementation. It does not include the multi-dataset, multi-backbone, PEFT, ablation, case-study, visualization, or historical experiment scripts used during the broader study.

Data Format

The training file can be a JSON file with train and dev splits or a JSONL file. For JSONL, records are filtered by the split field when it is present; otherwise the file is treated as a single-split file. Each example should contain:

{
  "id": "42953",
  "img": "img/42953.png",
  "text": "its their character not their color that matters",
  "label": "not-hateful"
}

Labels are not-hateful and hateful.
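
The loading rules above can be sketched in a few lines of Python. This is a minimal illustration, not the repository's loader: the function name load_examples is hypothetical, the JSON file is assumed to be an object whose "train" and "dev" keys map to lists of examples, and the split field is assumed to be named "split" and to count as present when any record carries it.

```python
import json
from pathlib import Path

LABELS = ("not-hateful", "hateful")
REQUIRED_FIELDS = ("id", "img", "text", "label")


def load_examples(path, split):
    """Return the validated examples for `split` from a JSON or JSONL file.

    Hypothetical helper: the file layout details are assumptions, see the
    comments below.
    """
    path = Path(path)
    if path.suffix == ".jsonl":
        with path.open(encoding="utf-8") as f:
            records = [json.loads(line) for line in f if line.strip()]
        # Assumption: filter only when at least one record has a "split" key;
        # otherwise treat the file as a single-split file and keep everything.
        if any("split" in r for r in records):
            records = [r for r in records if r.get("split") == split]
    else:
        # Assumption: {"train": [...], "dev": [...]} layout for JSON files.
        with path.open(encoding="utf-8") as f:
            records = json.load(f)[split]

    for r in records:
        missing = [k for k in REQUIRED_FIELDS if k not in r]
        if missing:
            raise ValueError(f"example {r.get('id')!r} is missing {missing}")
        if r["label"] not in LABELS:
            raise ValueError(f"example {r['id']!r} has unknown label {r['label']!r}")
    return records
```

Validating the four fields and the label vocabulary at load time surfaces malformed records before training starts rather than partway through an epoch.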

Installation

pip install -r requirements.txt

Training

Edit the paths in scripts/train_hmc.sh, then run:

bash scripts/train_hmc.sh

The defaults follow the manuscript settings where specified: CLIP image size is 224, the MLLM backbone is Qwen2-VL-2B-Instruct, the optimizer is Adam, the learning rate is 1e-6, the effective batch size is 32, the decomposition loss coefficients are 1, and the complementary-fusion window size and stride are 2.

This release trains the structural adapter only. Qwen2-VL and CLIP are frozen during training, while the decomposition, complementary-fusion, and prompt-projection modules are updated.

The final release checkpoint stores adapter_model.bin, adapter_config.json, and processor files. Qwen2-VL and CLIP base weights are loaded from the model names or paths recorded in adapter_config.json.

Evaluation

Edit the paths in scripts/evaluate_hmc.sh, then run:

bash scripts/evaluate_hmc.sh

The evaluation script reports accuracy, macro-F1, and AUROC on the selected split. For HMC, use the development split when following the manuscript protocol.
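
For reference, the three reported metrics can be computed as in the sketch below. It is a dependency-free illustration of the standard definitions, not the code in the evaluation script, whose implementation may differ (for example, it may rely on a metrics library). The function names and the label encoding (1 for hateful, 0 for not-hateful) are assumptions made for this example.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, classes=(0, 1)):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def auroc(y_true, y_score):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg), which
    is fine for a sketch but slow for very large evaluation sets.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Assumed encoding: 1 = hateful, 0 = not-hateful.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]              # model score for the hateful class
y_pred = [int(s >= 0.5) for s in y_score]    # 0.5 threshold, chosen for illustration
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred), auroc(y_true, y_score))
```

Accuracy and macro-F1 depend on the decision threshold applied to the scores, while AUROC depends only on how the scores rank the examples.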
