Boosting Temporal Sentence Grounding via Causal Inference

Prerequisites

This work was tested with Python 3.8.12, CUDA 11.3, and Ubuntu 18.04.

Conda Environment

conda create -n CICR python=3.8
conda activate CICR
conda install pytorch==1.11.0 torchvision==0.12.0 cudatoolkit=11.3 -c pytorch
pip install -r requirements.txt
# You should also download nltk_data.
python -c "import nltk; nltk.download('all')"

Data Preparation

The structure of the data folder is as follows:

data
├── charades
│   ├── annotations
│   │   ├── charades_sta_test.txt
│   │   ├── charades_sta_train.txt
│   │   ├── Charades_v1_test.csv
│   │   ├── Charades_v1_train.csv
│   │   ├── CLIP_tokenized_count.txt
│   │   ├── GloVe_tokenized_count.txt
│   │   └── glove.pkl
│   ├── charades_query_object_glove_train.pkl
│   ├── charades_query_subject_glove_train.pkl
│   └── charades_query_relation_glove_train.pkl
├── Charades-CD
│   ├── charades_test_iid.json
│   ├── charades_test_ood.json
│   ├── charades_train.json
│   ├── charades_val.json
│   ├── charades_query_object_glove_train.pkl
│   ├── charades_query_subject_glove_train.pkl
│   ├── charades_query_relation_glove_train.pkl
│   └── glove.pkl -> ../charades/annotations/glove.pkl
├── Charades-CG
│   ├── novel_composition.json
│   ├── novel_word.json
│   ├── test_trivial.json
│   ├── train.json
│   ├── CLIP_tokenized_count.txt -> ../charades/annotations/CLIP_tokenized_count.txt
│   └── glove.pkl -> ../charades/annotations/glove.pkl
├── qvhighlights
│   ├── annotations
│   │   ├── CLIP_tokenized_count.txt
│   │   ├── highlight_test_release.jsonl
│   │   ├── highlight_train_release.jsonl
│   │   ├── highlight_val_object.jsonl
│   │   └── highlight_val_release.jsonl
│   ├── qvhighlights_query_relation_train.pkl
│   ├── qvhighlights_query_subject_train.pkl
│   └── qvhighlights_query_object_train.pkl
└── tacos
    ├── annotations
    │   ├── CLIP_tokenized_count.txt
    │   ├── GloVe_tokenized_count.txt
    │   ├── test.json
    │   ├── train.json
    │   └── val.json
    ├── tacos_query_object_glove_train.pkl
    ├── tacos_query_relation_glove_train.pkl
    └── tacos_query_subject_glove_train.pkl

All extracted features are converted to HDF5 files for more compact storage. You can use the provided Python script ./data/npy2hdf5.py to convert a directory of *.npy or *.npz files into a single HDF5 file.
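The conversion performed by ./data/npy2hdf5.py can be sketched roughly as follows (a minimal sketch only, assuming one dataset per feature file keyed by the file stem, e.g. the video ID; the actual script may organize keys differently):

```python
import glob
import os

import h5py
import numpy as np


def npy_dir_to_hdf5(npy_dir: str, out_path: str) -> None:
    """Pack every *.npy feature file in npy_dir into a single HDF5 file,
    using each file's stem (e.g. the video ID) as the dataset key."""
    with h5py.File(out_path, "w") as f:
        for path in sorted(glob.glob(os.path.join(npy_dir, "*.npy"))):
            key = os.path.splitext(os.path.basename(path))[0]
            # gzip compression keeps the merged file small on disk
            f.create_dataset(key, data=np.load(path), compression="gzip")
```

Reading a feature back is then `h5py.File(out_path, "r")[video_id][()]`.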

CLIP_tokenized_count.txt & GloVe_tokenized_count.txt

These files are built for masked language modeling in FW-MESM, and they can be generated by running

python -m data.tokenized_count
  • CLIP_tokenized_count.txt

    Column 1 is the word_id produced by the CLIP tokenizer; column 2 is the number of times that word_id appears in the whole dataset.

  • GloVe_tokenized_count.txt

    Column 1 is a word split from a sentence; column 2 is its GloVe token id; column 3 is the number of times the word appears in the whole dataset.
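The GloVe-style counting step can be approximated in a few lines of standard Python (a sketch only — the real data.tokenized_count module reads the actual dataset annotations and GloVe vocabulary; the vocab dict and word-splitting here are simplifying assumptions):

```python
from collections import Counter


def build_tokenized_count(sentences, vocab):
    """Return (word, glove_id, count) rows in the three-column format
    described above. `vocab` maps a word to its GloVe token id
    (-1 here marks out-of-vocabulary words)."""
    counts = Counter(w for s in sentences for w in s.lower().split())
    return [(w, vocab.get(w, -1), n) for w, n in counts.most_common()]
```

Each returned row corresponds to one line of GloVe_tokenized_count.txt.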

Charades-STA

  • CLIP+SlowFast: We use the features provided by MESM.
  • I3D: We use the features provided by VSLNet.
  • VGG: We use the features provided by 2D-TAN.

QVHighlights

We use the official feature files for the QVHighlights dataset released by Moment-DETR and merge them into clip_image.hdf5 and slowfast.hdf5.

TACoS

Features are obtained from MESM.

Training

You can run train.py with arguments passed on the command line:

CUDA_VISIBLE_DEVICES=0 python train.py {--args}

Or run with a config file as input:

CUDA_VISIBLE_DEVICES=0 python train.py --config_file ./config/charades/VGG_GloVe.json
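When a config file is given, the usual pattern is to load the JSON and overlay its values on the argparse defaults (a minimal sketch; apart from --config_file, the argument names such as --lr are hypothetical and may differ from those in train.py):

```python
import argparse
import json


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--config_file", type=str, default=None)
    parser.add_argument("--lr", type=float, default=1e-4)  # hypothetical arg
    args = parser.parse_args(argv)
    if args.config_file:
        # values from the JSON config override the argparse defaults
        with open(args.config_file) as f:
            for k, v in json.load(f).items():
                setattr(args, k, v)
    return args
```

This keeps command-line-only runs and config-file runs on the same code path.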

Evaluation

You can run eval.py with arguments passed on the command line:

CUDA_VISIBLE_DEVICES=0 python eval.py {--args}

Or run with a config file as input:

CUDA_VISIBLE_DEVICES=0 python eval.py --config_file ./config/charades/VGG_GloVe_eval.json

About

The official code of Boosting Temporal Sentence Grounding via Causal Inference (MM2025)
