This project develops an open portal for goal-directed vision research.
- The COCO-Search18 dataset is available at https://sites.google.com/view/cocosearch/home.
The environment dependencies are specified in environment.yaml. To set up the environment using conda, run:
conda env create -f environment.yaml
conda activate ActiveVision
Alternatively, for Compute Canada users, see the detailed section below.
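After the environment is created and activated, an optional sanity check can confirm that the GPU stack is usable. The snippet below is only an illustrative sketch: it assumes the framework is PyTorch-based, which is an assumption on our part, so adjust the import to whatever environment.yaml actually installs.

```bash
# Optional sanity check (illustrative sketch).
# Assumption: the environment installs PyTorch; swap the import if it does not.
conda activate ActiveVision
python -c "import torch; print('torch', torch.__version__, '| CUDA available:', torch.cuda.is_available())"
```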
The main entry point is main.py, which provides a unified interface for training and evaluating different gaze prediction models.
- List all available models:
python main.py --list_models
- Train a model:
python main.py --model <model_name> --train --dataset datasets/COCO-Search18
- Evaluate a model:
python main.py --model <model_name> --eval --dataset datasets/COCO-Search18
- Show model-specific help:
python main.py --model <model_name> --help_model
You may also specify the dataset directory via --dataset, e.g., --dataset datasets/COCO-Search18.
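Putting these commands together, a typical session might look like the sketch below. The model name HAT is used only as an example (it appears in the directory layout later in this README); use whatever names --list_models reports.

```bash
# Illustrative end-to-end session; "HAT" is only an example model name.
python main.py --list_models

# Train, then evaluate, on the COCO-Search18 data placed under datasets/.
python main.py --model HAT --train --dataset datasets/COCO-Search18
python main.py --model HAT --eval --dataset datasets/COCO-Search18

# Check the options a specific model exposes.
python main.py --model HAT --help_model
```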
Due to storage constraints, some required files are hosted on Compute Canada. Please place them in exactly the same directory structure as referenced in the code. For example:
ActiveVisionPortal/
├── datasets/
│   └── COCO-Search18/
│       └── ...
├── models/
│   ├── model 1/
│   │   ├── checkpoint(s)/
│   │   ├── data/
│   │   ├── pretrained_models/ (if applicable)
│   │   ├── model/
│   │   ├── entry.py
│   │   └── ...
│   └── model 2/
│       └── ...
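Before running anything, it can help to verify that the downloaded files landed in the expected places. The check below is a rough sketch based on the tree above; individual model folders may contain additional or differently named subdirectories.

```bash
# Rough layout check (sketch only); paths mirror the tree above.
cd ~/scratch/ActiveVisionPortal   # or wherever you placed the project
for d in datasets/COCO-Search18 models; do
  [ -d "$d" ] || echo "Missing: $d"
done
# Each model directory is expected to expose an entry.py.
ls models/*/entry.py 2>/dev/null || echo "No entry.py found under models/"
```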
Please place the entire project directory in a location of your choice on Compute Canada, preserving the directory structure shown in Required Files above.
For example, you can use: ~/scratch/ActiveVisionPortal/
- Run the provided setup script:
bash setup_env.sh
- Compile custom CUDA operators:
cd ~/scratch/ActiveVisionPortal/models/HAT/model/pixel_decoder/ops/
dos2unix make.sh
- Allocate a GPU node:
salloc --gres=gpu:1 --mem=16G --cpus-per-task=4 --time=01:00:00
- Inside the GPU shell:
sh make.sh
exit
- Navigate to your project root (e.g., where main.py is located):
cd ~/scratch/ActiveVisionPortal
Then follow the instructions in the Running the Framework section above to train or evaluate a model.
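For longer training runs it is usually more convenient to submit a batch job than to work in an interactive salloc shell. The script below is a minimal sketch only: the account name, resource requests, time limit, and the way the environment is activated (setup_env.sh may provision it differently) are all assumptions to adapt.

```bash
#!/bin/bash
# Minimal SLURM sketch for Compute Canada (adapt before use).
#SBATCH --account=def-yourpi        # placeholder allocation account
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --time=12:00:00
#SBATCH --job-name=activevision

cd ~/scratch/ActiveVisionPortal
# Assumption: the conda environment from environment.yaml is used on the node;
# adjust if setup_env.sh sets up the environment another way.
source ~/miniconda3/etc/profile.d/conda.sh
conda activate ActiveVision

# Replace <model_name> with a name reported by --list_models.
python main.py --model <model_name> --train --dataset datasets/COCO-Search18
```

Submit it with sbatch train_job.sh (the file name is arbitrary) and monitor the job with squeue -u $USER.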