ActiveVisionPortal

About

This project develops an open portal for goal-directed vision research.

Datasets

Environment Setup

The environment dependencies are specified in environment.yaml. To set up the environment using conda, run:

  conda env create -f environment.yaml
  conda activate ActiveVision
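To confirm that the environment is active before running anything, a quick check along these lines may help. It is a minimal sketch, assuming conda exports the CONDA_DEFAULT_ENV variable on activation and that the environment is named ActiveVision as in the command above; the helper names are hypothetical and not part of this repository.

```python
import os

EXPECTED_ENV = "ActiveVision"  # name used in `conda activate ActiveVision`


def active_conda_env(environ=os.environ):
    """Return the name of the active conda environment, or None if none is set."""
    return environ.get("CONDA_DEFAULT_ENV")


def check_env(environ=os.environ):
    """Return True if the expected environment is active (hypothetical helper)."""
    return active_conda_env(environ) == EXPECTED_ENV


if __name__ == "__main__":
    if check_env():
        print(f"OK: conda environment '{EXPECTED_ENV}' is active.")
    else:
        print(f"Active environment is {active_conda_env()!r}; "
              f"run: conda activate {EXPECTED_ENV}")
```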

Alternatively, Compute Canada users should follow the Setting Up on Compute Canada section below.

Running the Framework

The main entry point is main.py, which provides a unified interface for training and evaluating different gaze prediction models.

Example Commands

  • List all available models:
  python main.py --list_models
  • Train a model:
  python main.py --model <model_name> --train --dataset datasets/COCO-Search18
  • Evaluate a model:
  python main.py --model <model_name> --eval --dataset datasets/COCO-Search18
  • Show model-specific help:
  python main.py --model <model_name> --help_model

The --dataset option takes the path to the dataset directory; the commands above use datasets/COCO-Search18.
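The train and evaluate commands can also be chained from a small script. The following is a minimal sketch, not part of the repository: build_command and train_then_eval are hypothetical helpers, and the only flags used are the ones listed above (--model, --train, --eval, --dataset). It assumes the script is run from the project root so that main.py resolves.

```python
import subprocess
import sys


def build_command(model_name, mode, dataset="datasets/COCO-Search18"):
    """Build the argument list for one main.py invocation.

    mode is "train" or "eval", which maps to the --train / --eval flag.
    """
    if mode not in ("train", "eval"):
        raise ValueError(f"mode must be 'train' or 'eval', got {mode!r}")
    return [sys.executable, "main.py",
            "--model", model_name,
            f"--{mode}",
            "--dataset", dataset]


def train_then_eval(model_name, dataset="datasets/COCO-Search18"):
    """Run the train command, then the eval command, stopping at the first failure."""
    for mode in ("train", "eval"):
        # check=True raises CalledProcessError if main.py exits with a non-zero code
        subprocess.run(build_command(model_name, mode, dataset), check=True)


# Example (use a name reported by `python main.py --list_models`):
# train_then_eval("<model_name>")
```

Passing the arguments as a list rather than a single shell string avoids quoting problems if the dataset path contains spaces.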

Required Files

Due to storage constraints, some required files are hosted on Compute Canada. Please place them in exactly the same directory structure as referenced in the code. For example:

ActiveVisionPortal/
├── datasets/
│   ├── COCO-Search18/
│   └── ...
└── models/
    ├── model 1/
    │   ├── checkpoint(s)/
    │   ├── data/
    │   ├── pretrained_models/ (if applicable)
    │   ├── model/
    │   ├── entry.py
    │   └── ...
    ├── model 2/
    └── ...
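Before training, it can be useful to check that the files landed where the tree above expects them. The sketch below is an illustration only: missing_paths is a hypothetical helper, and it checks just the entries named in the tree (the dataset directory and, per model, the model directory and its entry.py). Which other files a given model needs is not covered here.

```python
from pathlib import Path


def missing_paths(root, model_names, dataset="COCO-Search18"):
    """Return the expected paths that do not exist under root.

    Only the entries named in the directory tree are checked: the dataset
    directory and, for each model, its directory and entry.py.
    """
    root = Path(root)
    expected = [root / "datasets" / dataset]
    for name in model_names:
        expected.append(root / "models" / name)
        expected.append(root / "models" / name / "entry.py")
    return [path for path in expected if not path.exists()]


# Example, run from the project root with your own model directory names:
# for path in missing_paths(".", ["<model_name>"]):
#     print("missing:", path)
```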

Setting Up on Compute Canada

Step 1: Prepare Project Directory

Place the entire project directory in a location of your choice on Compute Canada, keeping the directory structure shown in Required Files.

For example, you can use: ~/scratch/ActiveVisionPortal/

Step 2: Create Environment

  • Run the provided setup script:
  bash setup_env.sh
  • Go to the custom CUDA operators directory and convert the build script to Unix line endings:
  cd ~/scratch/ActiveVisionPortal/models/HAT/model/pixel_decoder/ops/
  dos2unix make.sh
  • Allocate a GPU node:
  salloc --gres=gpu:1 --mem=16G --cpus-per-task=4 --time=01:00:00
  • Inside the GPU shell, compile the operators, then leave the allocation:
  sh make.sh
  exit
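The dos2unix step converts Windows-style CRLF line endings to Unix LF, which matters because a shell script with CRLF endings can fail to run. Whether a given copy of make.sh needs the conversion can be checked with a sketch like the following; has_crlf is a hypothetical helper and not part of the repository.

```python
from pathlib import Path


def has_crlf(path):
    """Return True if the file contains any Windows-style CRLF line ending."""
    # Read as bytes so Python's newline translation does not hide the \r characters.
    return b"\r\n" in Path(path).read_bytes()


# Example, run from the ops/ directory:
# print(has_crlf("make.sh"))
```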

Step 3: Running

Navigate to your project root (i.e., the directory containing main.py):

  cd ~/scratch/ActiveVisionPortal

Then follow the instructions in the Running the Framework section above to train or evaluate a model.

About

This project was developed as part of Google Summer of Code 2025 with the organization INCF.
