Authors: H. Umut Suluhan, Abdullah Enes Doruk, Hasan F. Ates, and Bahadir K. Gunturk
This repository is the official PyTorch implementation of "HSTR-Net: Reference Based Video Super-resolution with Dual Cameras"
High-spatio-temporal resolution (HSTR) video recording plays a crucial role in enhancing various imagery tasks that require fine-detailed information. State-of-the-art cameras provide this required high frame-rate and high spatial resolution together, albeit at a high cost. To alleviate this issue, this paper proposes a dual camera system for the generation of HSTR video using reference-based superresolution (RefSR). One camera captures high spatial resolution low frame rate (HSLF) video while the other captures low spatial resolution high frame rate (LSHF) video simultaneously for the same scene. A novel deep learning architecture is proposed to fuse HSLF and LSHF video feeds and synthesize HSTR video frames. The proposed model combines optical flow estimation and (channel-wise and spatial) attention mechanisms to capture the fine motion and complex dependencies between frames of the two video feeds. Simulations show that the proposed model provides significant improvement over existing reference-based SR techniques in terms of PSNR and SSIM metrics. The method also exhibits sufficient frames per second (FPS) for aerial monitoring when deployed on a power-constrained drone equipped with dual cameras.
- Python 3.8, PyTorch >= 1.7.1
- CUDA 11.x
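
The requirements above can be checked from inside an environment with a short script. This is a minimal sketch, not part of the repository: `version_tuple`, `meets_minimum`, and `report_environment` are hypothetical helper names, and the script only reports what it finds instead of enforcing anything.

```python
import importlib.util
import re
import sys


def version_tuple(version):
    """Turn a string such as '1.13.1+cu117' into (1, 13, 1)."""
    core = version.split("+")[0]  # drop local build suffixes like '+cu117'
    parts = []
    for piece in core.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def meets_minimum(version, minimum="1.7.1"):
    """Return True if `version` is at least `minimum` (the PyTorch floor listed above)."""
    return version_tuple(version) >= version_tuple(minimum)


def report_environment():
    """Print the Python version and, if PyTorch is installed, its version and CUDA status."""
    print("Python:", sys.version.split()[0])
    if importlib.util.find_spec("torch") is None:
        print("PyTorch: not installed")
        return
    import torch

    status = "ok" if meets_minimum(torch.__version__) else "older than 1.7.1"
    print("PyTorch:", torch.__version__, f"({status})")
    print("CUDA available:", torch.cuda.is_available())
    print("CUDA build:", torch.version.cuda)


if __name__ == "__main__":
    report_environment()
```

Running the script after activating the environment prints one line per requirement, which makes a mismatched PyTorch or CUDA build easy to spot before training.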
a. Create environment

conda env create -f environment.yml

b. Activate environment

conda activate hstrnet

Download the Vimeo, Vizdrone, and MAMI datasets.
HSTRNet
├── data/vimeo
│   ├── sequences
│   ├── tri_testlist.txt
│   ├── tri_trainlist.txt
│   ├── vimeo_triplet_lr
│   ├── x4_downsampled_sequences
│   └── x8_downsampled_sequences
├── data/vizdrone
│   ├── normal
│   └── upsampled
└── data/MAMI
    ├── test
    └── train
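
Before training, it can help to confirm that the datasets are where the layout above expects them. The following is a rough sketch under two assumptions: every listed entry sits directly under its dataset directory, and the script is run from the repository root. `EXPECTED_LAYOUT` and `missing_dataset_paths` are hypothetical names, not part of the repository.

```python
from pathlib import Path

# Expected entries, copied from the layout listed above.
# Assumption: each entry is a direct child of its dataset directory.
EXPECTED_LAYOUT = {
    "data/vimeo": [
        "sequences",
        "tri_testlist.txt",
        "tri_trainlist.txt",
        "vimeo_triplet_lr",
        "x4_downsampled_sequences",
        "x8_downsampled_sequences",
    ],
    "data/vizdrone": ["normal", "upsampled"],
    "data/MAMI": ["test", "train"],
}


def missing_dataset_paths(repo_root="."):
    """Return the expected dataset paths that do not exist under repo_root."""
    root = Path(repo_root)
    missing = []
    for dataset_dir, entries in EXPECTED_LAYOUT.items():
        for entry in entries:
            path = root / dataset_dir / entry
            if not path.exists():
                missing.append(path)
    return missing


if __name__ == "__main__":
    for path in missing_dataset_paths():
        print("missing:", path)
```

If the script prints nothing, every expected file and directory was found.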
- By default, we assume you have downloaded the checkpoint files into the pretrained directory.
- To download all checkpoints, you can run pretrained.sh.

bash pretrained.sh

| Dataset | PSNR | Ifnet | Unet | Attention | Contextnet |
|---|---|---|---|---|---|
| Vimeo | 38.45 | Link | Link | Link | Link |
| Vizdrone | 33.30 | Link | Link | Link | Link |
| MAMI | 25.34 | Link | Link | Link | Link |
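
For readers unfamiliar with the PSNR column, the metric is 10 · log10(MAX² / MSE), reported in dB, where MSE is the mean squared error between the reference and reconstructed frames. The sketch below computes it over flat pixel sequences using only the standard library. It is illustrative and is not the repository's evaluation code, which may differ in value range, color space, or averaging across frames; `psnr` and `max_value` are names chosen for this example.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same number of pixels")
    if not reference:
        raise ValueError("inputs must not be empty")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # Every pixel is off by 10 gray levels on an 8-bit scale.
    print(round(psnr([100] * 4, [110] * 4), 2))
```

Higher values mean the reconstruction is closer to the reference; for frames normalized to [0, 1], pass `max_value=1.0`.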
a. Vimeo Training

python train/vimeo.py --dataset data/vimeo --epoch 100 --lr 0.0001 --train_bs 16 --val_bs 4 --workers 4

b. Vizdrone Training

python train/vizdrone.py --dataset data/vizdrone --epoch 100 --lr 0.0001 --train_bs 16 --val_bs 4 --workers 4

c. MAMI Training

python train/mami.py --dataset data/MAMI --epoch 100 --lr 0.0001 --train_bs 16 --val_bs 4 --workers 4

@article{suluhan2025hstr,
title={Hstr-net: reference based video super-resolution with dual cameras},
author={Suluhan, H Umut and Doruk, Abdullah Enes and Ates, Hasan F and Gunturk, Bahadir K},
journal={Machine Vision and Applications},
volume={36},
number={3},
pages={69},
year={2025},
publisher={Springer}
}


