Modality-Guided Dynamic Graph Fusion and Temporal Diffusion for Self-Supervised RGB-T Tracking
- Clone the repository locally.
- Create a conda environment:
conda env create -f env.yaml
conda activate GDSTrack
pip install --upgrade git+https://github.com/got-10k/toolkit.git@master
# (You can use `pipreqs ./root` to analyse the requirements of this project.)
- Prepare the training data: we use LMDB to store the training data. Please check `./data/parse_<DATA_NAME>` and generate the LMDB datasets (a rough sketch of the packing step follows this list).
- Specify the dataset paths in `./lib/register/paths.py`.
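The repository's actual parsing logic lives in `./data/parse_<DATA_NAME>`; the snippet below is only a minimal sketch of what packing image files into an LMDB can look like. The directory layout, key scheme, file extensions, and `map_size` are assumptions, not the project's real convention.

```python
# Minimal sketch of packing an image-sequence folder into LMDB.
# Folder layout, key naming, and map_size are assumptions; the repository's
# real parsers are in ./data/parse_<DATA_NAME>.
import os
import lmdb

def pack_sequences_to_lmdb(data_root, lmdb_path, map_size=1 << 40):
    """Store every image file under data_root as raw bytes keyed by its relative path."""
    env = lmdb.open(lmdb_path, map_size=map_size)
    with env.begin(write=True) as txn:
        for dirpath, _, filenames in os.walk(data_root):
            for name in sorted(filenames):
                if not name.lower().endswith(('.jpg', '.png', '.bmp')):
                    continue
                full_path = os.path.join(dirpath, name)
                key = os.path.relpath(full_path, data_root)  # e.g. 'seq001/infrared/00001.jpg'
                with open(full_path, 'rb') as f:
                    txn.put(key.encode('utf-8'), f.read())
    env.close()

# Example: pack_sequences_to_lmdb('/path/to/LasHeR', '/path/to/lmdb/LasHeR.lmdb')
```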
All of our models are trained on a single machine with two RTX3090 GPUs. For distributed training on a single node with 2 GPUs:
- MAT pre-training
python -m torch.distributed.launch --nproc_per_node=2 train.py --experiment=translate_template --train_set=common_pretrain
- Tracker training
Modify `cfg.model.backbone.weights` in `./config/cfg_translation_track.py` to point to the last checkpoint from MAT pre-training (see the sketch after the command below).
python -m torch.distributed.launch --nproc_per_node=2 train.py --experiment=translate_track --train_set=common
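For reference, a minimal sketch of that config edit; the checkpoint path is a placeholder for your own pre-training output, and only the `cfg.model.backbone.weights` key and the config file name come from this README:

```python
# In ./config/cfg_translation_track.py (excerpt).
# The path below is a placeholder; substitute the last checkpoint produced by MAT pre-training.
cfg.model.backbone.weights = './checkpoints/translate_template/checkpoint_latest.pth'
```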
We have released the evaluation results on GTOT, RGB-T234, LasHeR, and VTUAV-ST in results.
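These benchmarks are commonly scored with success (IoU-based) and precision (center-error) curves. The snippet below is a generic sketch of that computation from per-frame predicted and ground-truth boxes; it is not the repository's own evaluation code, and the `[x, y, w, h]` box format and function names are assumptions.

```python
# Generic success/precision computation for one sequence (not the repository's evaluation script).
# Boxes are assumed to be [x, y, w, h] arrays of shape (N, 4).
import numpy as np

def iou(pred, gt):
    """Element-wise IoU between two (N, 4) arrays of [x, y, w, h] boxes."""
    x1 = np.maximum(pred[:, 0], gt[:, 0])
    y1 = np.maximum(pred[:, 1], gt[:, 1])
    x2 = np.minimum(pred[:, 0] + pred[:, 2], gt[:, 0] + gt[:, 2])
    y2 = np.minimum(pred[:, 1] + pred[:, 3], gt[:, 1] + gt[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    union = pred[:, 2] * pred[:, 3] + gt[:, 2] * gt[:, 3] - inter
    return inter / np.maximum(union, 1e-12)

def success_precision(pred, gt):
    """Return (success score averaged over IoU thresholds 0..1, precision at a 20-pixel center error)."""
    overlaps = iou(pred, gt)
    thresholds = np.linspace(0, 1, 21)
    success = np.mean([np.mean(overlaps > t) for t in thresholds])
    pred_center = pred[:, :2] + pred[:, 2:] / 2
    gt_center = gt[:, :2] + gt[:, 2:] / 2
    dist = np.linalg.norm(pred_center - gt_center, axis=1)
    precision = np.mean(dist <= 20)
    return success, precision
```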
@misc{GDSTrack_2025_IJCAI,
title={Modality-Guided Dynamic Graph Fusion and Temporal Diffusion for Self-Supervised RGB-T Tracking},
author={Shenglan Li and Rui Yao and Yong Zhou and Hancheng Zhu and Kunyang Sun and Bing Liu and Zhiwen Shao and Jiaqi Zhao},
year={2025},
eprint={2505.03507},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2505.03507},
}
- Thanks to the great works MAT, USOT, and GMMT.
- For data augmentation, we use Albumentations.
This work is released under the GPL 3.0 license. Please see the LICENSE file for more information.