This is the official repository of TAM-VT. We currently support training and evaluation on the VOST dataset. This work is under submission; please stay tuned!
Tested on Ubuntu 20.04 with Python 3.9, CUDA 11.3, and PyTorch 1.12.1.
- conda create -n vost python=3.9
- conda activate vost
- conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch
- pip install -r requirements.txt
- python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
Please visit the official VOST website to download the dataset.
You can also download the training and validation videos and annotations via this link.
Please refer to the sbatch script in scripts/vost_train.sh for more details.
- First, modify the config to match your dataset path.
- Download the weights pre-trained on static images and put the checkpoint in checkpoints/.
- Run the training script:
bash scripts/vost_train.sh
Please refer to the sbatch script in scripts/vost_eval.sh for more details.
- Download the model weights trained on VOST and put the checkpoint in checkpoints/.
- Run the evaluation script:
bash scripts/vost_eval.sh
To evaluate the predictions with the VOST evaluation scripts, we first need to export them in the VOST format.
- First, add these flags to the eval script:
eval=True
eval_flags.plot_pred=True
eval_flags.vost.vis_only_no_cache=True
eval_flags.vost.vis_only_pred_mask=True
This runs inference and plots all predicted results. The results are saved in the configured "output" directory with the following file structure.
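As a sketch of where these overrides go, the eval script would end up invoking the trainer roughly like this (the entry-point name `main.py` is hypothetical; the flag names are the ones listed above, passed as command-line overrides inside scripts/vost_eval.sh):

```shell
# Hypothetical invocation inside scripts/vost_eval.sh; only the four
# eval flags are taken from this README, the rest is a placeholder.
python main.py \
    eval=True \
    eval_flags.plot_pred=True \
    eval_flags.vost.vis_only_no_cache=True \
    eval_flags.vost.vis_only_pred_mask=True
```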
tamvt_eval
└── plot_pred
    ├── 555_tear_aluminium_foil
    │   └── object_id_1
    │       ├── frames
    │       │   ├── frame00012.png
    │       │   ├── ...
    │       │   └── frame00600.png
    │       └── reference_crop.jpg
    ├── 556_cut_tomato
    │   ...
    └── 10625_knead_dough
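To sanity-check a prediction directory before running the evaluator, a minimal sketch (using the example video and object names from the tree above, which are illustrative only) can recreate the layout and confirm that each object has a frames/ folder:

```shell
# Sketch: build the expected layout in a temp dir and verify the per-object
# frames/ directories that the VOST evaluation scripts read from.
root="$(mktemp -d)/tamvt_eval/plot_pred"
mkdir -p "$root/555_tear_aluminium_foil/object_id_1/frames"
touch "$root/555_tear_aluminium_foil/object_id_1/frames/frame00012.png"
touch "$root/555_tear_aluminium_foil/object_id_1/reference_crop.jpg"
# Count the frames/ directories under the prediction root.
n_frames_dirs=$(find "$root" -type d -name frames | wc -l)
echo "frames dirs: $n_frames_dirs"
```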
- To get the evaluation score, please follow the protocol in the VOST repo.
python3 evaluation/evaluation_method.py --set val --dataset_path [PATH_TO_VOST_DATASET] --results_path [PATH_TO_PRED_DIR]
Example:
python3 evaluation/evaluation_method.py --set val --dataset_path ../datasets/VOST/VOST/ --results_path ./checkpoints/tamvt_eval/plot_pred
You might need to run
pip install pandas
first. Please use a machine with at least 32 GB of memory and an 8+ core CPU, or the evaluation will hit the memory limit and be terminated.
If you find TAM-VT useful for your research and applications, please cite using this BibTeX:
@misc{goyal2023tamvt,
title={TAM-VT: Transformation-Aware Multi-scale Video Transformer for Segmentation and Tracking},
author={Raghav Goyal and Wan-Cyuan Fan and Mennatullah Siam and Leonid Sigal},
year={2023},
eprint={2312.08514},
archivePrefix={arXiv},
primaryClass={cs.CV}
}