Official Code for the following paper:
X. Wang, A. Katsenou, and D. Bull. ReLaX-VQA: Residual Fragment and Layer Stack Extraction for Enhancing Video Quality Assessment
Our related paper, “Frame Differences Matter in Quality Assessment of Compressed Videos”, was accepted at the IEEE 25th International Conference on Digital Signal Processing (DSP 2025).
Try our online demo on Hugging Face 🤗: https://huggingface.co/spaces/xinyiW915/ReLaX-VQA
We evaluate the performance of ReLaX-VQA on four datasets. ReLaX-VQA comes in three versions, differing in the training and testing strategy:

- ReLaX-VQA: trained and tested on each dataset with an 80%-20% random split.
- ReLaX-VQA (w/o FT): trained on LSVQ; the frozen model was tested on the other datasets without fine-tuning.
- ReLaX-VQA (w/ FT): trained on LSVQ; the pre-trained model was then fine-tuned on each of the other datasets.
**SRCC:**

| Model | CVD2014 | KoNViD-1k | LIVE-VQC | YouTube-UGC |
|---|---|---|---|---|
| ReLaX-VQA | 0.8643 | 0.8535 | 0.7655 | 0.8014 |
| ReLaX-VQA (w/o FT) | 0.7845 | 0.8312 | 0.7664 | 0.8104 |
| ReLaX-VQA (w/ FT) | 0.8974 | 0.8720 | 0.8468 | 0.8469 |
**PLCC:**

| Model | CVD2014 | KoNViD-1k | LIVE-VQC | YouTube-UGC |
|---|---|---|---|---|
| ReLaX-VQA | 0.8895 | 0.8473 | 0.8079 | 0.8204 |
| ReLaX-VQA (w/o FT) | 0.8336 | 0.8427 | 0.8242 | 0.8354 |
| ReLaX-VQA (w/ FT) | 0.9294 | 0.8668 | 0.8876 | 0.8652 |
More results can be found in `reported_result.ipynb`.
The figure below shows an overview of the proposed ReLaX-VQA framework. The architectures of ResNet-50 Stack (I) and ResNet-50 Pool (II) are given in Fig. 2 of the paper.
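As a rough illustration of the residual-fragment idea (differencing consecutive frames and keeping the most active patches, where motion and compression artefacts concentrate), here is a minimal sketch; the patch size, top-k count, and selection rule are assumptions for illustration, not the repo's exact sampling:

```python
import numpy as np

def residual_fragments(prev_frame, next_frame, patch=32, top_k=16):
    """Illustrative residual-fragment sampling (not the repo's exact logic):
    difference two consecutive frames, then keep the patches of the next
    frame whose residual energy is largest."""
    residual = np.abs(next_frame.astype(np.float32) - prev_frame.astype(np.float32))
    h, w = residual.shape[:2]
    scores, coords = [], []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            scores.append(residual[y:y + patch, x:x + patch].sum())
            coords.append((y, x))
    keep = np.argsort(scores)[-top_k:]  # indices of the highest-energy patches
    return [next_frame[coords[i][0]:coords[i][0] + patch,
                       coords[i][1]:coords[i][1] + patch] for i in keep]

# toy usage with random "frames"
f0 = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)
f1 = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)
frags = residual_fragments(f0, f1)
print(len(frags), frags[0].shape)  # 16 (32, 32, 3)
```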
The repository is built with Python 3.10.14 and can be installed via the following commands:

```shell
git clone https://github.com/xinyiW915/ReLaX-VQA.git
cd ReLaX-VQA
conda create -n relaxvqa python=3.10.14 -y
conda activate relaxvqa
pip install -r requirements.txt
```
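Optionally, before running the GPU scripts, you can verify that PyTorch (assumed here to be installed by `requirements.txt`) can see your GPU:

```shell
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```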
The corresponding raw video datasets can be downloaded from the following sources:
LSVQ, KoNViD-1k, LIVE-VQC, YouTube-UGC, CVD2014.
The metadata for the UGC datasets used in our experiments is available under `./metadata`.

Once downloaded, place the datasets in `./ugc_original_videos` or any other storage location of your choice. Ensure that the `video_path` in the `get_video_paths` function inside `main_relaxvqa_feats.py` is updated accordingly.
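For example, the change might look like the following; the dataset keys and paths below are placeholders, and the actual function in `main_relaxvqa_feats.py` may be structured differently:

```python
# in main_relaxvqa_feats.py -- illustrative only; the real function may differ
def get_video_paths(video_type):
    # map each dataset name to wherever you stored its raw videos
    video_roots = {
        'youtube_ugc': './ugc_original_videos/youtube_ugc/',
        'konvid_1k': './ugc_original_videos/konvid_1k/',
        'live_vqc': './ugc_original_videos/live_vqc/',
        'cvd_2014': './ugc_original_videos/cvd_2014/',
    }
    return video_roots[video_type]
```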
Run the pre-trained models to evaluate the quality of a single video.
The model weights provided in `./model` are the best-performing checkpoints saved during training.
To evaluate the quality of a specific video, run the following command:
```shell
python demo_test_gpu.py \
  -device <DEVICE> \
  -train_data_name <TRAIN_DATA_NAME> \
  -is_finetune <True/False> \
  -save_path <MODEL_PATH> \
  -video_type <DATASET_NAME> \
  -video_name <VIDEO_NAME> \
  -framerate <FRAMERATE>
```
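For example, to score a YouTube-UGC video with the LSVQ-trained model without fine-tuning (`<VIDEO_NAME>` stays a placeholder for your own file, and the frame rate shown is only an example value):

```shell
python demo_test_gpu.py -device gpu -train_data_name lsvq_train -is_finetune False -save_path ../model/ -video_type youtube_ugc -video_name <VIDEO_NAME> -framerate 30
```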
Or simply try our demo video by running:

```shell
python demo_test_gpu.py
```
Steps to train ReLaX-VQA from scratch on different datasets.
Run the following command to extract features from videos:
```shell
python main_relaxvqa_feats.py -device gpu -video_type youtube_ugc
```
Train our model using extracted features:
```shell
python model_regression_simple.py -data_name youtube_ugc -feature_path ../features/ -save_path ../model/
```
For LSVQ, train the model using:
```shell
python model_regression.py -data_name lsvq_train -feature_path ../features/ -save_path ../model/
```
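For intuition, here is a minimal sketch of the kind of feature-to-score regression these scripts perform; the feature dimension, head sizes, and loss below are toy placeholders, not the repo's actual model:

```python
import torch
import torch.nn as nn

# Toy stand-in for a quality-regression head (NOT the repo's actual architecture):
# map a pre-extracted per-video feature vector to a single quality score.
model = nn.Sequential(
    nn.Linear(4096, 128),  # 4096 is a placeholder feature dimension
    nn.ReLU(),
    nn.Linear(128, 1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

feats = torch.randn(8, 4096)   # batch of 8 extracted video features
mos = torch.rand(8, 1) * 100   # toy MOS labels
loss = nn.functional.mse_loss(model(feats), mos)
loss.backward()
optimizer.step()
print(loss.item())
```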
To fine-tune the pre-trained model on a new dataset, modify `train_data_name` to match the dataset used for pre-training, and `test_data_name` to specify the dataset for fine-tuning, then run:

```shell
python model_finetune.py
```
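For instance, to take the LSVQ-trained weights and fine-tune on KoNViD-1k, the two names might be set as follows (the exact variable layout inside `model_finetune.py` may differ, and `konvid_1k` is an assumed dataset key):

```python
# in model_finetune.py -- illustrative; the script's actual layout may differ
train_data_name = 'lsvq_train'  # dataset the pre-trained weights come from
test_data_name = 'konvid_1k'    # dataset to fine-tune (and evaluate) on
```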
A detailed analysis of the different components of ReLaX-VQA. Key techniques used in ReLaX-VQA:

- Fragmentation with DNN layer stacking: `python feature_fragment_layerstack.py`
- Fragmentation with DNN layer pooling: `python feature_fragment_pool.py`
- Frame with DNN layer stacking: `python feature_layerstack.py`
- Frame with DNN layer pooling: `python feature_pool.py`
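To make the stack-vs-pool distinction concrete, here is a minimal sketch using forward hooks on a torchvision ResNet-50; it illustrates the two fusion styles only and is not the repo's exact feature pipeline:

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet50, ResNet50_Weights

# Capture the outputs of the four residual stages with forward hooks.
model = resnet50(weights=ResNet50_Weights.DEFAULT).eval()
acts = {}
for name in ('layer1', 'layer2', 'layer3', 'layer4'):
    getattr(model, name).register_forward_hook(
        lambda mod, inp, out, name=name: acts.__setitem__(name, out))

x = torch.randn(1, 3, 224, 224)  # one toy frame (or fragment)
with torch.no_grad():
    model(x)

# "Pool": global-average-pool each stage, then concatenate the vectors.
pooled = torch.cat([acts[n].mean(dim=(2, 3)) for n in acts], dim=1)

# "Stack": resize each stage's maps to a common spatial size and
# concatenate along the channel axis, preserving spatial structure.
size = acts['layer4'].shape[-2:]
stacked = torch.cat([F.interpolate(acts[n], size=size) for n in acts], dim=1)

print(pooled.shape)   # torch.Size([1, 3840])        (256+512+1024+2048)
print(stacked.shape)  # torch.Size([1, 3840, 7, 7])
```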
We exclude greyscale videos in our experiments. You can use `check_greyscale.py` to filter out greyscale videos from the VQA dataset you want to use:

```shell
python check_greyscale.py
```
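A simple way to detect a greyscale video (not necessarily the script's exact check) is to test whether the colour channels of a few sampled frames are identical:

```python
import cv2
import numpy as np

def is_greyscale_video(path, n_samples=10):
    """Flag a video as greyscale if sampled frames have B == G == R everywhere.
    A simple heuristic, not necessarily check_greyscale.py's exact logic."""
    cap = cv2.VideoCapture(path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    for idx in np.linspace(0, max(total - 1, 0), n_samples, dtype=int):
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if not ok:
            continue
        b, g, r = cv2.split(frame)
        if not (np.array_equal(b, g) and np.array_equal(g, r)):
            cap.release()
            return False  # found genuine colour content
    cap.release()
    return True
```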
For easy extraction of metadata from your VQA dataset, use:
```shell
python extract_metadata_NR.py
```
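The basic stream properties such a script relies on (resolution, frame rate, frame count) can be read with OpenCV; a minimal sketch, not `extract_metadata_NR.py`'s actual output format:

```python
import cv2

def video_metadata(path):
    """Read basic stream properties; illustrative only."""
    cap = cv2.VideoCapture(path)
    meta = {
        'width': int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
        'height': int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)),
        'framerate': cap.get(cv2.CAP_PROP_FPS),
        'frame_count': int(cap.get(cv2.CAP_PROP_FRAME_COUNT)),
    }
    cap.release()
    return meta
```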
This work was funded by the UKRI MyWorld Strength in Places Programme (SIPF00006/1) as part of my PhD study.
If you find this paper and the repo useful, please cite our paper 😊:
```bibtex
@article{wang2024relax,
  title={ReLaX-VQA: Residual Fragment and Layer Stack Extraction for Enhancing Video Quality Assessment},
  author={Wang, Xinyi and Katsenou, Angeliki and Bull, David},
  year={2024},
  eprint={2407.11496},
  archivePrefix={arXiv},
  primaryClass={eess.IV},
  url={https://arxiv.org/abs/2407.11496}
}

@inproceedings{11075040,
  author={Wang, Xinyi and Katsenou, Angeliki and Bull, David},
  booktitle={2025 25th International Conference on Digital Signal Processing (DSP)},
  title={Frame Differences Matter in Quality Assessment of Compressed Videos},
  year={2025},
  pages={1-5},
  doi={10.1109/DSP65409.2025.11075040}
}
```
Xinyi WANG, xinyi.wang@bristol.ac.uk