This repository contains the implementation code for the paper:
Exploring Spatial Frequency Information for Enhanced Video Prediction Quality
API/ contains dataloaders and metrics.
cls/ contains the implementation of FATranslator.
model_build.py contains the SDFNet model.
run.py is the executable Python file with possible arguments.
experiment_cfg.py is the core file for model training, validation, and testing.
configs.py is the parameter configuration.
TDFL.py contains the implementation of 3DFL.
We first introduce a novel objective metric called 3D frequency loss (3DFL), based on the 3D fast Fourier transform (3DFFT). As a DNN model-free and training-free metric, 3DFL provides an objective and rational approach to evaluating the similarity and absolute distance between videos. We provide visual comparisons of several traditional metrics on the KTH dataset.
This validation demonstrates that 3DFL, as a metric measuring absolute errors, has perceptual capabilities similar to LPIPS for assessing natural video quality, confirming its effectiveness as a reliable metric for evaluating video prediction performance. You can find more examples in the Multimedia_Files/1Metric_Vaildation folder.
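To make the idea of a frequency-domain video distance concrete, here is a minimal sketch, not the code in TDFL.py. It assumes that the distance is the mean absolute difference between the 3D FFT spectra of two clips, which may differ from the exact formulation in the paper. The function name `frequency_loss_3d`, the `(T, H, W)` grayscale layout, and the random test clip are all hypothetical and chosen only for illustration.

```python
import numpy as np


def frequency_loss_3d(pred, target):
    """Mean absolute distance between the 3D spectra of two videos.

    Both inputs are arrays of shape (T, H, W): frames, height, width.
    Illustrative stand-in only; this is not the repository's TDFL.py.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if pred.ndim != 3 or pred.shape != target.shape:
        raise ValueError("expected two arrays of identical shape (T, H, W)")

    # 3D FFT over the temporal axis and both spatial axes.
    # norm="ortho" keeps the spectrum on the same energy scale as the input.
    pred_freq = np.fft.fftn(pred, axes=(0, 1, 2), norm="ortho")
    target_freq = np.fft.fftn(target, axes=(0, 1, 2), norm="ortho")

    # Magnitude of the complex difference, averaged over all frequencies.
    return float(np.mean(np.abs(pred_freq - target_freq)))


rng = np.random.default_rng(0)
video = rng.random((10, 64, 64))  # hypothetical 10-frame grayscale clip
noisy = video + 0.1 * rng.standard_normal(video.shape)

print(frequency_loss_3d(video, video))  # identical clips give 0.0
print(frequency_loss_3d(video, noisy))  # positive for a distorted clip
```

Because the transform covers the temporal axis as well as the spatial ones, errors in motion show up in the spectrum alongside errors in appearance, and no trained network is needed to compute the value.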
We provide the environment requirements file for easy reproduction:
conda create -n SDFNet python=3.7
conda activate SDFNet
pip install -r requirements.txt
Our model has been evaluated on the following four datasets:
We provide a download script for the Moving MNIST dataset:
cd ./data/moving_mnist
bash download_mmnist.sh
This example provides the detailed implementation on Moving MNIST. You can reproduce our work with the following commands:
conda activate SDFNet
python run.py
Please note that the model training must strictly adhere to the hyperparameter settings provided in our paper; otherwise, reproducibility may not be guaranteed.
SDFNet predicts more accurate actions with less motion blurring compared to other models. Here are some qualitative visualization examples:
More examples are available in the Multimedia_Files/2KTH_Visualization folder.
More examples are available in the Multimedia_Files/3Human_Visualization folder.




