Paper • Supplementary
Welcome to DeepfakeBench, your one-stop solution for deepfake detection! Here are some key features of our platform:
- **Unified Platform**: DeepfakeBench presents the first comprehensive benchmark for deepfake detection, resolving the lack of standardization and uniformity in this field.
- **Data Management**: DeepfakeBench provides a unified data management system that ensures consistent input across all detection models.
- **Integrated Framework**: DeepfakeBench offers an integrated framework for implementing state-of-the-art detection methods.
- **Standardized Evaluations**: DeepfakeBench introduces standardized evaluation metrics and protocols to enhance the transparency and reproducibility of performance evaluations.
- **Extensive Analysis and Insights**: DeepfakeBench facilitates extensive analysis from various perspectives, providing new insights to inspire the development of new technologies.
DeepfakeBench has the following features:
**Detectors (15 detectors):**
- 5 Naive Detectors: Xception, MesoNet, MesoInception, CNN-Aug, EfficientNet-B4
- 7 Spatial Detectors: Capsule, DSP-FWA, Face X-ray, FFD, CORE, RECCE, UCF
- 3 Frequency Detectors: F3Net, SPSL, SRM

**Datasets (9 datasets):** FaceForensics++, FaceShifter, DeepfakeDetection, Deepfake Detection Challenge (Preview), Deepfake Detection Challenge, Celeb-DF-v1, Celeb-DF-v2, DeepForensics-1.0, UADFV
DeepfakeBench will be continuously updated to track the latest advances in deepfake detection. The implementations of more detection methods, as well as their evaluations, are on the way. You are welcome to contribute your detection methods to DeepfakeBench.
(option 1) You can run the following script to configure the necessary environment:
```bash
git clone git@github.com:SCLBD/DeepfakeBench.git
cd DeepfakeBench
conda create -n DeepfakeBench python=3.7.2
conda activate DeepfakeBench
sh install.sh
```
(option 2) You can also use the supplied `Dockerfile` to set up the entire environment with Docker. This lets you run all the code in the benchmark without encountering environment-related problems. Simply run the following commands to enter the Docker environment:
```bash
docker build -t deepfakebench .
docker run --gpus all -itd -v /path/to/this/repository:/app/ --shm-size 64G deepfakebench
```

(Docker requires lowercase image names, so the tag is written as `deepfakebench`.)
Note: we used Docker version `19.03.14` in our setup. We highly recommend using this version for consistency, but later versions of Docker may also be compatible.
All datasets used in DeepfakeBench can be downloaded from their original websites or repositories. For convenience, we also provide the data used in our research. All downloaded datasets have been organized and arranged in the same folder, so users can easily access and download the preprocessed data, including original videos and corresponding mask videos, from the links below:
| Dataset Name | Download Link (Baidu Netdisk) | Extract Code | Notes |
|---|---|---|---|
| Celeb-DF-v1 | Download | wf2u | - |
| Celeb-DF-v2 | Download | hqu1 | - |
| FaceForensics++, DeepfakeDetection, FaceShifter | Download | mvgi | c23 version only |
| UADFV | Download | r0gc | - |
| Deepfake Detection Challenge (Preview) | Download | j1pq | - |
| Deepfake Detection Challenge | Download | aktc | - |
| DeepForensics-1.0 | Coming Soon | - | - |
| FaceForensics++ (c40) | Coming Soon | - | - |
Copyright of the above datasets belongs to their original providers.
Please note: we have encrypted and compressed the datasets, so you will need to enter the password `123456` to decompress each dataset file. Alternatively, you can run `./unzip.sh` to decompress all compressed files (currently limited to `.zip` format) in the `./datasets` folder.
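If you prefer to script the decompression yourself, here is a minimal Python sketch that walks `./datasets` and extracts every `.zip` with the password above. It assumes the archives use standard ZipCrypto encryption; the stdlib `zipfile` module cannot open AES-encrypted archives.

```python
import zipfile
from pathlib import Path

PASSWORD = b"123456"  # password given above
DATASET_DIR = Path("./datasets")

# Extract every .zip archive in ./datasets next to the archive itself.
for archive in DATASET_DIR.rglob("*.zip"):
    print(f"Extracting {archive} ...")
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(path=archive.parent, pwd=PASSWORD)
```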
It is important to note that the Deepfake Detection Challenge (DFDC) dataset can be very large, so you should decompress it manually. Specifically, first navigate to the `./datasets` folder where the DFDC dataset is located and decompress each `train_part.zip` file, entering the password `123456` when prompted. Once all the `train_part.zip` files are decompressed, you will see folders named `dfdc_train_part_0`, `dfdc_train_part_1`, ..., `dfdc_train_part_49`. Then enter the `meta_files` folder inside the `DFDC` folder, copy the `metadata.json` file, and paste it into each corresponding `dfdc_train_part_X` folder. For example, copy `metadata.json` from the `meta_files/dfdc_train_part_0/` folder and paste it into `dfdc_train_part_0`.
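Copying the metadata by hand is tedious for 50 parts, so here is a minimal sketch that automates it, assuming the layout described above (`meta_files/dfdc_train_part_X/metadata.json` alongside the extracted `dfdc_train_part_X` folders):

```python
import shutil
from pathlib import Path

DFDC_ROOT = Path("./datasets/DFDC")  # adjust if your datasets live elsewhere

# Copy each part's metadata.json from meta_files/ into the matching folder.
for part in range(50):
    src = DFDC_ROOT / "meta_files" / f"dfdc_train_part_{part}" / "metadata.json"
    dst = DFDC_ROOT / f"dfdc_train_part_{part}" / "metadata.json"
    shutil.copyfile(src, dst)
    print(f"Copied {src} -> {dst}")
```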
Finally, the directory structure of DFDC should look like this:
```
DFDC
├── dfdc_train_part_0
│   ├── metadata.json
│   └── *.mp4
├── dfdc_train_part_1
│   ├── metadata.json
│   └── *.mp4
├── ...
└── dfdc_train_part_49
```
Other detailed information about the datasets used in DeepfakeBench is summarized below:
| Dataset | Real Videos | Fake Videos | Total Videos | Rights Cleared | Total Subjects | Synthesis Methods | Perturbations | Original Repository |
|---|---|---|---|---|---|---|---|---|
| FaceForensics++ | 1000 | 4000 | 5000 | NO | N/A | 4 | 2 | Hyper-link |
| FaceShifter | 1000 | 1000 | 2000 | NO | N/A | 1 | - | Hyper-link |
| DeepfakeDetection | 363 | 3000 | 3363 | YES | 28 | 5 | - | Hyper-link |
| Deepfake Detection Challenge (Preview) | 1131 | 4119 | 5250 | YES | 66 | 2 | 3 | Hyper-link |
| Deepfake Detection Challenge | 23654 | 104500 | 128154 | YES | 960 | 8 | 19 | Hyper-link |
| CelebDF-v1 | 408 | 795 | 1203 | NO | N/A | 1 | - | Hyper-link |
| CelebDF-v2 | 590 | 5639 | 6229 | NO | 59 | 1 | - | Hyper-link |
| DeepForensics-1.0 | 50000 | 10000 | 60000 | YES | 100 | 1 | 7 | Hyper-link |
| UADFV | 49 | 49 | 98 | NO | 49 | 1 | - | Hyper-link |
Upon downloading the datasets, please store them in the `./datasets` folder, arranged according to the directory structure outlined below:
```
datasets
├── FaceForensics++
│   ├── original_sequences
│   │   └── youtube
│   │       └── c23
│   │           └── videos
│   │               └── *.mp4
│   └── manipulated_sequences
│       ├── Deepfakes
│       │   └── c23
│       │       └── videos
│       ├── Face2Face
│       │   └── c23
│       │       └── videos
│       ├── FaceSwap
│       │   └── c23
│       │       └── videos
│       ├── NeuralTextures
│       │   └── c23
│       │       └── videos
│       ├── FaceShifter
│       │   └── c23
│       │       └── videos
│       └── DeepFakeDetection
│           └── c23
│               └── videos
│
├── Celeb-DF-v1/v2
│   ├── Celeb-synthesis
│   │   └── videos
│   ├── Celeb-real
│   │   └── videos
│   └── YouTube-real
│       └── videos
│
├── DFDCP
│   ├── method_A
│   ├── method_B
│   └── original_videos
│
├── DeeperForensics-1.0
│   ├── manipulated_videos
│   └── source_videos
│
└── ...
```
If you choose to store your datasets in a different folder, for instance `./deepfake/data`, it's important to reflect this change in the dataset path in `config.yaml` for preprocessing purposes.
For the preprocessing module, we mainly provide two scripts: preprocessing and rearrangement.
- The preprocessing script in DeepfakeBench follows a sequential workflow of face detection, alignment, and cropping. The processed data, including face images, landmarks, and masks, are saved in separate folders for further analysis.
- The rearrangement script simplifies the handling of different datasets by providing a unified and convenient way to load them. It eliminates the need to write separate input/output (I/O) code for each dataset, reducing duplication of effort and easing data management.
To start preprocessing your dataset, please follow these steps:
1. Download the `shape_predictor_81_face_landmarks.dat` file, then copy it into the `./preprocessing/dlib_tools` folder. This file is necessary for Dlib's face detection functionality.

2. Open `./preprocessing/config.yaml` and locate the line `default: DATASET_YOU_SPECIFY`. Replace `DATASET_YOU_SPECIFY` with the name of the dataset you want to preprocess, such as `FaceForensics++`.

3. Specify the `dataset_root_path` in the same `config.yaml` file. Search for the line that mentions `dataset_root_path`; by default, it looks like this: `dataset_root_path: ./datasets`. Replace `./datasets` with the actual path to the folder where your dataset is arranged (see the config excerpt after this list).
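For reference, a minimal excerpt of the two `config.yaml` entries mentioned above, assuming FaceForensics++ stored under the default path (the actual file contains additional keys not shown here):

```yaml
# ./preprocessing/config.yaml (excerpt; other keys omitted)
default: FaceForensics++        # the dataset to preprocess
dataset_root_path: ./datasets   # root folder where datasets are arranged
```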
Once you have completed these steps, you can run the following to start preprocessing:

```bash
cd preprocessing
python preprocess.py
```
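For intuition, here is a minimal sketch of the kind of detect-and-crop step the preprocessing performs, using Dlib and the landmark model downloaded in step 1. This is an illustrative approximation, not the benchmark's actual `preprocess.py` logic, and the input frame path is hypothetical.

```python
import cv2
import dlib

# Illustrative only: the real pipeline also handles alignment and mask extraction.
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor(
    "./preprocessing/dlib_tools/shape_predictor_81_face_landmarks.dat"
)

img = cv2.imread("frame.png")  # hypothetical input frame
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
for face in detector(gray, 1):                  # detect faces (1 = upsample once)
    landmarks = predictor(gray, face)           # 81 facial landmarks
    points = [(p.x, p.y) for p in landmarks.parts()]
    print(f"Detected face with {len(points)} landmarks")
    crop = img[face.top():face.bottom(), face.left():face.right()]
    cv2.imwrite("face_crop.png", crop)          # save the cropped face
```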
Second, after the preprocessing above, you will obtain the processed data for each dataset you specified. Similarly, set the parameters in `config.yaml` for each dataset, then run:

```bash
cd preprocessing
python rearrange.py
```
After running the above, you will obtain JSON files for each dataset in the `./preprocessing/dataset_json` folder. The rearranged structure organizes the data hierarchically, grouping videos by their labels and data splits (i.e., train, test, validation). Each video is represented as a dictionary entry containing relevant metadata, including file paths, labels, compression levels (if applicable), etc.
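As a sketch of how the resulting JSON might be inspected — the file name and exact key hierarchy here are assumptions; check the generated files for the real schema:

```python
import json

# Hypothetical file name; inspect ./preprocessing/dataset_json for actual names.
with open("./preprocessing/dataset_json/FaceForensics++.json") as f:
    dataset = json.load(f)

def walk(node, depth=0, max_depth=3):
    """Recursively print the label/split/video hierarchy described above."""
    if isinstance(node, dict) and depth < max_depth:
        for key, value in node.items():
            print("  " * depth + str(key))
            walk(value, depth + 1, max_depth)

walk(dataset)
```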
To run the training code, you should first download the pretrained weights for the corresponding backbones from Link. After downloading, put all the weight files into the `./training/pretrained/` folder.
You should first go to the `./training/config/detector/` folder and choose the detector to be trained. For instance, you can adjust the parameters in `xception.yaml`, e.g., the training and testing datasets, epochs, frame_num, etc.
After setting the parameters, you can run the following to train the Xception detector:

```bash
cd training
python train.py \
  --detector_path ./config/detector/xception.yaml
```
You can also adjust the training and testing parameters using the command line, for example:

```bash
cd training
python train.py \
  --detector_path ./config/detector/xception.yaml \
  --train_dataset FaceForensics++ --testing_dataset Celeb-DF-v1
```
By default, the checkpoints and features will be saved during the training process. If you do not want to save them, run:

```bash
cd training
python train.py \
  --detector_path ./config/detector/xception.yaml \
  --train_dataset FaceForensics++ --testing_dataset Celeb-DF-v1 \
  --no-save_ckpt \
  --no-save_feat
```
To train other detectors with the code above, specify the corresponding config file. However, for the Face X-ray detector, an additional step is required before training: to save training time, a pickle file is generated that stores the Top-N nearest images for each given image. To generate this file, run the `generate_xray_nearest.py` script. Once the pickle file is created, you can train the Face X-ray detector in the same way as above.
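For intuition, a minimal sketch of what such a nearest-image index could look like — here using Euclidean distance between precomputed landmark vectors and `pickle` for storage. The distance measure, value of `TOP_N`, and output file name are assumptions; `generate_xray_nearest.py` may compute nearness differently.

```python
import pickle
import numpy as np

TOP_N = 100  # hypothetical value; check the actual script's setting

def build_nearest_index(landmarks: dict, top_n: int = TOP_N) -> dict:
    """landmarks: dict mapping image path -> flattened landmark array."""
    paths = list(landmarks)
    feats = np.stack([landmarks[p] for p in paths])          # (N, D)
    # Pairwise Euclidean distances between all landmark vectors.
    dists = np.linalg.norm(feats[:, None] - feats[None, :], axis=-1)
    np.fill_diagonal(dists, np.inf)                          # exclude self-matches
    nearest = {}
    for i, p in enumerate(paths):
        order = np.argsort(dists[i])[:top_n]                 # closest images first
        nearest[p] = [paths[j] for j in order]
    return nearest

# index = build_nearest_index(landmarks)
# with open("nearest_face_info.pkl", "wb") as f:  # hypothetical file name
#     pickle.dump(index, f)
```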
In our benchmark, we use TensorBoard to monitor the progress of training models. It provides a visual representation of the training process, allowing users to examine training results conveniently.
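As a generic illustration of the pattern (the benchmark's own logging code may differ, and the log directory here is hypothetical), PyTorch's `SummaryWriter` writes scalars that TensorBoard can then visualize:

```python
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="./logs/xception")  # hypothetical log directory

# Inside a training loop: log the loss per step so TensorBoard can plot it.
for step in range(100):
    loss = 1.0 / (step + 1)  # placeholder value for illustration
    writer.add_scalar("train/loss", loss, step)

writer.close()
# Then launch TensorBoard, e.g.: tensorboard --logdir ./logs
```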
To demonstrate the effectiveness of different detectors, we present partial results from both within-domain and cross-domain evaluations. The evaluation metric used is the Area Under the Curve (AUC). In this particular scenario, we train the detectors on the FF++ (c23) dataset and assess their performance on other datasets.
For a comprehensive overview of the results, we strongly recommend referring to our main paper and supplementary materials. These resources provide a detailed analysis of the training outcomes and offer a deeper understanding of the methodology and findings.
| Type | Detector | Backbone | FF++_c23 | FF++_c40 | FF-DF | FF-F2F | FF-FS | FF-NT | Avg. | Top3 | CDFv1 | CDFv2 | DF-1.0 | DFD | DFDC | DFDCP | Fsh | UADFV | Avg. | Top3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Naive | Meso4 | MesoNet | 0.6077 | 0.5920 | 0.6771 | 0.6170 | 0.5946 | 0.5701 | 0.6097 | 0 | 0.7358 | 0.6091 | 0.9113 | 0.5481 | 0.5560 | 0.5994 | 0.5660 | 0.7150 | 0.6551 | 1 |
| Naive | MesoIncep | MesoNet | 0.7583 | 0.7278 | 0.8542 | 0.8087 | 0.7421 | 0.6517 | 0.7571 | 0 | 0.7366 | 0.6966 | 0.9233 | 0.6069 | 0.6226 | 0.7561 | 0.6438 | 0.9049 | 0.7364 | 3 |
| Naive | CNN-Aug | ResNet | 0.8493 | 0.7846 | 0.9048 | 0.8788 | 0.9026 | 0.7313 | 0.8419 | 0 | 0.7420 | 0.7027 | 0.7993 | 0.6464 | 0.6361 | 0.6170 | 0.5985 | 0.8739 | 0.7020 | 0 |
| Naive | Xception | Xception | 0.9637 | 0.8261 | 0.9799 | 0.9785 | 0.9833 | 0.9385 | 0.9450 | 4 | 0.7794 | 0.7365 | 0.8341 | 0.8163 | 0.7077 | 0.7374 | 0.6249 | 0.9379 | 0.7718 | 2 |
| Naive | EfficientB4 | Efficient | 0.9567 | 0.8150 | 0.9757 | 0.9758 | 0.9797 | 0.9308 | 0.9389 | 0 | 0.7909 | 0.7487 | 0.8330 | 0.8148 | 0.6955 | 0.7283 | 0.6162 | 0.9472 | 0.7718 | 3 |
| Spatial | Capsule | Capsule | 0.8421 | 0.7040 | 0.8669 | 0.8634 | 0.8734 | 0.7804 | 0.8217 | 0 | 0.7909 | 0.7472 | 0.9107 | 0.6841 | 0.6465 | 0.6568 | 0.6465 | 0.9078 | 0.7488 | 2 |
| Spatial | FWA | Xception | 0.8765 | 0.7357 | 0.9210 | 0.9000 | 0.8843 | 0.8120 | 0.8549 | 0 | 0.7897 | 0.6680 | 0.9334 | 0.7403 | 0.6132 | 0.6375 | 0.5551 | 0.8539 | 0.7239 | 1 |
| Spatial | Face X-ray | HRNet | 0.9592 | 0.7925 | 0.9794 | 0.9872 | 0.9871 | 0.9290 | 0.9391 | 3 | 0.7093 | 0.6786 | 0.5531 | 0.7655 | 0.6326 | 0.6942 | 0.6553 | 0.8989 | 0.6985 | 0 |
| Spatial | FFD | Xception | 0.9624 | 0.8237 | 0.9803 | 0.9784 | 0.9853 | 0.9306 | 0.9434 | 1 | 0.7840 | 0.7435 | 0.8609 | 0.8024 | 0.7029 | 0.7426 | 0.6056 | 0.9450 | 0.7733 | 1 |
| Spatial | CORE | Xception | 0.9638 | 0.8194 | 0.9787 | 0.9803 | 0.9823 | 0.9339 | 0.9431 | 2 | 0.7798 | 0.7428 | 0.8475 | 0.8018 | 0.7049 | 0.7341 | 0.6032 | 0.9412 | 0.7694 | 0 |
| Spatial | Recce | Designed | 0.9621 | 0.8190 | 0.9797 | 0.9779 | 0.9785 | 0.9357 | 0.9422 | 1 | 0.7677 | 0.7319 | 0.7985 | 0.8119 | 0.7133 | 0.7419 | 0.6095 | 0.9446 | 0.7649 | 2 |
| Spatial | UCF | Xception | 0.9705 | 0.8399 | 0.9883 | 0.9840 | 0.9896 | 0.9441 | 0.9527 | 6 | 0.7793 | 0.7527 | 0.8241 | 0.8074 | 0.7191 | 0.7594 | 0.6462 | 0.9528 | 0.7801 | 5 |
| Frequency | F3Net | Xception | 0.9635 | 0.8271 | 0.9793 | 0.9796 | 0.9844 | 0.9354 | 0.9449 | 1 | 0.7769 | 0.7352 | 0.8431 | 0.7975 | 0.7021 | 0.7354 | 0.5914 | 0.9347 | 0.7645 | 0 |
| Frequency | SPSL | Xception | 0.9610 | 0.8174 | 0.9781 | 0.9754 | 0.9829 | 0.9299 | 0.9408 | 0 | 0.8150 | 0.7650 | 0.8767 | 0.8122 | 0.7040 | 0.7408 | 0.6437 | 0.9424 | 0.7875 | 3 |
| Frequency | SRM | Xception | 0.9576 | 0.8114 | 0.9733 | 0.9696 | 0.9740 | 0.9295 | 0.9359 | 0 | 0.7926 | 0.7552 | 0.8638 | 0.8120 | 0.6995 | 0.7408 | 0.6014 | 0.9427 | 0.7760 | 2 |
In the above table, "Avg." denotes the average AUC for within-domain and cross-domain evaluation, respectively, and "Top3" represents the number of times each method ranks within the top 3 across all testing datasets. The best-performing method for each column is highlighted.
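The AUC values above can be reproduced from saved predictions with standard tooling; a minimal sketch using scikit-learn (the array contents are placeholders):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Placeholder predictions: 1 = fake, 0 = real; scores are model outputs.
labels = np.array([0, 0, 1, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8, 0.9])

# Frame-level AUC; video-level AUC would average the scores per video first.
print(f"AUC: {roc_auc_score(labels, scores):.4f}")
```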
Also, we provide all experimental results at Link (code: qjpd). You can use these results for further analysis with the code in the `./analysis` folder and reproduce the results in our original paper.
If you find our benchmark useful to your research, please cite it as follows:
```bibtex
@article{yan2023deepfakebench,
  title={DeepfakeBench: A Comprehensive Benchmark of Deepfake Detection},
  author={Yan, Zhiyuan and Zhang, Yong and Yuan, Xinhang and Lyu, Siwei and Wu, Baoyuan},
  journal={arXiv preprint arXiv:2307.01426},
  year={2023}
}
```
If interested, you can read our recent work on deepfake detection; more of our work on trustworthy AI can be found here.
```bibtex
@article{yan2023ucf,
  title={UCF: Uncovering Common Features for Generalizable Deepfake Detection},
  author={Yan, Zhiyuan and Zhang, Yong and Fan, Yanbo and Wu, Baoyuan},
  journal={arXiv preprint arXiv:2304.13949},
  year={2023}
}
```
This repository is licensed by The Chinese University of Hong Kong, Shenzhen under the Creative Commons Attribution-NonCommercial 4.0 International Public License (identified as CC BY-NC-4.0 in SPDX). More details about the license can be found in LICENSE.
This project is built by the Secure Computing Lab of Big Data (SCLBD) at The School of Data Science (SDS) of The Chinese University of Hong Kong, Shenzhen, directed by Professor Baoyuan Wu. SCLBD focuses on the research of trustworthy AI, including backdoor learning, adversarial examples, federated learning, fairness, etc.
If you have any suggestions, comments, or wish to contribute code or propose methods, we warmly welcome your input. Please contact us at wubaoyuan@cuhk.edu.cn or yanzhiyuan1114@gmail.com. We look forward to collaborating with you in pushing the boundaries of deepfake detection.