UTokyo’s New Self-Blended Images Approach Achieves SOTA Results in Deepfake Detection

AI-powered facial image generation and manipulation have flooded the internet in recent years, a product of the ever-increasing power of generative adversarial networks (GANs). While apps for face aging, sentiment editing, and style transfer may offer users a bit of harmless fun, advanced image generation techniques have also been used maliciously to create deepfake news, face-swap porn, and more. Today’s state-of-the-art image generation models are so realistic that they can fool most humans, and some also fool state-of-the-art deepfake detection models, especially when images are produced through unknown manipulation techniques.

A University of Tokyo research team addresses the pressing challenge of detecting deepfakes in the new paper Detecting Deepfakes with Self-Blended Images, proposing self-blended images (SBIs), a new approach to synthetic training data that outperforms state-of-the-art methods on unseen manipulations and scenes in deepfake detection tasks.

The researchers’ goal is to detect statistical inconsistencies between transferred faces and background image information in deepfakes. Their SBI approach is based on the premise that more general and hardly recognizable fake samples will encourage classifiers to learn more generic and robust representations. SBI blends pseudo-source and target images derived from a single image to generate synthetic fake samples that contain common, hard-to-detect forgery traces. These samples can then be used to train detectors that exhibit better robustness and generalization performance.

The SBI pipeline consists of three main steps: 1) a source-target generator produces pseudo-source and target images from a single image for blending, 2) a mask generator produces a deformed grayscale mask image, and 3) the source and target images are blended with the mask to obtain an SBI.
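
The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, a random brightness shift stands in for whatever source-target transformations the paper uses, and a soft-edged ellipse stands in for the deformed face-region mask. The third step is written here as the standard per-pixel weighted sum of source and target, with the mask as the weight.

```python
import random
from typing import List, Tuple

Image = List[List[float]]  # grayscale image, pixel values in [0, 1]


def make_source_target(image: Image, rng: random.Random,
                       max_shift: float = 0.1) -> Tuple[Image, Image]:
    """Step 1: derive a pseudo-source and a target from ONE image.

    Assumption: a small random brightness shift is the only transformation.
    """
    shift = rng.uniform(-max_shift, max_shift)
    altered = [[min(1.0, max(0.0, p + shift)) for p in row] for row in image]
    original = [row[:] for row in image]
    # Randomly decide which copy plays the source role.
    if rng.random() < 0.5:
        return altered, original
    return original, altered


def make_mask(height: int, width: int, rng: random.Random) -> Image:
    """Step 2: grayscale mask in [0, 1].

    Assumption: a centered ellipse with random radii and a soft edge
    stands in for a deformed face-region mask.
    """
    cy, cx = (height - 1) / 2, (width - 1) / 2
    ry = rng.uniform(0.25, 0.45) * height
    rx = rng.uniform(0.25, 0.45) * width
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            d = ((y - cy) / ry) ** 2 + ((x - cx) / rx) ** 2
            # 1 near the center, fading linearly to 0 at the ellipse edge.
            row.append(min(1.0, max(0.0, 2.0 - 2.0 * d)))
        mask.append(row)
    return mask


def blend(source: Image, target: Image, mask: Image) -> Image:
    """Step 3: per-pixel blend, source * m + target * (1 - m)."""
    return [
        [s * m + t * (1.0 - m) for s, t, m in zip(s_row, t_row, m_row)]
        for s_row, t_row, m_row in zip(source, target, mask)
    ]


def self_blend(image: Image, seed: int = 0) -> Image:
    """Run all three steps on a single image to get one synthetic fake sample."""
    rng = random.Random(seed)
    source, target = make_source_target(image, rng)
    mask = make_mask(len(image), len(image[0]), rng)
    return blend(source, target, mask)


# Usage: a flat gray 8x8 "image" becomes a self-blended training sample.
sample = self_blend([[0.5] * 8 for _ in range(8)], seed=0)
```

Because source and target come from the same image, the only difference between the output and the input is the blending trace itself, which is the kind of subtle artifact the article says the detector is meant to learn.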

In their empirical study, the team compared their SBI approach with the state-of-the-art frame-level detection methods DSP-FWA, Face X-ray, Local Relation Learning (LRL), Fusion + RSA + DCMA + Multi-scale (FRDM), and Pairwise Self-Consistency Learning (PCL) on the FF++, CDF, DFD, DFDC, DFDCP, and FFIW datasets. They also evaluated their model against video-level baselines such as Discriminative Attention Models (DAM) and the Fully Temporal Convolution Network (FTCN).

In the experiments, the SBI approach outperformed the baselines by 4.90% and 11.78% in cross-dataset evaluation on the DFDC and DFDCP datasets, respectively. Overall, the study shows that the SBI synthetic training data scheme outperforms state-of-the-art methods on unseen deepfake manipulations and scenes, and that it generalizes across network architectures and training datasets without a significant drop in performance.

The code is available on the project’s GitHub. The paper Detecting Deepfakes with Self-Blended Images is on arXiv.

Author: Hecate He | Editor: Michel Sarazen
