🎉

[INTERSPEECH’26] One paper has been accepted!

One paper has been accepted to INTERSPEECH 2026

Physics-Aware Deepfake Detection via Distance–Speech Consistency

Authors: Kyeongrae Kim (KAIST), Kim Sung-Bin (POSTECH), Oh Hyun-Bin (POSTECH), Tae-Hyun Oh (KAIST)

Recent progress in generative models has made talking-head deepfakes highly realistic and easy to create, jeopardizing public trust. However, most audio-visual deepfake detectors rely on lip–speech synchronization and are designed for static, frontal videos, limiting their reliability in dynamic, in-the-wild recordings where the speaker moves and lip cues are degraded or missing. As an alternative, we propose a physics-aware detector for dynamic speaking videos that leverages an acoustic constraint: in real recordings, speech energy varies predictably with speaker-to-camera distance, whereas deepfake generation can break this coupling. Our method combines distance estimates from video with distance-dependent acoustic measures from speech to detect deepfakes. Experiments on in-the-wild benchmarks show that the proposed physical cue is effective in dynamic scenarios, and that it complements lip-sync-based detectors, with a simple ensemble achieving consistent gains across datasets.
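The acoustic constraint in the abstract can be illustrated with a small sketch. This is not the paper's implementation. It assumes an idealized free-field setting in which sound pressure falls off as 1/distance (about 6 dB per doubling of distance), and it assumes that per-frame distance estimates and RMS speech energies are already available. The function names `expected_level_drop_db` and `consistency_score` are hypothetical and chosen only for this example.

```python
import math


def expected_level_drop_db(ref_distance_m, distance_m):
    """Level drop (dB) predicted by the free-field inverse-distance law.

    Assumption: idealized free field, no reverberation or microphone gain control.
    """
    return 20.0 * math.log10(distance_m / ref_distance_m)


def consistency_score(distances_m, rms_energies, eps=1e-12):
    """Pearson correlation between predicted and observed speech level.

    Hypothetical scoring rule for illustration only: values near 1 mean the
    observed level follows the distance-predicted level; values near 0 mean
    the level ignores the speaker-to-camera distance.
    """
    if len(distances_m) != len(rms_energies) or len(distances_m) < 2:
        raise ValueError("need two or more paired measurements")

    # Predicted level (up to a constant offset) and observed level, both in dB.
    predicted = [-20.0 * math.log10(d) for d in distances_m]
    observed = [20.0 * math.log10(e) for e in rms_energies]

    mean_p = sum(predicted) / len(predicted)
    mean_o = sum(observed) / len(observed)
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)

    # A flat signal carries no distance information: report no coupling.
    if var_p < eps or var_o < eps:
        return 0.0
    return cov / math.sqrt(var_p * var_o)
```

For example, a speaker who moves from 1 m to 4 m and back, with RMS energy scaling as 1/distance, yields a score close to 1, while a clip whose speech level stays constant regardless of distance yields 0. A practical detector would need to account for reverberation, automatic gain control, and noisy distance estimates, none of which this sketch models.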