Undercover Deepfakes: Detecting Fake Segments in Videos

Sanjay Saha; Rashindrie Perera; Sachith Seneviratne; Tamasha Malepathirana; Sanka Rasnayaka; Deshani Geethika; Terence Sim; Saman Halgamuge

The recent renaissance in generative models, driven primarily by the advent of diffusion models and iterative improvements in GAN methods, has enabled many creative applications. However, each advancement is also accompanied by a rise in the potential for misuse. In the arena of deepfake generation, this is a key societal issue. In particular, the ability to modify segments of videos using such generative techniques creates a new paradigm of deepfakes: mostly real videos that are altered slightly to distort the truth. This paradigm has been under-explored by current deepfake detection methods in the academic literature. In this paper, we present a deepfake detection method that addresses this issue by performing deepfake prediction at both the frame and video levels. To facilitate testing our method, we prepared a new benchmark dataset in which videos contain both real and fake frame sequences with very subtle transitions. We provide a benchmark on the proposed dataset with our detection method, which uses a Vision Transformer based on Scaling and Shifting to learn spatial features and a Timeseries Transformer to learn temporal features of the videos, helping to facilitate the interpretation of possible deepfakes. Extensive experiments on a variety of deepfake generation methods show excellent results for the proposed method on both temporal segmentation and classical video-level prediction. In particular, the paradigm we address will form a powerful tool for the moderation of deepfakes, where human oversight can be better targeted to the parts of videos suspected of being deepfakes. All experiments can be reproduced at: https://t.ly/\_bOh9.
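
To make the relationship between frame-level and video-level prediction concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a detector has already produced one fake-probability per frame, and it uses a hypothetical fixed threshold of 0.5 and a hypothetical "any fake segment means a fake video" rule; the abstract does not specify how the paper aggregates frame-level outputs. The function names `fake_segments` and `video_is_fake` are invented for this example.

```python
from typing import List, Tuple


def fake_segments(frame_probs: List[float], threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Group consecutive frames flagged as fake into inclusive (start, end) index ranges.

    frame_probs: per-frame fake probabilities (assumed to come from some detector).
    threshold:   assumed decision threshold; not taken from the paper.
    """
    segments: List[Tuple[int, int]] = []
    start = None  # index where the current fake run began, if any
    for i, p in enumerate(frame_probs):
        if p >= threshold and start is None:
            start = i  # a fake run begins
        elif p < threshold and start is not None:
            segments.append((start, i - 1))  # the fake run ended on the previous frame
            start = None
    if start is not None:
        # The video ends while still inside a fake run.
        segments.append((start, len(frame_probs) - 1))
    return segments


def video_is_fake(frame_probs: List[float], threshold: float = 0.5) -> bool:
    """Assumed video-level rule: the video is fake if it contains any fake segment."""
    return len(fake_segments(frame_probs, threshold)) > 0


# Made-up probabilities for an eight-frame clip, for illustration only.
probs = [0.1, 0.2, 0.9, 0.8, 0.7, 0.1, 0.6, 0.05]
segments = fake_segments(probs)
print(segments)
print(video_is_fake(probs))
```

Under these assumptions, the segment list is what a human moderator would review, while the video-level label supports the classical whole-video classification setting.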

updated: Fri Aug 11 2023 04:17:56 GMT+0000 (UTC)

published: Thu May 11 2023 04:43:10 GMT+0000 (UTC)

arXiv
