Detecting and Recovering Sequential DeepFake Manipulation

Rui Shao; Tianxing Wu; Ziwei Liu

Photorealistic faces can now be readily generated by facial manipulation technologies, and the potential for malicious abuse of these technologies has raised serious concern. Numerous deepfake detection methods have therefore been proposed. However, existing methods focus only on detecting one-step facial manipulation. With the emergence of easily accessible facial editing applications, people can easily manipulate facial components using multi-step operations in a sequential manner. This new threat requires us to detect a sequence of facial manipulations, which is vital both for detecting deepfake media and for recovering the original faces afterwards. Motivated by this observation, we emphasize this need and propose a novel research problem called Detecting Sequential DeepFake Manipulation (Seq-DeepFake). Unlike the existing deepfake detection task, which demands only a binary label prediction, detecting Seq-DeepFake manipulation requires correctly predicting a sequential vector of facial manipulation operations. To support a large-scale investigation, we construct the first Seq-DeepFake dataset, in which face images are manipulated sequentially and annotated with the corresponding sequential facial manipulation vectors. Based on this new dataset, we cast detecting Seq-DeepFake manipulation as a specific image-to-sequence task (e.g. image captioning) and propose a concise yet effective Seq-DeepFake Transformer (SeqFakeFormer). Moreover, we build a comprehensive benchmark and set up rigorous evaluation protocols and metrics for this new research problem. Extensive experiments demonstrate the effectiveness of SeqFakeFormer. Several valuable observations are also revealed to facilitate future research on broader deepfake detection problems.
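To make the contrast between binary detection and sequence prediction concrete, the following is a minimal Python sketch. It is not the paper's evaluation protocol: the operation names, the padding token `NO_MANIPULATION`, the maximum sequence length of 5, and both scoring functions are assumptions chosen purely for illustration.

```python
from typing import List, Sequence

# Hypothetical padding token marking "no further manipulation".
NO_MANIPULATION = "none"


def pad(seq: Sequence[str], length: int) -> List[str]:
    """Pad a manipulation sequence to a fixed length (assumed convention)."""
    if len(seq) > length:
        raise ValueError("sequence is longer than the fixed length")
    return list(seq) + [NO_MANIPULATION] * (length - len(seq))


def binary_label(seq: Sequence[str]) -> int:
    """Conventional deepfake label: 1 if any manipulation was applied, else 0."""
    return int(len(seq) > 0)


def positionwise_accuracy(pred: Sequence[str], target: Sequence[str],
                          length: int = 5) -> float:
    """Fraction of positions where the padded sequences agree (illustrative metric)."""
    p, t = pad(pred, length), pad(target, length)
    return sum(a == b for a, b in zip(p, t)) / length


def exact_match(pred: Sequence[str], target: Sequence[str]) -> bool:
    """True only if the operations and their order are both correct."""
    return list(pred) == list(target)


# Hypothetical example: the same three operations, predicted in the wrong order.
target = ["eyebrow", "lip", "hair"]
pred = ["eyebrow", "hair", "lip"]

print(binary_label(pred) == binary_label(target))  # binary labels agree
print(positionwise_accuracy(pred, target))         # order errors are penalized
print(exact_match(pred, target))                   # sequence is not recovered
```

In this sketch a binary detector would count the prediction as fully correct, while the sequence-level scores expose the ordering error. The order is what matters if the manipulations are to be undone step by step to recover the original face.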

updated: Tue Jul 05 2022 17:59:33 GMT+0000 (UTC)

published: Tue Jul 05 2022 17:59:33 GMT+0000 (UTC)

arXiv
