Vision Transformer Based Video Hashing Retrieval for Tracing the Source of Fake Videos

Pengfei Pei; Xianfeng Zhao; Yun Cao; Jinchuan Li; Xuyuan Lai

In recent years, the spread of fake videos has had a great impact on individuals and even on countries, so it is important to provide robust and reliable results when handling them. Conventional detection methods are neither reliable nor robust on unseen videos. A more effective alternative is to find the original video from which a fake video was made. For example, fake videos from the Russia-Ukraine war and the Hong Kong law revision storm were refuted by finding the original videos. We propose ViTHash, an improved retrieval method for finding the original video. Specifically, tracing the source of a fake video requires identifying the single correct original, which is difficult when the candidate original videos differ only slightly from one another. To solve this problem, we designed a novel loss, Hash Triplet Loss. In addition, we designed a tool called Localizator to compare the traced original video with the fake video and show the differences. We conducted extensive experiments on FaceForensics++, Celeb-DF and DeepFakeDetection, and additional experiments on three datasets that we built: DAVIS2016-TL (video inpainting), VSTL (video splicing) and DFTL (similar videos). The experiments show that our method outperforms state-of-the-art methods, especially in the cross-dataset setting. They also demonstrate that ViTHash is effective across several forgery types: video inpainting, video splicing and deepfakes. Our code and datasets have been released on GitHub: https://github.com/lajlksdf/vtl.
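
The abstract does not describe the retrieval step in detail. As a rough illustration of hashing-based source tracing in general, the sketch below assumes each video has already been mapped to a short binary hash code and that a fake video is traced by ranking the originals by Hamming distance. The function names, the 8-bit codes and the video labels are all hypothetical, and this is not the ViTHash implementation.

```python
from typing import Dict, List, Tuple


def hamming_distance(a: int, b: int) -> int:
    """Count the differing bits between two hash codes stored as ints."""
    return bin(a ^ b).count("1")


def trace_source(query_hash: int, originals: Dict[str, int]) -> List[Tuple[str, int]]:
    """Rank original videos by Hamming distance to the query hash, closest first."""
    ranked = [
        (name, hamming_distance(query_hash, code))
        for name, code in originals.items()
    ]
    return sorted(ranked, key=lambda pair: pair[1])


# Hypothetical 8-bit hash codes for three original videos.
# original_a and original_b are deliberately similar (they differ in one bit).
ORIGINALS = {
    "original_a": 0b10110010,
    "original_b": 0b10110110,
    "original_c": 0b01001101,
}

# Hypothetical hash of a fake video derived from original_a (one bit flipped).
FAKE_HASH = 0b10110011

RANKING = trace_source(FAKE_HASH, ORIGINALS)
for name, distance in RANKING:
    print(f"{name}: Hamming distance {distance}")
```

In this toy data, `original_a` and `original_b` end up only one bit apart in the ranking, which mirrors the difficulty the abstract mentions: when original videos are very similar, the hash codes must still separate them for the correct source to rank first.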

updated: Tue Sep 06 2022 04:17:44 GMT+0000 (UTC)

published: Wed Dec 15 2021 13:35:55 GMT+0000 (UTC)

arXiv
