Enhancing General Face Forgery Detection via Vision Transformer with Low-Rank Adaptation

Chenqi Kong; Haoliang Li; Shiqi Wang

Forged faces now pose pressing security concerns, including fake news, fraud, and impersonation. Despite demonstrated success in intra-domain face forgery detection, existing detection methods lack generalization capability and tend to suffer dramatic performance drops when deployed to unforeseen domains. To mitigate this issue, this paper designs a more general fake face detection model based on the vision transformer (ViT) architecture. In the training phase, the pretrained ViT weights are frozen, and only the Low-Rank Adaptation (LoRA) modules are updated. Additionally, the Single Center Loss (SCL) is applied to supervise the training process, further improving the generalization capability of the model. The proposed method achieves state-of-the-art detection performance in both cross-manipulation and cross-dataset evaluations.
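The abstract gives no implementation details, so the following is only a minimal NumPy sketch of the two ideas it names: a frozen weight matrix with a trainable low-rank update (LoRA), and a single-center loss. The layer sizes, `rank`, `alpha`, `margin`, and the function names `lora_linear` and `single_center_loss` are illustrative assumptions, not values or code from the paper. The loss follows the commonly cited single-center formulation (mean distance of real faces to the center, plus a hinge that keeps fake faces farther away) and may differ from what the authors actually use.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only; none of these come from the paper.
d_in, d_out, rank, alpha = 16, 16, 4, 8.0

# Frozen pretrained weight, standing in for one linear projection of a ViT.
W_frozen = rng.normal(size=(d_out, d_in))

# Trainable low-rank factors. B starts at zero, so at initialization the
# adapted layer reproduces the frozen layer exactly.
A = rng.normal(scale=0.01, size=(rank, d_in))
B = np.zeros((d_out, rank))


def lora_linear(x, W, A, B, alpha, rank):
    """Compute x W^T + (alpha / rank) * x A^T B^T.

    In training, only A and B would receive gradient updates; W stays fixed.
    """
    return x @ W.T + (alpha / rank) * (x @ A.T) @ B.T


def single_center_loss(features, labels, center, margin=0.3):
    """Assumed SCL form: pull real faces (label 0) toward the center and keep
    fake faces (label 1) at least margin * sqrt(D) farther away on average."""
    dist = np.linalg.norm(features - center, axis=1)
    m_real = dist[labels == 0].mean()
    m_fake = dist[labels == 1].mean()
    hinge = max(m_real - m_fake + margin * np.sqrt(features.shape[1]), 0.0)
    return m_real + hinge


x = rng.normal(size=(8, d_in))
y = lora_linear(x, W_frozen, A, B, alpha, rank)

# Number of parameters that would be updated versus kept frozen in this layer.
trainable = A.size + B.size
frozen = W_frozen.size
```

In this toy layer the low-rank factors hold 128 values against 256 frozen ones. The ratio becomes far smaller at real ViT dimensions, which is the usual motivation for updating only the LoRA modules.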

updated: Mon Mar 27 2023 07:42:24 GMT+0000 (UTC)

published: Thu Mar 02 2023 02:26:04 GMT+0000 (UTC)

arXiv
