Enhancing Mobile Face Anti-Spoofing: A Robust Framework for Diverse Attack Types under Screen Flash

Weihua Liu; Chaochao Lin; Yu Yan

Face anti-spoofing (FAS) is crucial for securing face recognition systems. However, existing FAS methods that rely on handcrafted binary or pixel-wise labels are limited when faced with diverse presentation attacks (PAs). In this paper, we propose ATR-FAS, an attack-type-robust face anti-spoofing framework that operates under light flash. Because different attack types produce different imaging characteristics, traditional FAS methods based on a single binary classification network may yield an excessive intra-class distance among spoof faces, which makes the decision boundary hard to learn. We therefore employ multiple networks to reconstruct multi-frame depth maps as auxiliary supervision, with each network specializing in one type of attack. We introduce a dual gate module (DGM) consisting of a type gate and a frame-attention gate, which perform attack type recognition and multi-frame attention generation, respectively. The outputs of the DGM are used as weights to mix the results of the multiple expert networks. This mixture of experts enables ATR-FAS to generate spoof-differentiated depth maps and to detect spoof faces stably, regardless of the type of PA. Moreover, we design a differential normalization procedure that converts the original flash frames into differential frames. This simple but effective processing enhances the details in the flash frames and aids the generation of depth maps. To verify the effectiveness of our framework, we collected a large-scale dataset containing 12,660 live and spoof videos with diverse PAs under dynamic flash from a smartphone screen. Extensive experiments show that the proposed ATR-FAS significantly outperforms existing state-of-the-art methods. The code and dataset will be available at https://github.com/Chaochao-Lin/ATR-FAS.
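
The abstract gives no formulas, so the following is only a rough sketch of how the two ideas it describes might fit together, not the authors' implementation. It assumes that differential normalization means subtracting consecutive flash frames and min-max scaling each difference, and that the type gate and frame-attention gate each produce logits that are turned into softmax weights. The function names (`differential_normalize`, `mix_expert_depths`), the array shapes, and the use of softmax are all hypothetical choices made for illustration.

```python
import numpy as np


def differential_normalize(frames, eps=1e-6):
    """Assumed form of differential normalization (not from the paper).

    frames: array of shape (T, H, W), one grayscale flash frame per step.
    Returns T-1 differential frames, each min-max scaled to [0, 1].
    """
    frames = np.asarray(frames, dtype=np.float64)
    diffs = frames[1:] - frames[:-1]                 # (T-1, H, W)
    lo = diffs.min(axis=(1, 2), keepdims=True)
    hi = diffs.max(axis=(1, 2), keepdims=True)
    return (diffs - lo) / (hi - lo + eps)


def softmax(x):
    """Numerically stable softmax over a 1-D vector."""
    x = np.asarray(x, dtype=np.float64)
    e = np.exp(x - x.max())
    return e / e.sum()


def mix_expert_depths(expert_depths, type_logits, frame_logits):
    """Hypothetical dual-gate mixing of per-expert depth maps.

    expert_depths: (E, T, H, W), depth maps from E expert networks.
    type_logits:   (E,), assumed output of the type gate.
    frame_logits:  (T,), assumed output of the frame-attention gate.
    Returns a single (H, W) depth map.
    """
    expert_depths = np.asarray(expert_depths, dtype=np.float64)
    w_type = softmax(type_logits)                    # weights over experts
    w_frame = softmax(frame_logits)                  # weights over frames
    per_frame = np.tensordot(w_type, expert_depths, axes=(0, 0))  # (T, H, W)
    return np.tensordot(w_frame, per_frame, axes=(0, 0))          # (H, W)
```

In this sketch, a type gate that is confident about one attack type pushes the mixed depth map toward that expert's output, which is one way to read the abstract's claim that mixing experts keeps detection stable across PA types.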

updated: Tue Aug 29 2023 14:41:40 GMT+0000 (UTC)

published: Tue Aug 29 2023 14:41:40 GMT+0000 (UTC)

arXiv
