Taming Self-Supervised Learning for Presentation Attack Detection: In-Image De-Folding and Out-of-Image De-Mixing

Haozhe Liu; Zhe Kong; Raghavendra Ramachandra; Feng Liu; Linlin Shen; Christoph Busch

Biometric systems are vulnerable to Presentation Attacks (PAs) performed using various Presentation Attack Instruments (PAIs). Even though there are numerous Presentation Attack Detection (PAD) techniques based on both deep learning and hand-crafted features, generalizing PAD to unknown PAIs remains a challenging problem. A common problem with existing deep learning-based PAD techniques is that they may get stuck in local optima, resulting in weak generalization against different PAs. In this work, we propose to use self-supervised learning to find a reasonable initialization that avoids such local traps, so as to improve the generalization ability in detecting PAs on biometric systems. The proposed method, denoted as IF-OM, is based on a global-local view coupled with De-Folding and De-Mixing to derive a task-specific representation for PAD. During De-Folding, the proposed technique learns region-specific features that represent samples in a local pattern by explicitly maximizing cycle consistency. De-Mixing, in turn, drives detectors to obtain instance-specific features with global information for a more comprehensive representation by maximizing topological consistency. Extensive experimental results show that the proposed method achieves significant improvements in both face and fingerprint PAD on more complicated and hybrid datasets when compared with state-of-the-art methods. Specifically, when trained on CASIA-FASD and Idiap Replay-Attack, the proposed method achieves an 18.60% Equal Error Rate (EER) on OULU-NPU and MSU-MFSD, exceeding baseline performance by 9.54%. Code will be made publicly available.
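For readers unfamiliar with the metric quoted above, the EER is the operating point at which the rate of attacks accepted as bona fide equals the rate of bona fide samples rejected. The following is a minimal sketch of how it can be approximated from detector scores; it is not taken from the authors' code. The function name `equal_error_rate`, the score convention (higher means more likely bona fide), and the label encoding (1 = bona fide, 0 = attack) are all assumptions made for illustration.

```python
import numpy as np


def equal_error_rate(scores, labels):
    """Approximate the EER by sweeping thresholds over the observed scores.

    Assumed conventions (not from the paper):
      scores: higher value = more likely bona fide
      labels: 1 = bona fide, 0 = presentation attack
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    bona_fide = scores[labels == 1]
    attacks = scores[labels == 0]

    best_gap, eer = np.inf, 1.0
    for t in np.unique(scores):
        # Attacks wrongly accepted: attack score at or above the threshold.
        attack_accept_rate = np.mean(attacks >= t)
        # Bona fide wrongly rejected: bona fide score below the threshold.
        bona_fide_reject_rate = np.mean(bona_fide < t)
        gap = abs(attack_accept_rate - bona_fide_reject_rate)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap = gap
            eer = (attack_accept_rate + bona_fide_reject_rate) / 2
    return float(eer)


if __name__ == "__main__":
    # Toy data: one attack (0.6) scores above one bona fide sample (0.4).
    print(equal_error_rate([0.1, 0.6, 0.4, 0.9], [0, 0, 1, 1]))
```

With a finite set of scores the two error rates rarely cross exactly, so this sketch reports their average at the closest threshold; evaluation toolkits may interpolate instead and give slightly different values.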

updated: Thu Sep 09 2021 08:38:17 GMT+0000 (UTC)

published: Thu Sep 09 2021 08:38:17 GMT+0000 (UTC)

arXiv
