One-Shot Learning for Periocular Recognition: Exploring the Effect of Domain Adaptation and Data Bias on Deep Representations

Kevin Hernandez-Diaz; Fernando Alonso-Fernandez; Josef Bigun

One weakness of machine-learning algorithms is that models must be trained anew for each task. This is a particular challenge for biometric recognition, because databases change dynamically and data collection sometimes depends on the cooperation of subjects. In this paper, we investigate the behavior of deep representations in widely used CNN models under extreme data scarcity for One-Shot periocular recognition, a biometric recognition task. We analyze the outputs of CNN layers as identity-representing feature vectors. We examine the impact of Domain Adaptation on the network layers' output for unseen data, and we evaluate the method's robustness with respect to data normalization and the generalization of the best-performing layer. Using out-of-the-box CNNs trained for the ImageNet Recognition Challenge together with standard computer vision algorithms, we improve on state-of-the-art results that were obtained with networks trained on biometric datasets with millions of images and fine-tuned for the target periocular dataset. For example, on the Cross-Eyed dataset we reduce the EER by 67% and 79% (from 1.70% and 3.41% to 0.56% and 0.71%) in the Close-World and Open-World protocols, respectively, for the periocular case. We also demonstrate that traditional algorithms like SIFT can outperform CNNs when data is limited or when the network has not been trained on the test classes, as in the Open-World mode. SIFT alone reduces the EER by 64% and 71.6% (from 1.7% and 3.41% to 0.6% and 0.97%) for Cross-Eyed in the Close-World and Open-World protocols, respectively, and by 4.6% (from 3.94% to 3.76%) on the PolyU database for the Open-World, single-biometric case.
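
The percentage reductions quoted in the abstract are relative reductions of the EER, not differences in percentage points. The short Python sketch below recomputes them from the EER pairs given above, as a sanity check on the arithmetic. The function name `relative_reduction` and the dictionary labels are illustrative choices, not taken from the paper.

```python
def relative_reduction(baseline_eer: float, new_eer: float) -> float:
    """Return the reduction of new_eer relative to baseline_eer, in percent."""
    if baseline_eer <= 0:
        raise ValueError("baseline_eer must be positive")
    return 100.0 * (baseline_eer - new_eer) / baseline_eer


# (baseline EER in %, new EER in %) pairs as quoted in the abstract.
# The labels are hypothetical names used only for this example.
reported = {
    "CNN, Cross-Eyed, Close-World": (1.70, 0.56),
    "CNN, Cross-Eyed, Open-World": (3.41, 0.71),
    "SIFT, Cross-Eyed, Close-World": (1.70, 0.60),
    "SIFT, Cross-Eyed, Open-World": (3.41, 0.97),
    "SIFT, PolyU, Open-World": (3.94, 3.76),
}

reductions = {
    name: relative_reduction(baseline, new)
    for name, (baseline, new) in reported.items()
}

for name, value in reductions.items():
    print(f"{name}: {value:.1f}% relative EER reduction")
```

The recomputed values agree with the quoted 67%, 79%, 64%, 71.6% and 4.6% to within a percentage point; the 64% figure appears to be truncated rather than rounded.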

updated: Tue Jul 11 2023 09:10:16 GMT+0000 (UTC)

published: Tue Jul 11 2023 09:10:16 GMT+0000 (UTC)

arXiv
