AFR-Net: Attention-Driven Fingerprint Recognition Network

Steven A. Grosz; Anil K. Jain

The use of vision transformers (ViT) in computer vision is increasing because they impose fewer inductive biases (e.g., locality and weight sharing) and scale better than other deep learning methods. This has led to some initial studies on the use of ViT for biometric recognition, including fingerprint recognition. In this work, we improve on these initial studies of transformers in fingerprint recognition by (i) evaluating additional attention-based architectures, (ii) scaling to larger and more diverse training and evaluation datasets, and (iii) combining the complementary representations of attention-based and CNN-based embeddings for improved state-of-the-art (SOTA) fingerprint recognition (both authentication and identification). Our combined architecture, AFR-Net (Attention-Driven Fingerprint Recognition Network), outperforms several baseline transformer and CNN-based models, including a SOTA commercial fingerprint system, Verifinger v12.3, across intra-sensor, cross-sensor, and latent-to-rolled fingerprint matching datasets. Additionally, we propose a realignment strategy that uses local embeddings extracted from intermediate feature maps within the networks to refine the global embeddings in low-certainty situations, which significantly boosts the overall recognition accuracy of each model. This realignment strategy requires no additional training and can be applied as a wrapper to any existing deep learning network (attention-based, CNN-based, or both) to boost its performance.
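The wrapper idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how certainty is measured or how local embeddings drive the realignment, so the similarity band (`low`, `high`), the function names `match_with_realignment`, `toy_embed`, and `toy_realign`, and the cyclic-shift "realignment" are all hypothetical stand-ins. The sketch only shows the control flow: compare global embeddings first, and run the extra realignment step only when the first score is inconclusive.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_with_realignment(
    probe: List[float],
    reference: List[float],
    embed: Callable[[List[float]], Vector],
    realign: Callable[[List[float], List[float]], List[float]],
    low: float = 0.3,   # assumed lower edge of the "uncertain" band
    high: float = 0.7,  # assumed upper edge of the "uncertain" band
) -> float:
    """Score a pair with global embeddings; refine only when the score is uncertain."""
    score = cosine_similarity(embed(probe), embed(reference))
    if low <= score <= high:
        # Assumption: realignment yields a corrected probe that is re-embedded,
        # and the refined score replaces the initial one.
        realigned = realign(probe, reference)
        score = cosine_similarity(embed(realigned), embed(reference))
    return score


# Toy stand-ins (hypothetical): an "image" is a list of floats, the embedding
# is the image itself, and realignment is the best cyclic shift.
def toy_embed(image: List[float]) -> Vector:
    return image


def toy_realign(probe: List[float], reference: List[float]) -> List[float]:
    """Cyclically shift the probe to the offset that best matches the reference."""
    shifts = [probe[k:] + probe[:k] for k in range(len(probe))]
    return max(shifts, key=lambda s: sum(x * y for x, y in zip(s, reference)))


reference = [3.0, 1.0, 0.0, 1.0]
probe = [1.0, 3.0, 1.0, 0.0]  # the reference shifted by one position
initial = cosine_similarity(toy_embed(probe), toy_embed(reference))
refined = match_with_realignment(probe, reference, toy_embed, toy_realign)
print(f"initial={initial:.3f} refined={refined:.3f}")
```

Because `embed` and `realign` are passed in as callables, the same wrapper could in principle sit around any embedding network without retraining it, which is the property the abstract highlights. Pairs that score clearly high or clearly low skip the realignment step entirely.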

updated: Sat Dec 03 2022 20:28:36 GMT+0000 (UTC)

published: Fri Nov 25 2022 05:10:39 GMT+0000 (UTC)

arXiv
