A Multimodal Sensor Fusion Framework Robust to Missing Modalities for Person Recognition

Vijay John; Yasutomo Kawanishi

Person recognition can be made more robust by exploiting the sensor characteristics of audio, visible cameras, and thermal cameras. Existing multimodal person recognition frameworks are mostly formulated under the assumption that data from every modality is always available. In this paper, we propose a novel trimodal sensor fusion framework using audio, visible, and thermal camera data that addresses the missing modality problem. Within the framework, a novel deep latent embedding framework, termed AVTNet, is proposed to learn multiple latent embeddings. A novel loss function, termed the missing modality loss, accounts for possibly missing modalities through a triplet loss calculation while the individual latent embeddings are learnt. In addition, a joint latent embedding of the trimodal data is learnt using a multi-head attention transformer, which assigns attention weights to the different modalities. The latent embeddings are subsequently used to train a deep neural network. The proposed framework is validated on the Speaking Faces dataset. A comparative analysis with baseline algorithms shows that the proposed framework significantly increases person recognition accuracy while accounting for missing modalities.
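The abstract does not give the formula for the missing modality loss, so the following is only a rough sketch of one way a triplet loss could be made to tolerate missing modalities. It is not the authors' formulation. The function names, the margin value, the use of `None` to mark a missing modality, and the choice to average over the modalities present in all three samples are all assumptions made for illustration.

```python
import math
from typing import Dict, Optional, Sequence

# Hypothetical representation: one embedding per modality, None if missing.
Embedding = Optional[Sequence[float]]
MODALITIES = ("audio", "visible", "thermal")


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin: float = 0.2) -> float:
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def masked_triplet_loss(
    anchor: Dict[str, Embedding],
    positive: Dict[str, Embedding],
    negative: Dict[str, Embedding],
    margin: float = 0.2,
) -> float:
    """Average the triplet loss over modalities present in all three samples.

    Assumption for this sketch: a modality missing from any sample of the
    triplet is skipped rather than imputed.
    """
    losses = []
    for name in MODALITIES:
        a, p, n = anchor.get(name), positive.get(name), negative.get(name)
        if a is None or p is None or n is None:
            continue  # modality unavailable, contributes nothing
        losses.append(triplet_loss(a, p, n, margin))
    return sum(losses) / len(losses) if losses else 0.0


# Usage: the thermal embedding is missing from the anchor sample.
anchor = {"audio": [0.0, 0.0], "visible": [0.0, 0.0], "thermal": None}
positive = {"audio": [0.0, 0.0], "visible": [3.0, 4.0], "thermal": [1.0, 1.0]}
negative = {"audio": [3.0, 4.0], "visible": [0.0, 0.0], "thermal": [2.0, 2.0]}
loss = masked_triplet_loss(anchor, positive, negative)
```

In this example the audio triplet is already well separated and contributes zero, the visible triplet is violated and contributes a positive term, and the thermal triplet is skipped because the anchor has no thermal embedding.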

updated: Sat Oct 22 2022 04:51:51 GMT+0000 (UTC)

published: Thu Oct 20 2022 02:39:48 GMT+0000 (UTC)

arXiv
