Multi-Attention-Based Soft Partition Network for Vehicle Re-Identification

Sangrok Lee; Taekang Woo; Sang Hun Lee

Vehicle re-identification distinguishes images of the same vehicle from images of other vehicles. It is challenging because of significant intra-instance differences between views of the same vehicle and subtle inter-instance differences between similar vehicles. To address this, researchers have extracted view-aware or part-specific features via spatial attention mechanisms, which usually produce noisy attention maps or else require expensive additional annotation of metadata, such as key points, to improve their quality. Meanwhile, various handcrafted multi-attention architectures for specific viewpoints or vehicle parts have been proposed based on researchers' insights. However, this approach does not guarantee that the number and nature of the attention branches are optimal for real-world re-identification tasks. To address these problems, we propose a new vehicle re-identification network based on a multiple soft attention mechanism that captures various discriminative regions from different viewpoints more efficiently. Furthermore, the model can significantly reduce the noise in spatial attention maps through a new method that creates an attention map for insignificant regions and then excludes it when generating the final result. We also combine a channel-wise attention mechanism with a spatial attention mechanism to efficiently select important semantic attributes for vehicle re-identification. Our experiments show that the proposed model achieves state-of-the-art performance among attention-based methods that do not use metadata and is comparable to approaches that use metadata on the VehicleID and VERI-Wild datasets.
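The abstract does not give the network's equations, so the following is only a minimal sketch of one plausible reading of the idea, not the authors' implementation. It assumes that K + 1 spatial attention maps are obtained by a softmax across branches at each spatial position (so the maps softly partition the feature map), that the last branch stands for the insignificant regions and is discarded, and that channel-wise attention is a simple sigmoid gate on each pooled part feature. The function name `soft_partition`, the weight layouts, and the gating form are all hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def soft_partition(features, spatial_w, channel_w):
    """Illustrative multi-soft-attention pooling (hypothetical design).

    features : C lists, each holding N flattened spatial activations.
    spatial_w: K+1 lists of C weights (a 1x1-conv-like projection per branch);
               the LAST branch is assumed to model insignificant regions.
    channel_w: K lists of C weights for the per-branch channel gates.
    Returns (all K+1 attention maps, K gated part feature vectors).
    """
    C, N = len(features), len(features[0])

    # One logit per branch and spatial position.
    logits = [
        [sum(w[c] * features[c][n] for c in range(C)) for n in range(N)]
        for w in spatial_w
    ]

    # Softmax across branches at each position: the maps sum to 1 there.
    maps = [[0.0] * N for _ in spatial_w]
    for n in range(N):
        for k, p in enumerate(softmax([branch[n] for branch in logits])):
            maps[k][n] = p

    # Exclude the insignificant-region map from the final result.
    part_maps = maps[:-1]

    parts = []
    for k, amap in enumerate(part_maps):
        mass = sum(amap) + 1e-8
        # Attention-weighted average pooling over spatial positions.
        pooled = [
            sum(amap[n] * features[c][n] for n in range(N)) / mass
            for c in range(C)
        ]
        # Channel-wise attention: a sigmoid gate per channel (assumed form).
        gate = [sigmoid(channel_w[k][c] * pooled[c]) for c in range(C)]
        parts.append([g * v for g, v in zip(gate, pooled)])
    return maps, parts


if __name__ == "__main__":
    random.seed(0)
    C, N, K = 4, 6, 3  # channels, spatial positions, kept attention branches
    feats = [[random.random() for _ in range(N)] for _ in range(C)]
    sw = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(K + 1)]
    cw = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(K)]
    attention_maps, part_features = soft_partition(feats, sw, cw)
    print(len(attention_maps), len(part_features))
```

Under these assumptions, the softmax across branches means any attention mass assigned to the discarded branch is removed from the kept part maps, which is one way such a design could suppress noise in the remaining maps.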

updated: Wed Aug 02 2023 07:58:00 GMT+0000 (UTC)

published: Wed Apr 21 2021 08:13:17 GMT+0000 (UTC)

arXiv
