Cross-layer Navigation Convolutional Neural Network for Fine-grained Visual Classification

Chenyu Guo; Jiyang Xie; Kongming Liang; Xian Sun; Zhanyu Ma

Fine-grained visual classification (FGVC) aims to classify sub-classes of objects within the same super-class (e.g., species of birds, models of cars). For FGVC tasks, the essential solution is to find discriminative, subtle information about the target in local regions. Traditional FGVC models prefer to use refined features, i.e., high-level semantic information, for recognition and rarely use low-level information. However, low-level information, which contains rich detail, also helps improve performance. Therefore, in this paper, we propose a cross-layer navigation convolutional neural network for feature fusion. First, the feature maps extracted by the backbone network are fed into a convolutional long short-term memory model sequentially, from high-level to low-level, to perform feature aggregation. Then, attention mechanisms are applied after feature fusion to extract spatial and channel information while linking the high-level semantic information and the low-level texture features, which helps locate the discriminative regions for FGVC more accurately. In the experiments, three commonly used FGVC datasets, CUB-200-2011, Stanford-Cars, and FGVC-Aircraft, are used for evaluation, and comparisons with other referenced FGVC methods demonstrate the superiority of the proposed method.
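The two steps described in the abstract (high-to-low aggregation with a convolutional LSTM, followed by channel and spatial attention) can be illustrated with a rough NumPy sketch. This is not the authors' implementation, and several things here are assumptions made only for illustration: all function and class names are hypothetical, the convolutions use 1x1 kernels, feature maps are resized to a common grid with nearest-neighbour sampling, the weights are random and untrained, and the attention step is a parameter-free stand-in rather than the paper's attention module.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def resize_nearest(fmap, size):
    """Nearest-neighbour resize of a (C, H, W) map to (C, size, size)."""
    _, h, w = fmap.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return fmap[:, rows][:, :, cols]

def conv1x1(x, weight):
    """1x1 convolution: weight (C_out, C_in) applied at every pixel of x (C_in, H, W)."""
    return np.einsum("oc,chw->ohw", weight, x)

class ConvLSTMCell1x1:
    """ConvLSTM cell with 1x1 kernels (a simplification assumed for this sketch)."""

    def __init__(self, in_ch, hid_ch, rng):
        # One weight matrix produces all four gates from concat([x, h]).
        self.weight = rng.normal(0.0, 0.1, size=(4 * hid_ch, in_ch + hid_ch))
        self.bias = np.zeros((4 * hid_ch, 1, 1))

    def step(self, x, h, c):
        gates = conv1x1(np.concatenate([x, h], axis=0), self.weight) + self.bias
        i, f, o, g = np.split(gates, 4, axis=0)
        c_next = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h_next = sigmoid(o) * np.tanh(c_next)
        return h_next, c_next

def channel_spatial_attention(h):
    """Parameter-free stand-in: reweight channels, then spatial positions."""
    channel_w = sigmoid(h.mean(axis=(1, 2), keepdims=True))  # shape (C, 1, 1)
    h = h * channel_w
    spatial_w = sigmoid(h.mean(axis=0, keepdims=True))       # shape (1, H, W)
    return h * spatial_w

def cross_layer_aggregate(feature_maps, hid_ch=8, size=7, seed=0):
    """feature_maps: list ordered low-level -> high-level, each of shape (C_k, H_k, W_k)."""
    rng = np.random.default_rng(seed)
    cell = ConvLSTMCell1x1(hid_ch, hid_ch, rng)
    h = np.zeros((hid_ch, size, size))
    c = np.zeros_like(h)
    for fmap in reversed(feature_maps):  # high-level first, then lower levels
        # Random projection to a shared channel count (untrained, illustrative).
        proj = rng.normal(0.0, 0.1, size=(hid_ch, fmap.shape[0]))
        x = conv1x1(resize_nearest(fmap, size), proj)
        h, c = cell.step(x, h, c)
    return channel_spatial_attention(h)

# Three toy "backbone stages" with different channel counts and resolutions.
stages = [np.ones((4, 28, 28)), np.ones((6, 14, 14)), np.ones((10, 7, 7))]
fused = cross_layer_aggregate(stages)
```

Here `fused` has shape `(8, 7, 7)`: one hidden state that has seen every stage, starting from the most semantic one. In a trained model the projections, gates, and attention weights would be learned, and the fused map would feed a classifier head.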

updated: Mon Jun 21 2021 08:38:27 GMT+0000 (UTC)

published: Mon Jun 21 2021 08:38:27 GMT+0000 (UTC)

arXiv
