The Devil is in the Channels: Mutual-Channel Loss for Fine-Grained Image Classification

Dongliang Chang; Yifeng Ding; Jiyang Xie; Ayan Kumar Bhunia; Xiaoxu Li; Zhanyu Ma; Ming Wu; Jun Guo; Yi-Zhe Song

The key to solving fine-grained image categorization is finding discriminative, local regions that correspond to subtle visual traits. Great strides have been made with complex networks designed specifically to learn part-level discriminative feature representations. In this paper, we show that it is possible to cultivate subtle details without overly complicated network designs or training mechanisms: a single loss is all it takes. The main trick lies in how we delve into individual feature channels early on, as opposed to the convention of starting from a consolidated feature map. The proposed loss function, termed mutual-channel loss (MC-Loss), consists of two channel-specific components: a discriminality component and a diversity component. The discriminality component forces all feature channels belonging to the same class to be discriminative, through a novel channel-wise attention mechanism. The diversity component additionally constrains the channels so that they become mutually exclusive across the spatial dimension. The end result is a set of feature channels, each of which reflects a different locally discriminative region for a specific class. A network can be trained with MC-Loss end-to-end, without any bounding-box or part annotations, and yields highly discriminative regions during inference. Experimental results show that MC-Loss, when implemented on top of common base networks, achieves state-of-the-art performance on all four fine-grained categorization datasets (CUB-Birds, FGVC-Aircraft, Flowers-102, and Stanford-Cars). Ablation studies on two different base networks further demonstrate the superiority of MC-Loss over other recently proposed general-purpose losses for visual classification. Code is available at https://github.com/dongliangchang/Mutual-Channel-Loss
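To make the two components more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation (see the repository linked above for that). It assumes, for illustration only, that the feature channels are split into equal groups, one group per class, and that each channel is flattened into a list of spatial activations. The function names, the spatial softmax, the cross-channel max, and the average pooling are all assumptions of this sketch, and the channel-wise attention step mentioned in the abstract is omitted.

```python
import math


def spatial_softmax(channel):
    """Turn one flattened feature channel into a distribution over positions."""
    peak = max(channel)
    exps = [math.exp(v - peak) for v in channel]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


def diversity_score(features, num_classes, channels_per_class):
    """Illustrative diversity term (assumed form, not taken from the abstract).

    features: list of num_classes * channels_per_class flattened channels.
    The score is high when channels of the same class peak at different
    spatial positions and low when they all attend to the same position.
    """
    assert len(features) == num_classes * channels_per_class
    score = 0.0
    for c in range(num_classes):
        group = features[c * channels_per_class:(c + 1) * channels_per_class]
        normalized = [spatial_softmax(ch) for ch in group]
        # Max across the group's channels at each position, summed over positions.
        score += sum(max(values) for values in zip(*normalized))
    return score / num_classes


def class_logits(features, num_classes, channels_per_class):
    """Illustrative discriminality path: one logit per class group."""
    logits = []
    for c in range(num_classes):
        group = features[c * channels_per_class:(c + 1) * channels_per_class]
        pooled = [max(values) for values in zip(*group)]  # cross-channel max
        logits.append(sum(pooled) / len(pooled))          # spatial average
    return logits


def cross_entropy(logits, label):
    """Softmax cross-entropy for a single example."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum - logits[label]


# One class, two channels, four spatial positions.
overlapping = [[10.0, 0.0, 0.0, 0.0], [10.0, 0.0, 0.0, 0.0]]
disjoint = [[10.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 10.0]]
print(diversity_score(overlapping, 1, 2))
print(diversity_score(disjoint, 1, 2))
```

In this toy setup, the two channels that peak at different positions receive a higher diversity score than the two identical channels, which is the behaviour a training objective would reward in order to push channels of one class toward different regions. How the two terms are weighted against each other is left out here because the abstract does not specify it.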

updated: Tue Aug 10 2021 04:23:56 GMT+0000 (UTC)

published: Tue Feb 11 2020 09:12:45 GMT+0000 (UTC)

arXiv
