Comparison Knowledge Translation for Generalizable Image Classification

Zunlei Feng; Tian Qiu; Sai Wu; Xiaotuan Jin; Zengliang He; Mingli Song; Huiqiong Wang

一般化可能な画像分類のための比較知識翻訳

ディープラーニングは最近、大量の注釈に大きく依存する画像分類タスクで驚くべきパフォーマンスを達成しました。ただし、既存の深層学習モデルの分類メカニズムは、人間の認識メカニズムとは対照的であるように思われます。タイプが不明なオブジェクトの画像を一目見ただけで、人間は大量の画像から他の同じカテゴリのオブジェクトをすばやく正確に見つけることができます。これは、さまざまなオブジェクトを毎日認識することでメリットが得られます。この論文では、他のカテゴリの注釈のサポートにより、見えないカテゴリの分類パフォーマンスを向上させることを期待して、画像分類タスクで人間の認識メカニズムをエミュレートする一般化可能なフレームワークの構築を試みます。具体的には、Comparison Knowledge Translation（CKT）と呼ばれる新しいタスクを調査します。完全にラベル付けされたカテゴリのセットが与えられると、CKTは、ラベル付けされたカテゴリから学習した比較知識を新しいカテゴリのセットに変換することを目的としています。この目的のために、比較分類器とマッチング弁別器で構成される比較分類翻訳ネットワーク（CCT-Net）を提案します。比較分類器は、2つの画像が同じカテゴリに属するかどうかを分類するために考案され、一致する弁別器は、分類された結果が真実と一致するかどうかを確認するために敵対的な方法で連携します。徹底的な実験により、CCT-Netは、目に見えないカテゴリで驚くべき一般化能力を達成し、ターゲットカテゴリでSOTAパフォーマンスを達成することが示されています。

Deep learning has recently achieved remarkable performance in image classification tasks, which depends heavily on massive annotation. However, the classification mechanism of existing deep learning models seems to contrast to humans' recognition mechanism. With only a glance at an image of the object even unknown type, humans can quickly and precisely find other same category objects from massive images, which benefits from daily recognition of various objects. In this paper, we attempt to build a generalizable framework that emulates the humans' recognition mechanism in the image classification task, hoping to improve the classification performance on unseen categories with the support of annotations of other categories. Specifically, we investigate a new task termed Comparison Knowledge Translation (CKT). Given a set of fully labeled categories, CKT aims to translate the comparison knowledge learned from the labeled categories to a set of novel categories. To this end, we put forward a Comparison Classification Translation Network (CCT-Net), which comprises a comparison classifier and a matching discriminator. The comparison classifier is devised to classify whether two images belong to the same category or not, while the matching discriminator works together in an adversarial manner to ensure whether classified results match the truth. Exhaustive experiments show that CCT-Net achieves surprising generalization ability on unseen categories and SOTA performance on target categories.

updated: Sat May 07 2022 11:05:18 GMT+0000 (UTC)

published: Sat May 07 2022 11:05:18 GMT+0000 (UTC)

arXiv

参考文献 (このサイトで利用可能なもの) / References (only if available on this site)

被参照文献 (このサイトで利用可能なものを新しい順に) / Citations (only if available on this site, in order of most recent)

Amazon.co.jpアソシエイト