A Closer Look at Prototype Classifier for Few-shot Image Classification

Mingcheng Hou; Issei Sato

The prototypical network is a prototype classifier based on meta-learning. It is widely used for few-shot learning because it classifies unseen examples by constructing class-specific prototypes, without adjusting hyper-parameters during meta-testing. Recent work showing that training a new linear classifier, without any meta-learning algorithm, performs comparably with the prototypical network has attracted considerable attention. However, that approach requires retraining the classifier every time a new class appears. In this paper, we analyze how a prototype classifier can work equally well without training a new linear classifier or meta-learning. We experimentally find that constructing a prototype classifier in meta-testing directly from feature vectors extracted by standard pre-trained models does not perform as well as either the prototypical network or a new linear classifier trained on the same pre-trained feature vectors. Thus, we derive a novel generalization bound for the prototype classifier and show that transforming the feature vectors can improve its performance. We experimentally investigate several normalization methods for minimizing the derived bound and find that the same performance can be obtained by using L2 normalization and by minimizing the ratio of the within-class variance to the between-class variance, without training a new classifier or meta-learning.
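To make the setting concrete, here is a minimal sketch of a prototype classifier applied to pre-extracted feature vectors, with optional L2 normalization. It is an illustration under stated assumptions, not the authors' implementation: the function names (`l2_normalize`, `build_prototypes`, `predict`, `variance_ratio`) are hypothetical, squared Euclidean distance to the class mean is assumed as the decision rule, and `variance_ratio` is only one simple way to measure the within-class to between-class variance ratio mentioned in the abstract. The abstract does not state the derived bound, so it is not reproduced here.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each feature vector (row) to unit L2 norm."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def build_prototypes(support_feats, support_labels, normalize=True):
    """Return the class ids and one prototype (mean feature) per class."""
    if normalize:
        support_feats = l2_normalize(support_feats)
    classes = np.unique(support_labels)
    protos = np.stack(
        [support_feats[support_labels == c].mean(axis=0) for c in classes]
    )
    return classes, protos


def predict(query_feats, classes, protos, normalize=True):
    """Assign each query to the class of its nearest prototype.

    Assumption: squared Euclidean distance is the dissimilarity measure.
    """
    if normalize:
        query_feats = l2_normalize(query_feats)
    diffs = query_feats[:, None, :] - protos[None, :, :]
    dists = (diffs ** 2).sum(axis=-1)
    return classes[dists.argmin(axis=1)]


def variance_ratio(feats, labels):
    """Within-class variance divided by between-class variance.

    Assumption: variances are mean squared distances to the class mean
    (within) and of the class means to their overall mean (between).
    """
    classes = np.unique(labels)
    means = np.stack([feats[labels == c].mean(axis=0) for c in classes])
    within = np.mean(
        [
            ((feats[labels == c] - m) ** 2).sum(axis=1).mean()
            for c, m in zip(classes, means)
        ]
    )
    between = ((means - means.mean(axis=0)) ** 2).sum(axis=1).mean()
    return within / between
```

No parameters are fitted at meta-test time in this sketch: adding a new class only requires averaging its support features, which is the property that distinguishes the prototype classifier from retraining a linear classifier.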

updated: Thu Sep 15 2022 06:37:35 GMT+0000 (UTC)

published: Mon Oct 11 2021 08:28:43 GMT+0000 (UTC)

arXiv
