Few-Shot Lifelong Learning

Pratik Mazumder; Pravendra Singh; Piyush Rai

Many real-world classification problems include classes with very few labeled training samples. Moreover, not all classes may be available at the start of training; some may arrive incrementally. Deep learning models need to handle this two-fold problem in order to perform well in real-life situations. In this paper, we propose a novel Few-Shot Lifelong Learning (FSLL) method that enables deep learning models to perform lifelong/continual learning on few-shot data. Instead of training the full model, our method selects very few parameters from the model to train on each new set of classes, which helps prevent overfitting. We select these parameters so that only the currently unimportant ones are chosen. By keeping the important parameters in the model intact, our approach minimizes catastrophic forgetting. Furthermore, we minimize the cosine similarity between the new and the old class prototypes in order to maximize their separation, thereby improving classification performance. We also show that integrating our method with self-supervision improves model performance significantly. We experimentally show that our method significantly outperforms existing methods on the miniImageNet, CIFAR-100, and CUB-200 datasets. Specifically, we outperform the state-of-the-art method by an absolute margin of 19.27% on the CUB dataset.
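
The two ideas in the abstract, updating only a small set of unimportant parameters and pushing new class prototypes away from old ones, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say how importance is measured, so the sketch assumes that small absolute value stands in for "unimportant"; the function names, the `fraction` argument, and the plain-list representation of parameters and prototypes are all hypothetical choices made for illustration.

```python
import math


def select_trainable_mask(params, fraction):
    """Mark the `fraction` of parameters with the smallest magnitude as trainable.

    Assumption: small absolute value is used as a proxy for "unimportant".
    """
    k = int(len(params) * fraction)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(params)), key=lambda i: abs(params[i]))
    chosen = set(order[:k])
    return [i in chosen for i in range(len(params))]


def masked_update(params, grads, mask, lr):
    """Apply a gradient step only where the mask is True; other parameters stay intact."""
    return [p - lr * g if m else p for p, g, m in zip(params, grads, mask)]


def cosine_similarity(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def prototype_separation_loss(new_protos, old_protos):
    """Mean cosine similarity over all (new, old) prototype pairs.

    Minimizing this value pushes new class prototypes away from old ones.
    """
    sims = [cosine_similarity(n, o) for n in new_protos for o in old_protos]
    return sum(sims) / len(sims)


if __name__ == "__main__":
    params = [0.9, -0.01, 0.5, 0.02, -0.7]
    mask = select_trainable_mask(params, fraction=0.4)
    print(mask)  # only the two smallest-magnitude entries are trainable
    print(masked_update(params, [1.0] * 5, mask, lr=0.1))
    print(prototype_separation_loss([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]]))
```

In a real model the mask would be computed per layer over tensors, and the separation term would be added to the classification loss with a weighting factor; both details are left out here because the abstract does not specify them.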

updated: Mon Mar 01 2021 13:26:57 GMT+0000 (UTC)

published: Mon Mar 01 2021 13:26:57 GMT+0000 (UTC)

arXiv
