Multi-Teacher Knowledge Distillation for Incremental Implicitly-Refined Classification

Longhui Yu; Zhenyu Weng; Yuqing Wang; Yuesheng Zhu

Incremental learning methods learn new classes continually by distilling knowledge from the last model (the teacher) into the current model (the student) during sequential learning. However, these methods fail on Incremental Implicitly-Refined Classification (IIRC), an extension of incremental learning in which incoming classes can carry labels at two granularity levels: a superclass label and a subclass label. The reason is that previously learned superclass knowledge may be overwritten by the subclass knowledge learned later in the sequence. To solve this problem, we propose a novel Multi-Teacher Knowledge Distillation (MTKD) strategy. To preserve subclass knowledge, we use the last model as a general teacher and distill its previous knowledge into the student model. To preserve superclass knowledge, we use the initial model as a superclass teacher, since the initial model contains abundant superclass knowledge. However, distilling knowledge from two teacher models can cause the student model to make redundant predictions. We therefore propose a post-processing mechanism, called Top-k prediction restriction, to reduce these redundant predictions. Our experimental results on IIRC-ImageNet120 and IIRC-CIFAR100 show that the proposed method achieves better classification accuracy than existing state-of-the-art methods.
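The abstract does not give the loss or the post-processing rule in detail, so the following is only a rough sketch of how the two ideas might fit together, not the authors' implementation. It assumes a multi-label setting with one sigmoid output per class, a binary cross-entropy form for both distillation terms, and hypothetical names and parameters (`mtkd_loss`, `top_k_restrict`, `lam_general`, `lam_super`, `threshold`, `k`) that do not come from the paper.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bce_with_logits(targets, logits):
    """Mean binary cross-entropy between targets in [0, 1] and raw logits."""
    total = 0.0
    for t, z in zip(targets, logits):
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(targets)


def distill_term(student_logits, teacher_logits):
    """Match the student to one teacher on the classes that teacher knows.

    teacher_logits maps class index -> teacher logit (an assumed layout).
    """
    idx = sorted(teacher_logits)
    soft_targets = [sigmoid(teacher_logits[i]) for i in idx]
    return bce_with_logits(soft_targets, [student_logits[i] for i in idx])


def mtkd_loss(student_logits, new_labels, general_teacher, superclass_teacher,
              lam_general=1.0, lam_super=1.0):
    """Classification loss on new classes plus two distillation terms.

    general_teacher: logits of the last model on previously seen classes.
    superclass_teacher: logits of the initial model on its superclasses.
    The weights lam_general and lam_super are assumptions for illustration.
    """
    idx = sorted(new_labels)
    cls = bce_with_logits([new_labels[i] for i in idx],
                          [student_logits[i] for i in idx])
    general = distill_term(student_logits, general_teacher)
    superclass = distill_term(student_logits, superclass_teacher)
    return cls + lam_general * general + lam_super * superclass


def top_k_restrict(probs, k, threshold=0.5):
    """Keep at most k of the classes whose probability passes the threshold."""
    above = [i for i, p in enumerate(probs) if p >= threshold]
    above.sort(key=lambda i: probs[i], reverse=True)
    return sorted(above[:k])


# Three classes pass the threshold, but only the two strongest are kept.
print(top_k_restrict([0.9, 0.6, 0.8, 0.2], k=2))  # prints [0, 2]
```

In this sketch the superclass teacher is a frozen copy of the first-task model and the general teacher is a frozen copy of the model from the previous task. Setting `k` to the number of granularity levels (two in IIRC) is one plausible choice, but that too is an assumption here.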

updated: Wed Feb 23 2022 09:51:40 GMT+0000 (UTC)

published: Wed Feb 23 2022 09:51:40 GMT+0000 (UTC)

arXiv
