M2KD: Multi-model and Multi-level Knowledge Distillation for Incremental Learning

Peng Zhou; Long Mai; Jianming Zhang; Ning Xu; Zuxuan Wu; Larry S. Davis

Incremental learning aims to achieve good performance on new categories without forgetting old ones. Knowledge distillation has been shown to be critical in preserving performance on old classes. Conventional methods, however, sequentially distill knowledge only from the last model, which leads to performance degradation on the old classes in later incremental learning steps. In this paper, we propose a multi-model and multi-level knowledge distillation strategy. Instead of sequentially distilling knowledge only from the last model, we directly leverage all previous model snapshots. In addition, we incorporate an auxiliary distillation to further preserve knowledge encoded at the intermediate feature levels. To make the model more memory-efficient, we adapt mask-based pruning to reconstruct all previous models with a small memory footprint. Experiments on standard incremental learning benchmarks show that our method preserves the knowledge of old classes better and improves overall performance over standard distillation techniques.
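The idea of distilling from every previous snapshot, rather than only the last one, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of a temperature-softened KL divergence, the averaging over snapshots, and the assumption that each snapshot supervises only the classes it was trained on are all illustrative choices made here. The auxiliary feature-level distillation and the mask-based pruning are not covered.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) between softened distributions (assumed form)."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q))


def multi_model_distillation_loss(student_logits, snapshots, temperature=2.0):
    """Average the distillation loss over all previous snapshots.

    `snapshots` is a hypothetical list of (class_indices, teacher_logits)
    pairs: the classes a snapshot is responsible for, and that snapshot's
    logits for those classes on the current input.
    """
    if not snapshots:
        return 0.0
    total = 0.0
    for class_indices, teacher_logits in snapshots:
        # Compare only the student outputs that correspond to this snapshot.
        student_slice = [student_logits[i] for i in class_indices]
        total += distillation_loss(student_slice, teacher_logits, temperature)
    return total / len(snapshots)


# Usage: two earlier snapshots, each covering two of the four old classes.
student = [2.0, 1.0, 0.5, -1.0]
snapshots = [([0, 1], [1.5, 1.2]), ([2, 3], [0.0, -0.5])]
loss = multi_model_distillation_loss(student, snapshots)
```

In a sequential scheme, only the most recent snapshot would appear in `snapshots`; passing all of them is what lets each old class group be supervised by the model that originally learned it.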

updated: Sat Sep 05 2020 04:41:31 GMT+0000 (UTC)

published: Wed Apr 03 2019 04:54:01 GMT+0000 (UTC)

arXiv
