Knowledge Distillation via Instance-level Sequence Learning

Haoran Zhao; Xin Sun; Junyu Dong; Zihe Dong; Qiong Li

Distillation approaches have recently been proposed to extract general knowledge from a teacher network and use it to guide a student network. Most existing methods transfer knowledge from the teacher to the student by feeding a sequence of random mini-batches sampled uniformly from the data. Instead, we argue that the compact student network should be guided gradually, using samples ordered in a meaningful sequence, so that the gap in feature representation between the teacher and student networks is bridged step by step. In this work, we provide a curriculum-learning knowledge distillation framework based on instance-level sequence learning. It uses the student network from an early epoch as a snapshot to create a curriculum for the student network's next training phase. We carry out extensive experiments on the CIFAR-10, CIFAR-100, SVHN and CINIC-10 datasets. Compared with several state-of-the-art methods, our framework achieves the best performance with fewer iterations.
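The abstract does not say how samples are ranked or which distillation loss is used, so the following is only a rough sketch of the general idea, not the authors' implementation. It assumes that a sample's difficulty can be approximated by the snapshot student's cross-entropy on it, and it uses the standard temperature-scaled soft-target loss as a stand-in for the distillation objective. All function names (`per_sample_difficulty`, `curriculum_order`, `curriculum_batches`, `distillation_loss`) are hypothetical.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Numerically stable row-wise softmax with an optional temperature."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def per_sample_difficulty(snapshot_logits, labels):
    """Cross-entropy of the snapshot student on each sample.

    Assumption: higher loss under the early-epoch snapshot means a harder
    sample. The abstract does not specify the actual difficulty measure.
    """
    probs = softmax(snapshot_logits)
    labels = np.asarray(labels)
    return -np.log(probs[np.arange(len(labels)), labels] + 1e-12)


def curriculum_order(snapshot_logits, labels):
    """Sample indices sorted from easiest to hardest."""
    difficulty = per_sample_difficulty(snapshot_logits, labels)
    return np.argsort(difficulty, kind="stable")


def curriculum_batches(order, batch_size):
    """Yield mini-batches of indices that follow the curriculum order
    instead of a uniform random shuffle."""
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Temperature-scaled KL divergence between teacher and student
    soft targets (a common distillation loss, used here as a stand-in)."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=1)
    return float(temperature ** 2 * kl.mean())
```

In a training loop under these assumptions, one would record the student's logits at an early epoch, call `curriculum_order` once on them, and then draw the next phase's mini-batches from `curriculum_batches` while minimizing `distillation_loss` against the teacher's logits.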

updated: Mon Jun 21 2021 06:58:26 GMT+0000 (UTC)

published: Mon Jun 21 2021 06:58:26 GMT+0000 (UTC)

arXiv
