Data-Free Learning of Student Networks

Hanting Chen; Yunhe Wang; Chang Xu; Zhaohui Yang; Chuanjian Liu; Boxin Shi; Chunjing Xu; Chao Xu; Qi Tian

Learning portable neural networks is essential in computer vision so that heavy pre-trained deep models can be deployed on edge devices such as mobile phones and micro sensors. Most existing deep neural network compression and speed-up methods are very effective for training compact deep models when the training dataset can be accessed directly. However, the training data for a given deep network are often unavailable due to practical problems (e.g., privacy, legal issues, and transmission), and the architecture of the given network is also unknown except for some interfaces. To this end, we propose a novel framework for training efficient deep neural networks by exploiting generative adversarial networks (GANs). Specifically, the pre-trained teacher network is regarded as a fixed discriminator, and the generator is used to derive training samples that obtain the maximum response from the discriminator. Then, an efficient network with smaller model size and lower computational complexity is trained using the generated data and the teacher network simultaneously. Efficient student networks learned using the proposed Data-Free Learning (DAFL) method achieve 92.22% and 74.47% accuracy with ResNet-18 on the CIFAR-10 and CIFAR-100 datasets, respectively, without any training data. Meanwhile, our student network obtains 80.56% accuracy on the CelebA benchmark.
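
The two training signals described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `max_response_loss` and `distillation_loss` are hypothetical, and the specific loss forms are assumptions chosen for illustration. "Maximum response" is read here as cross-entropy against the teacher's own argmax label, and training the student "using the generated data and the teacher network" is read as cross-entropy against the teacher's soft outputs. Logits are plain Python lists so the example runs with the standard library only.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_total = math.log(sum(math.exp(z - m) for z in logits))
    return [z - m - log_total for z in logits]


def max_response_loss(teacher_logits):
    """Assumed generator objective: cross-entropy between the teacher's
    prediction and its own argmax pseudo-label. The value is low when the
    fixed teacher responds confidently to a generated sample."""
    return -max(log_softmax(teacher_logits))


def distillation_loss(student_logits, teacher_logits):
    """Assumed student objective: cross-entropy of the student's prediction
    against the teacher's soft output on the same generated sample."""
    teacher_probs = [math.exp(v) for v in log_softmax(teacher_logits)]
    student_log_probs = log_softmax(student_logits)
    return -sum(p * lp for p, lp in zip(teacher_probs, student_log_probs))


if __name__ == "__main__":
    confident = [10.0, 0.0, 0.0]   # teacher strongly prefers class 0
    uncertain = [1.0, 1.0, 1.0]    # teacher has no preference
    print(max_response_loss(confident), max_response_loss(uncertain))
    print(distillation_loss([2.0, 0.0, -1.0], [2.0, 0.0, -1.0]))
```

Under these assumptions, a generator would be updated to lower `max_response_loss` on the samples it produces, while the student would be updated to lower `distillation_loss` on the same samples; the teacher's weights stay fixed throughout.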

updated: Tue Dec 31 2019 06:58:35 GMT+0000 (UTC)

published: Tue Apr 02 2019 03:00:06 GMT+0000 (UTC)

arXiv
