CDFKD-MFS: Collaborative Data-free Knowledge Distillation via Multi-level Feature Sharing

Zhiwei Hao; Yong Luo; Zhi Wang; Han Hu; Jianping An

Recently, the compression and deployment of powerful deep neural networks (DNNs) on resource-limited edge devices to provide intelligent services have become attractive tasks. Although knowledge distillation (KD) is a feasible solution for compression, its reliance on the original dataset raises privacy concerns. In addition, it is common to integrate multiple pretrained models to achieve satisfactory performance. Compressing multiple models into a single tiny model is challenging, especially when the original data are unavailable. To tackle this challenge, we propose a framework termed collaborative data-free knowledge distillation via multi-level feature sharing (CDFKD-MFS), which consists of a multi-header student module, an asymmetric adversarial data-free KD module, and an attention-based aggregation module. In this framework, the student model, equipped with a multi-level feature-sharing structure, learns from multiple teacher models and is trained together with a generator in an asymmetric adversarial manner. When some real samples are available, the attention module adaptively aggregates the predictions of the student headers, which can further improve performance. We conduct extensive experiments on three popular computer vision datasets. In particular, compared with the most competitive alternative, the accuracy of the proposed framework is 1.18% higher on the CIFAR-100 dataset, 1.67% higher on the Caltech-101 dataset, and 2.99% higher on the mini-ImageNet dataset.
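
The aggregation step described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each student header emits class logits and that the attention module yields one scalar score per header, which is turned into weights by a softmax. The abstract does not specify the actual attention design, so the function names `softmax` and `aggregate_headers` and all numeric values below are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_headers(header_logits, attention_scores):
    """Combine per-header class logits into one probability vector.

    Assumption (not taken from the paper): the combined prediction is a
    softmax-weighted average of the per-header class probabilities.
    """
    weights = softmax(attention_scores)
    header_probs = [softmax(logits) for logits in header_logits]
    num_classes = len(header_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, header_probs))
        for c in range(num_classes)
    ]


# Hypothetical example: three student headers, three classes.
header_logits = [[2.0, 0.5, 0.1], [1.5, 1.4, 0.2], [0.3, 2.2, 0.4]]
attention_scores = [1.0, 0.2, -0.5]  # illustrative values only
combined = aggregate_headers(header_logits, attention_scores)
predicted_class = max(range(len(combined)), key=combined.__getitem__)
```

With equal attention scores this reduces to a plain average of the header predictions, so unequal scores are what let the combined output favor some headers over others.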

updated: Tue May 24 2022 07:11:03 GMT+0000 (UTC)

published: Tue May 24 2022 07:11:03 GMT+0000 (UTC)

arXiv
