Knowledge Distillation Using Hierarchical Self-Supervision Augmented Distribution

Chuanguang Yang; Zhulin An; Linhang Cai; Yongjun Xu

Knowledge distillation (KD) is an effective framework for transferring meaningful information from a large teacher network to a smaller student network. KD generally involves two questions: how to define knowledge and how to transfer it. Previous KD methods often focus on mining various forms of knowledge, such as feature maps and refined information. However, this knowledge is derived from the primary supervised task and is therefore highly task-specific. Motivated by the recent success of self-supervised representation learning, we propose an auxiliary self-supervision augmented task to guide networks to learn more meaningful features. From this task, we derive soft self-supervision augmented distributions as richer dark knowledge for KD. Unlike previous forms of knowledge, this distribution encodes joint knowledge from supervised and self-supervised feature learning. Beyond knowledge exploration, we propose to append several auxiliary branches at various hidden layers to take full advantage of hierarchical feature maps. Each auxiliary branch is guided to learn the self-supervision augmented task and to distill this distribution from teacher to student. Overall, we call our KD method Hierarchical Self-Supervision Augmented Knowledge Distillation (HSSAKD). Experiments on standard image classification show that both offline and online HSSAKD achieve state-of-the-art performance in the field of KD. Transfer experiments on object detection further verify that HSSAKD can guide the network to learn better features. The code is available at https://github.com/winycg/HSAKD.
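
The idea of a self-supervision augmented distribution distilled at several auxiliary branches can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository above for that). It assumes that the auxiliary task predicts a joint label over (class, transformation) pairs, that each branch outputs logits over this joint label space, and that distillation uses a temperature-scaled KL divergence summed over branches. The function names, the choice of four transformations, and the temperature value are all hypothetical.

```python
import math

def joint_label(class_idx, transform_idx, num_transforms):
    """Map a (class, transformation) pair to one index in the joint label space.

    Assumption: the joint space has num_classes * num_transforms entries.
    """
    return class_idx * num_transforms + transform_idx

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def hierarchical_distill_loss(teacher_branch_logits, student_branch_logits,
                              temperature=3.0):
    """Sum the softened KL divergence over all paired auxiliary branches.

    Each argument is a list with one logit vector per branch; the temperature
    value is an illustrative assumption, not a setting taken from the paper.
    """
    loss = 0.0
    for t_logits, s_logits in zip(teacher_branch_logits, student_branch_logits):
        p = softmax(t_logits, temperature)   # teacher's soft distribution
        q = softmax(s_logits, temperature)   # student's soft distribution
        loss += (temperature ** 2) * kl_divergence(p, q)
    return loss

# Toy example: 2 classes x 4 transformations = 8 joint labels, 2 branches.
teacher = [[2.0, 0.1, 0.1, 0.1, 0.5, 0.0, 0.0, 0.0],
           [1.5, 0.2, 0.0, 0.1, 0.3, 0.1, 0.0, 0.0]]
student = [[1.0, 0.3, 0.2, 0.1, 0.6, 0.1, 0.0, 0.1],
           [0.8, 0.4, 0.1, 0.2, 0.5, 0.2, 0.1, 0.0]]
loss = hierarchical_distill_loss(teacher, student)
```

In this sketch the loss is zero when every student branch matches its teacher branch and grows as the softened distributions diverge, which is the sense in which the distribution is transferred from teacher to student.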

updated: Sat Jul 23 2022 09:58:08 GMT+0000 (UTC)

published: Tue Sep 07 2021 13:29:32 GMT+0000 (UTC)

arXiv
