Adaptive Distillation: Aggregating Knowledge from Multiple Paths for Efficient Distillation

Sumanth Chennupati; Mohammad Mahdi Kamani; Zhongwei Cheng; Lin Chen

Knowledge distillation is becoming one of the primary trends among neural network compression algorithms: it improves the generalization performance of a smaller student model with guidance from a larger teacher model. This rapid growth in applications has been accompanied by numerous algorithms for distilling knowledge, such as soft targets and hint layers. Despite these advances, the aggregation of different distillation paths has not been studied comprehensively. This matters not only because different paths differ in importance, but also because some paths may harm the generalization performance of the student model. Hence, we need to adaptively adjust the importance of each path to maximize the impact of distillation on the student model. In this paper, we explore different approaches for aggregating these paths and propose an adaptive approach based on multitask learning methods. We empirically demonstrate the effectiveness of the proposed approach over other baselines when applying knowledge distillation to classification, semantic segmentation, and object detection tasks.
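To make the idea of weighting several distillation paths concrete, here is a minimal sketch in plain Python. It is not the method proposed in the paper, whose details the abstract does not give. It assumes two paths (a soft-target loss and a hint-layer loss) and borrows a simplified uncertainty-style weighting from the multitask learning literature, in which each path i has a learnable scalar s_i and contributes exp(-s_i) * L_i + s_i to the total. All function names, the temperature value, and the learning rate are hypothetical choices made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_target_loss(student_logits, teacher_logits, temperature=4.0):
    """Soft-target path: KL(teacher || student) on softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def hint_loss(student_features, teacher_features):
    """Hint path: mean squared error between intermediate features."""
    n = len(student_features)
    return sum((s - t) ** 2 for s, t in zip(student_features, teacher_features)) / n


def aggregate(path_losses, log_vars):
    """Assumed weighting: sum_i exp(-s_i) * L_i + s_i (not the paper's formula)."""
    return sum(math.exp(-s) * loss + s for loss, s in zip(path_losses, log_vars))


def update_log_vars(path_losses, log_vars, lr=0.1):
    """One gradient-descent step on each s_i; d/ds_i = 1 - exp(-s_i) * L_i."""
    return [s - lr * (1.0 - math.exp(-s) * loss)
            for loss, s in zip(path_losses, log_vars)]
```

In a training loop one would compute the per-path losses for a batch, call `aggregate` to obtain the student's objective, and call `update_log_vars` alongside the student's own parameter update. Under this assumed scheme, a path whose loss stays large is pushed toward a larger s_i and therefore a smaller effective weight exp(-s_i), which is one simple way to reduce the influence of an unhelpful path.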

updated: Tue Oct 19 2021 00:57:40 GMT+0000 (UTC)

published: Tue Oct 19 2021 00:57:40 GMT+0000 (UTC)

arXiv
