Leveraging Different Learning Styles for Improved Knowledge Distillation

Usma Niyaz; Deepti R. Bathula

Learning style refers to the training mechanism an individual adopts to gain new knowledge. As the VARK model suggests, humans have different learning preferences, such as visual and auditory, for acquiring and processing information effectively. Inspired by this concept, our work explores mixed information sharing with model compression in the context of Knowledge Distillation (KD) and Mutual Learning (ML). Unlike conventional techniques that share the same type of knowledge with all networks, we propose training individual networks with different forms of information to enhance the learning process. We formulate a combined KD and ML framework with one teacher and two student networks that share or exchange information in the form of predictions and feature maps. Our comprehensive experiments on benchmark classification and segmentation datasets demonstrate that, with 15% compression, the ensemble of networks trained with diverse forms of knowledge outperforms conventional techniques both quantitatively and qualitatively.
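To make the idea of mixed information sharing concrete, the following is a minimal sketch of how per-student losses might combine predictions and feature maps from different sources. It is not the authors' implementation. The assignment of knowledge types (student 1 receives the teacher's predictions and its peer's feature maps, student 2 receives the teacher's feature maps and its peer's predictions), the weights `alpha` and `beta`, the temperature, and all function names are assumptions made for illustration, and feature maps are flattened to plain lists.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax, shifted by the max for numerical stability.
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def prediction_loss(source_logits, learner_logits, temperature=2.0, eps=1e-12):
    # KL divergence between softened predictions (a common choice in KD and ML).
    p = softmax(source_logits, temperature)
    q = softmax(learner_logits, temperature)
    kl = sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))
    return temperature ** 2 * kl

def feature_loss(source_features, learner_features):
    # Mean squared error between (flattened) feature maps.
    n = len(source_features)
    return sum((s - l) ** 2 for s, l in zip(source_features, learner_features)) / n

def cross_entropy(logits, label, eps=1e-12):
    # Standard supervised loss against the ground-truth class index.
    return -math.log(softmax(logits)[label] + eps)

def student_losses(teacher, s1, s2, label, alpha=0.5, beta=0.5):
    # Each network is a dict with "logits" and "features" (hypothetical layout).
    # ASSUMPTION: student 1 learns from teacher predictions + peer feature maps,
    # student 2 learns from teacher feature maps + peer predictions.
    loss1 = (cross_entropy(s1["logits"], label)
             + alpha * prediction_loss(teacher["logits"], s1["logits"])
             + beta * feature_loss(s2["features"], s1["features"]))
    loss2 = (cross_entropy(s2["logits"], label)
             + alpha * feature_loss(teacher["features"], s2["features"])
             + beta * prediction_loss(s1["logits"], s2["logits"]))
    return loss1, loss2

teacher = {"logits": [2.0, 0.5, -1.0], "features": [0.2, 0.8, 0.1, 0.4]}
student1 = {"logits": [1.5, 0.7, -0.5], "features": [0.1, 0.6, 0.2, 0.5]}
student2 = {"logits": [1.0, 1.0, -0.8], "features": [0.3, 0.7, 0.0, 0.3]}
loss1, loss2 = student_losses(teacher, student1, student2, label=0)
```

In a real training loop each loss would be backpropagated through its own student only, with the teacher frozen. The point of the sketch is that the two students receive different kinds of signal rather than the same one.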

updated: Mon Mar 06 2023 11:48:02 GMT+0000 (UTC)

published: Tue Dec 06 2022 12:40:45 GMT+0000 (UTC)

arXiv
