Online Knowledge Distillation via Mutual Contrastive Learning for Visual Recognition

Chuanguang Yang; Zhulin An; Helong Zhou; Yongjun Xu; Qian Zhang

Teacher-free online Knowledge Distillation (KD) trains an ensemble of student models collaboratively so that they distill knowledge from one another. Although existing online KD methods achieve desirable performance, they often treat class probabilities as the core knowledge type and ignore valuable feature-representation information. We present a Mutual Contrastive Learning (MCL) framework for online KD. The core idea of MCL is to perform mutual interaction and transfer of contrastive distributions among a cohort of networks in an online manner. MCL aggregates cross-network embedding information and maximizes a lower bound on the mutual information between two networks. This enables each network to learn additional contrastive knowledge from the others, leading to better feature representations and thus improved performance on visual recognition tasks. Beyond the final layer, we extend MCL to several intermediate layers, assisted by auxiliary feature refinement modules, which further strengthens representation learning for online KD. Experiments on image classification and on transfer learning to visual recognition tasks show that MCL leads to consistent performance gains over state-of-the-art online KD approaches. These gains indicate that MCL guides the networks to generate better feature representations. Our code is publicly available at https://github.com/winycg/MCL.
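
To make the cross-network contrastive idea concrete, the following is a minimal NumPy sketch, not the authors' implementation (see the repository above for that). It assumes two networks that embed the same batch of images, uses in-batch negatives and an InfoNCE-style loss, and reports the standard InfoNCE bound log(N) - loss as a lower-bound estimate of the mutual information. The function names, the temperature `tau`, and the synthetic embeddings are all illustrative assumptions.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def cross_network_info_nce(anchor: np.ndarray, other: np.ndarray, tau: float = 0.1) -> float:
    """InfoNCE loss with anchors from one network and candidates from another.

    Row i of `anchor` and row i of `other` embed the same image (the positive
    pair); every other row of `other` acts as a negative (an assumption of
    this sketch: negatives are drawn from the same batch).
    """
    a = l2_normalize(anchor)
    b = l2_normalize(other)
    logits = a @ b.T / tau                        # (N, N) cross-network similarities
    logits -= logits.max(axis=1, keepdims=True)   # subtract row max for stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))     # positives sit on the diagonal


def mutual_contrastive_loss(emb_a: np.ndarray, emb_b: np.ndarray, tau: float = 0.1) -> float:
    """Symmetric loss: each network is anchored against the other one."""
    return (cross_network_info_nce(emb_a, emb_b, tau)
            + cross_network_info_nce(emb_b, emb_a, tau))


# Synthetic embeddings standing in for the outputs of two student networks.
rng = np.random.default_rng(0)
emb_a = rng.normal(size=(8, 16))
emb_b_aligned = emb_a + 0.05 * rng.normal(size=(8, 16))  # networks that agree
emb_b_random = rng.normal(size=(8, 16))                  # unrelated networks

loss_aligned = mutual_contrastive_loss(emb_a, emb_b_aligned)
loss_random = mutual_contrastive_loss(emb_a, emb_b_random)

# InfoNCE bound: I(A; B) >= log(N) - loss, where N is the number of candidates.
mi_lower_bound = np.log(len(emb_a)) - cross_network_info_nce(emb_a, emb_b_aligned)
```

In this sketch, minimizing the loss pulls the two networks' embeddings of the same image together and pushes embeddings of different images apart, which tightens the log(N) - loss bound. The soft contrastive-distribution transfer and the intermediate-layer extension described in the abstract are not covered here.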

updated: Sat Jul 23 2022 13:39:01 GMT+0000 (UTC)

published: Sat Jul 23 2022 13:39:01 GMT+0000 (UTC)

arXiv
