LGD: Label-guided Self-distillation for Object Detection

Peizhen Zhang; Zijian Kang; Tong Yang; Xiangyu Zhang; Nanning Zheng; Jian Sun

In this paper, we propose the first self-distillation framework for general object detection, termed LGD (Label-Guided self-Distillation). Previous studies rely on a strong pretrained teacher to provide instructive knowledge, and such a teacher may be unavailable in real-world scenarios. Instead, we generate instructive knowledge by modeling inter-object and intra-object relations, which requires only student representations and regular labels. Concretely, our framework involves sparse label-appearance encoding, inter-object relation adaptation, and intra-object knowledge mapping to obtain the instructive knowledge. Together, these modules form an implicit teacher during training that depends dynamically on the labels and on the evolving student representations. The LGD modules are trained end-to-end with the student detector and are discarded at inference. Experimentally, LGD obtains decent results across various detectors and datasets, and on additional tasks such as instance segmentation. For example, on the MS-COCO dataset, LGD improves RetinaNet with ResNet-50 under 2x single-scale training from 36.2% to 39.0% mAP (+2.8%). It also boosts much stronger detectors, such as FCOS with ResNeXt-101 DCN v2 under 2x multi-scale training, from 46.1% to 47.9% mAP (+1.8%). Compared with FGFI, a classical teacher-based method, LGD not only performs better without requiring a pretrained teacher but also cuts the training cost incurred on top of the student's own learning by 51%.
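
The training-only "implicit teacher" idea can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`attend`, `training_loss`), the use of plain scaled dot-product attention, the choice of student embeddings as queries and label embeddings as keys and values, and the mean-squared-error imitation term are all assumptions made for illustration. The abstract does not specify any of these details.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    dim = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def mse(a, b):
    """Mean squared error between two equally shaped lists of vectors."""
    sq = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(sq) / len(sq)


def training_loss(detection_loss, student_obj_feats, label_embeds, weight=1.0):
    """Detection loss plus a hypothetical distillation term.

    ASSUMPTION: the instructive features are produced by letting each
    student object embedding attend over label embeddings. This stands in
    for the relation modeling named in the abstract and is illustrative only.
    """
    instructive = attend(student_obj_feats, label_embeds, label_embeds)
    return detection_loss + weight * mse(student_obj_feats, instructive)
```

In this sketch, `training_loss` is called only during training. At inference the detector runs alone, so the attention step adds no cost to the deployed model, which mirrors the statement that the LGD modules are discarded at inference.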

updated: Fri Dec 24 2021 07:12:44 GMT+0000 (UTC)

published: Thu Sep 23 2021 16:55:01 GMT+0000 (UTC)

arXiv
