arXiv reaDer
Relative Difficulty Distillation for Semantic Segmentation
Current knowledge distillation (KD) methods primarily focus on transferring various structured knowledge and designing corresponding optimization goals to encourage the student network to imitate the output of the teacher network. However, introducing too many additional optimization objectives may lead to unstable training, such as gradient conflicts. Moreover, these methods ignored the guidelines of relative learning difficulty between the teacher and student networks. Inspired by human cognitive science, in this paper, we redefine knowledge from a new perspective -- the student and teacher networks' relative difficulty of samples, and propose a pixel-level KD paradigm for semantic segmentation named Relative Difficulty Distillation (RDD). We propose a two-stage RDD framework: Teacher-Full Evaluated RDD (TFE-RDD) and Teacher-Student Evaluated RDD (TSE-RDD). RDD allows the teacher network to provide effective guidance on learning focus without additional optimization goals, thus avoiding adjusting learning weights for multiple losses. Extensive experimental evaluations using a general distillation loss function on popular datasets such as Cityscapes, CamVid, Pascal VOC, and ADE20k demonstrate the effectiveness of RDD against state-of-the-art KD methods. Additionally, our research showcases that RDD can integrate with existing KD methods to improve their upper performance bound.
updated: Thu Jul 04 2024 08:08:25 GMT+0000 (UTC)
published: Thu Jul 04 2024 08:08:25 GMT+0000 (UTC)
参考文献 (このサイトで利用可能なもの) / References (only if available on this site)
被参照文献 (このサイトで利用可能なものを新しい順に) / Citations (only if available on this site, in order of most recent)アソシエイト