Mixed-supervised segmentation: Confidence maximization helps knowledge distillation

Bingyuan Liu; Christian Desrosiers; Ismail Ben Ayed; Jose Dolz

Despite achieving promising results in a wide range of medical image segmentation tasks, deep neural networks require large training datasets with pixel-wise annotations. Obtaining these curated datasets is a cumbersome process, which limits the applicability of these networks in practice. Mixed supervision is an appealing alternative for mitigating this obstacle. In this work, we propose a dual-branch architecture, where the upper branch (teacher) receives strong annotations, while the bottom branch (student) is driven by limited supervision and guided by the upper branch. Combined with a standard cross-entropy loss over the labeled pixels, our novel formulation integrates two important terms: (i) a Shannon entropy loss defined over the less-supervised images, which encourages confident student predictions in the bottom branch; and (ii) a KL divergence term, which transfers the knowledge (i.e., predictions) of the strongly supervised branch to the less-supervised branch and guides the entropy (student-confidence) term to avoid trivial solutions. We show that the synergy between the entropy and KL divergence terms yields substantial improvements in performance. We also discuss an interesting link between Shannon-entropy minimization and standard pseudo-mask generation, and argue that the former should be preferred over the latter for leveraging information from unlabeled pixels. We evaluate the effectiveness of the proposed formulation through a series of quantitative and qualitative experiments on two publicly available datasets. Results demonstrate that our method significantly outperforms other strategies for semantic segmentation within a mixed-supervision framework, as well as recent semi-supervised approaches. Our code is publicly available at https://github.com/by-liu/ConfKD.
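As an illustration of how the three terms named in the abstract could fit together for the student branch, the following is a minimal, framework-free sketch and not the authors' implementation (see the repository above for that). Several details are assumptions that the abstract does not specify: the direction of the divergence (taken here as KL(teacher || student), with the teacher treated as a fixed target), the hypothetical weights `lambda_ent` and `lambda_kl`, the use of `None` to mark unlabeled pixels, and all function names.

```python
import math

EPS = 1e-12  # guards against log(0)


def softmax(logits):
    """Turn one pixel's class logits into a probability vector."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Cross-entropy of one labeled pixel."""
    return -math.log(probs[label] + EPS)


def shannon_entropy(probs):
    """Shannon entropy of one pixel's prediction; low means confident."""
    return -sum(p * math.log(p + EPS) for p in probs)


def kl_divergence(teacher_probs, student_probs):
    """KL(teacher || student) for one pixel (the direction is an assumption)."""
    return sum(
        t * (math.log(t + EPS) - math.log(s + EPS))
        for t, s in zip(teacher_probs, student_probs)
    )


def mean(values):
    """Average of an iterable, defined as 0.0 when it is empty."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def student_loss(student_logits, teacher_logits, labels,
                 lambda_ent=1.0, lambda_kl=1.0):
    """Combine the three terms for the student branch on one image.

    student_logits, teacher_logits: per-pixel lists of class logits.
    labels: per-pixel class index, or None where the pixel is unlabeled.
    lambda_ent, lambda_kl: hypothetical weights, not taken from the paper.
    """
    student = [softmax(z) for z in student_logits]
    teacher = [softmax(z) for z in teacher_logits]

    # Cross-entropy only over pixels that carry a label.
    ce = mean(cross_entropy(p, y)
              for p, y in zip(student, labels) if y is not None)
    # Entropy over all student predictions (confidence maximization).
    ent = mean(shannon_entropy(p) for p in student)
    # Divergence between teacher and student predictions (distillation).
    kl = mean(kl_divergence(t, s) for t, s in zip(teacher, student))

    total = ce + lambda_ent * ent + lambda_kl * kl
    return total, (ce, ent, kl)


if __name__ == "__main__":
    teacher = [[4.0, 0.0]]   # one unlabeled pixel, teacher favors class 0
    agrees = [[4.0, 0.0]]    # confident student that matches the teacher
    opposes = [[0.0, 4.0]]   # equally confident student on the other class
    for name, student in (("agrees", agrees), ("opposes", opposes)):
        total, (ce, ent, kl) = student_loss(student, teacher, [None])
        print(f"{name}: entropy={ent:.4f} kl={kl:.4f} total={total:.4f}")
```

In the toy run, both students are equally confident, so the entropy term cannot tell them apart; only the divergence term penalizes the student that contradicts the teacher. This mirrors the abstract's remark that the KL term guides the entropy term away from trivial solutions.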

updated: Wed Nov 23 2022 06:01:49 GMT+0000 (UTC)

published: Tue Sep 21 2021 20:06:13 GMT+0000 (UTC)

arXiv
