Mixed-supervised segmentation: Confidence maximization helps knowledge distillation

Bingyuan Liu; Christian Desrosiers; Ismail Ben Ayed; Jose Dolz

Despite achieving promising results in a breadth of medical image segmentation tasks, deep neural networks require large training datasets with pixel-wise annotations. Obtaining these curated datasets is a cumbersome process that limits applicability in scenarios where annotated images are scarce. Mixed supervision is an appealing alternative for mitigating this obstacle: only a small fraction of the data contains complete pixel-wise annotations, while the other images have a weaker form of supervision. In this work, we propose a dual-branch architecture, where the upper branch (teacher) receives strong annotations, while the bottom branch (student) is driven by limited supervision and guided by the upper branch. Combined with a standard cross-entropy loss over the labeled pixels, our novel formulation integrates two important terms: (i) a Shannon entropy loss defined over the less-supervised images, which encourages confident student predictions in the bottom branch; and (ii) a Kullback-Leibler (KL) divergence term, which transfers the knowledge of the strongly supervised branch to the less-supervised branch and guides the entropy (student-confidence) term to avoid trivial solutions. We show that the synergy between the entropy and KL divergence terms yields substantial improvements in performance. We also discuss an interesting link between Shannon-entropy minimization and standard pseudo-mask generation, and argue that the former should be preferred over the latter for leveraging information from unlabeled pixels. Quantitative and qualitative results on two publicly available datasets demonstrate that our method significantly outperforms other strategies for semantic segmentation within a mixed-supervision framework, as well as recent semi-supervised approaches. Moreover, we show that the branch trained with reduced supervision and guided by the top branch largely outperforms the latter.
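
The three terms named in the abstract (cross-entropy over labeled pixels, Shannon entropy of the student predictions, and a KL divergence from teacher to student) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the flattened `(num_pixels, num_classes)` layout, the KL direction, and the weights `lambda_kl` and `lambda_ent` are all assumptions made for illustration only.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def shannon_entropy(p, eps=1e-12):
    # Mean per-pixel Shannon entropy; lower means more confident predictions.
    return float(-(p * np.log(p + eps)).sum(axis=-1).mean())

def kl_divergence(p_teacher, p_student, eps=1e-12):
    # Mean per-pixel KL(teacher || student). The direction is an assumption.
    diff = np.log(p_teacher + eps) - np.log(p_student + eps)
    return float((p_teacher * diff).sum(axis=-1).mean())

def cross_entropy(p, labels, labeled_mask, eps=1e-12):
    # Cross-entropy restricted to labeled pixels.
    # labels must hold valid class indices everywhere (e.g. 0 where unlabeled).
    if not labeled_mask.any():
        return 0.0
    picked = p[np.arange(len(labels)), labels]
    return float(-np.log(picked[labeled_mask] + eps).mean())

def student_loss(student_logits, teacher_logits, labels, labeled_mask,
                 lambda_kl=1.0, lambda_ent=1.0):
    # Hypothetical combined objective for the less-supervised (student) branch:
    # CE on labeled pixels + weighted KL to the teacher + weighted entropy.
    p_s = softmax(student_logits)
    p_t = softmax(teacher_logits)
    ce = cross_entropy(p_s, labels, labeled_mask)
    kl = kl_divergence(p_t, p_s)
    ent = shannon_entropy(p_s)
    return ce + lambda_kl * kl + lambda_ent * ent

# Toy usage: 4 pixels, 2 classes, only the first two pixels are labeled.
rng = np.random.default_rng(0)
student_logits = rng.normal(size=(4, 2))
teacher_logits = rng.normal(size=(4, 2))
labels = np.array([0, 1, 0, 0])
labeled_mask = np.array([True, True, False, False])
loss = student_loss(student_logits, teacher_logits, labels, labeled_mask)
```

In this sketch, minimizing the entropy term alone could push the student toward any confident prediction, including a degenerate one. The KL term ties the student to the teacher's distribution, which is the role the abstract attributes to it in avoiding trivial solutions.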

updated: Fri Oct 15 2021 05:02:33 GMT+0000 (UTC)

published: Tue Sep 21 2021 20:06:13 GMT+0000 (UTC)

arXiv
