Teach me to segment with mixed supervision: Confident students become masters

Jose Dolz; Christian Desrosiers; Ismail Ben Ayed

Deep segmentation neural networks require large training datasets with pixel-wise segmentations, which are expensive to obtain in practice. Mixed supervision could mitigate this difficulty: a small fraction of the data contains complete pixel-wise annotations, while the rest is less supervised, e.g., only a handful of pixels are labeled. In this work, we propose a dual-branch architecture, where the upper branch (teacher) receives strong annotations, while the bottom one (student) is driven by limited supervision and guided by the upper branch. In conjunction with a standard cross-entropy over the labeled pixels, our novel formulation integrates two important terms: (i) a Shannon entropy loss defined over the less-supervised images, which encourages confident student predictions at the bottom branch; and (ii) a Kullback-Leibler (KL) divergence, which transfers the knowledge from the predictions generated by the strongly supervised branch to the less-supervised branch, and guides the entropy (student-confidence) term to avoid trivial solutions. Interestingly, we show that the synergy between the entropy and KL divergence terms yields substantial improvements in performance. Furthermore, we discuss an interesting link between Shannon-entropy minimization and standard pseudo-mask generation, and argue that the former should be preferred over the latter for leveraging information from unlabeled pixels. Through a series of quantitative and qualitative experiments, we show the effectiveness of the proposed formulation in segmenting the left-ventricle endocardium in MRI images. We demonstrate that our method significantly outperforms other strategies for tackling semantic segmentation within a mixed-supervision framework. More interestingly, and in line with recent observations in classification, we show that the branch trained with reduced supervision largely outperforms the teacher.
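The three loss terms named in the abstract can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function names, the weights `lambda_ent` and `lambda_kl`, the direction of the KL term (teacher relative to student), and the averaging over pixels are all assumptions, since the abstract does not specify them.

```python
import numpy as np

EPS = 1e-8  # small floor inside logarithms for numerical stability


def softmax(logits, axis=-1):
    """Convert per-pixel class logits into probabilities."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def partial_cross_entropy(probs, labels, labeled_mask):
    """Cross-entropy averaged over the labeled pixels only.

    probs: (H, W, C) probabilities, labels: (H, W) integer classes,
    labeled_mask: (H, W) boolean, True where a pixel has a label.
    """
    picked = np.take_along_axis(probs, labels[..., None], axis=-1)[..., 0]
    n_labeled = max(int(labeled_mask.sum()), 1)
    return float(-(np.log(picked + EPS) * labeled_mask).sum() / n_labeled)


def shannon_entropy(probs):
    """Mean per-pixel Shannon entropy; low values mean confident predictions."""
    return float(-(probs * np.log(probs + EPS)).sum(axis=-1).mean())


def kl_divergence(teacher_probs, student_probs):
    """Mean per-pixel KL(teacher || student). The direction is an assumption."""
    log_ratio = np.log(teacher_probs + EPS) - np.log(student_probs + EPS)
    return float((teacher_probs * log_ratio).sum(axis=-1).mean())


def student_loss(student_logits, teacher_logits, labels, labeled_mask,
                 lambda_ent=0.1, lambda_kl=1.0):
    """Hypothetical weighted sum of the three terms for the student branch."""
    student_probs = softmax(student_logits)
    teacher_probs = softmax(teacher_logits)
    return (partial_cross_entropy(student_probs, labels, labeled_mask)
            + lambda_ent * shannon_entropy(student_probs)
            + lambda_kl * kl_divergence(teacher_probs, student_probs))
```

In this sketch the entropy term alone is minimized by any confident prediction, including a constant one, which is the kind of trivial solution the abstract says the KL term is meant to steer away from. In a real training loop these functions would be written in an automatic-differentiation framework so that gradients can flow to the student branch.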

updated: Tue Dec 15 2020 02:51:36 GMT+0000 (UTC)

published: Tue Dec 15 2020 02:51:36 GMT+0000 (UTC)

arXiv
