SimCVD: Simple Contrastive Voxel-Wise Representation Distillation for Semi-Supervised Medical Image Segmentation

Chenyu You; Yuan Zhou; Ruihan Zhao; Lawrence Staib; James S. Duncan

Automated segmentation in medical image analysis is a challenging task that requires a large amount of manually labeled data. However, most existing learning-based approaches suffer from a shortage of manually annotated medical data, which poses a major practical problem for accurate and robust medical image segmentation. In addition, most existing semi-supervised approaches are less robust than their supervised counterparts and lack explicit modeling of geometric structure and semantic information, both of which limit segmentation accuracy. In this work, we present SimCVD, a simple contrastive distillation framework that significantly advances state-of-the-art voxel-wise representation learning. We first describe an unsupervised training strategy that takes two views of an input volume and predicts their signed distance maps of object boundaries under a contrastive objective, using only two independent dropout masks to produce the views. This simple approach works surprisingly well, performing on par with previous fully supervised methods while using much less labeled data. We hypothesize that dropout can be viewed as a minimal form of data augmentation and that it makes the network robust to representation collapse. We then propose to perform structural distillation by distilling pair-wise similarities. We evaluate SimCVD on two popular datasets: the Left Atrial Segmentation Challenge (LA) and the NIH pancreas CT dataset. On the LA dataset, with labeled ratios of 20% and 10%, SimCVD achieves average Dice scores of 90.85% and 89.03%, respectively, improvements of 0.91% and 2.22% over the previous best results. Our method can be trained in an end-to-end fashion, showing the promise of utilizing SimCVD as a general framework for downstream tasks such as medical image synthesis, enhancement, and registration.
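The idea of using two independent dropout masks as the only source of "views" can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state which contrastive loss is used, so an InfoNCE-style loss is assumed here, and the helper names (`dropout`, `cosine`, `info_nce`), the temperature, the dropout rate, and the toy feature vectors are all hypothetical. In the paper the contrasted quantities are predicted signed distance maps; plain per-voxel feature vectors stand in for them below.

```python
import math
import random


def dropout(features, rate, rng):
    """Inverted dropout: zero each entry with probability `rate`, rescale the rest."""
    keep = 1.0 - rate
    return [x / keep if rng.random() < keep else 0.0 for x in features]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0.0 else 0.0


def info_nce(view_a, view_b, temperature=0.1):
    """Mean InfoNCE loss: view_a[i] should match view_b[i] and no other view_b[j].

    Assumption: InfoNCE is used only as a representative contrastive objective.
    """
    losses = []
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


rng = random.Random(0)
# Hypothetical per-voxel feature vectors (4 voxels, 8 channels each).
features = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(4)]
# Two views of the same input that differ only in their dropout masks.
view_a = [dropout(f, 0.2, rng) for f in features]
view_b = [dropout(f, 0.2, rng) for f in features]
loss = info_nce(view_a, view_b)
```

Each voxel's two dropout views form the positive pair, and the other voxels in the batch act as negatives. Because the positive pair always contributes to the denominator, the loss is non-negative and shrinks as the two views of the same voxel become more similar than views of different voxels.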

updated: Mon Jan 17 2022 17:08:26 GMT+0000 (UTC)

published: Fri Aug 13 2021 13:17:58 GMT+0000 (UTC)

arXiv
