About Explicit Variance Minimization: Training Neural Networks for Medical Imaging With Limited Data Annotations

Dmitrii Shubin; Danny Eytan; Sebastian D. Goodfellow

Self-supervised learning methods for computer vision have demonstrated the effectiveness of pre-training feature representations, resulting in well-generalizing Deep Neural Networks even when annotated data are limited. However, representation learning techniques require a significant amount of time for model training, with most of that time spent on precise hyper-parameter optimization and selection of augmentation techniques. We hypothesized that if the annotated dataset has enough morphological diversity to capture that of the general population, as is common in medical imaging (for example, due to conserved similarities of tissue morphologies), the variance error of the trained model is the prevalent component of the Bias-Variance Trade-off. We propose the Variance Aware Training (VAT) method, which exploits this property by introducing the variance error into the model loss function, i.e., by minimizing the variance explicitly. Additionally, we provide the theoretical formulation and proof of the proposed method to aid in interpreting the approach. Our method requires selecting only one hyper-parameter and was able to match or improve on the state-of-the-art performance of self-supervised methods while achieving an order-of-magnitude reduction in GPU training time. We validated VAT on three medical imaging datasets from diverse domains and with various learning objectives. These included a Magnetic Resonance Imaging (MRI) dataset for semantic segmentation of the heart (MICCAI 2017 ACDC challenge), a fundus photography dataset for ordinal regression of diabetic retinopathy progression (Kaggle 2019 APTOS Blindness Detection challenge), and a dataset of histopathologic scans of lymph node sections for classification (PatchCamelyon dataset).
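The abstract does not say how the variance error is estimated or how it enters the loss, so the following is only a minimal sketch of the general idea and not the authors' VAT formulation. It assumes predictions from several models trained on different resamples of the data, uses the standard decomposition of mean squared error into squared bias plus variance (for noise-free targets), and adds the variance estimate to a task loss with a single weight. The names `bias_variance`, `variance_aware_objective`, and `lam` are hypothetical.

```python
import numpy as np


def bias_variance(predictions, target):
    """Estimate squared bias and variance of an ensemble of predictions.

    predictions: array of shape (n_models, n_points), one row per model
                 trained on a different resample (an assumption of this sketch).
    target:      array of shape (n_points,), assumed noise-free.
    """
    mean_pred = predictions.mean(axis=0)
    # Squared bias: how far the average prediction is from the target.
    bias_sq = np.mean((mean_pred - target) ** 2)
    # Variance: spread of individual models around the average prediction
    # (population variance, ddof=0, so the decomposition below is exact).
    variance = np.mean(predictions.var(axis=0))
    return bias_sq, variance


def variance_aware_objective(task_loss, variance_estimate, lam=1.0):
    """Hypothetical objective: task loss plus a weighted variance penalty.

    `lam` stands in for the single hyper-parameter mentioned in the abstract;
    the actual VAT loss may be defined differently.
    """
    return task_loss + lam * variance_estimate


if __name__ == "__main__":
    preds = np.array([[1.0, 2.0], [3.0, 2.0]])  # two models, two points
    target = np.array([2.0, 2.0])
    bias_sq, variance = bias_variance(preds, target)
    mse = np.mean((preds - target) ** 2)
    # With noise-free targets, mse equals bias_sq + variance.
    print(bias_sq, variance, mse)
    print(variance_aware_objective(mse, variance, lam=0.1))
```

In this toy case the average prediction matches the target exactly, so all of the error comes from variance, which is the regime the abstract hypothesizes for sufficiently diverse medical imaging datasets.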

updated: Wed Jul 14 2021 17:43:48 GMT+0000 (UTC)

published: Fri May 28 2021 21:34:04 GMT+0000 (UTC)

arXiv
