A Quantitative Comparison of Epistemic Uncertainty Maps Applied to Multi-Class Segmentation

Robin Camarasa; Daniel Bos; Jeroen Hendrikse; Paul Nederkoorn; M. Eline Kooi; Aad van der Lugt; Marleen de Bruijne

Uncertainty assessment has rapidly gained interest in medical image analysis. A popular technique for computing epistemic uncertainty is Monte-Carlo (MC) dropout. From a network with MC dropout and a single input, multiple outputs can be sampled, and various methods can be used to obtain epistemic uncertainty maps from those outputs. In multi-class segmentation the number of methods is even larger, as epistemic uncertainty can be computed voxelwise per class or voxelwise per image. This paper presents a systematic approach to define and quantitatively compare those methods in two contexts: class-specific epistemic uncertainty maps (one value per image, voxel and class) and combined epistemic uncertainty maps (one value per image and voxel). We applied this quantitative analysis to a multi-class segmentation of the carotid artery lumen and vessel wall, on a multi-center, multi-scanner, multi-sequence dataset of magnetic resonance (MR) images. We validated our analysis over 144 sets of model hyperparameters. Our main analysis considers the relationship between the ordering of the voxels sorted by their epistemic uncertainty values and the misclassification of the prediction. Under this criterion, the comparison of combined uncertainty maps reveals that the multi-class entropy and the multi-class mutual information statistically outperform the other combined uncertainty maps under study. In the class-specific scenario, the one-versus-all entropy statistically outperforms the class-wise entropy, the class-wise variance and the one-versus-all mutual information. The class-wise entropy statistically outperforms the other class-specific uncertainty maps in terms of calibration. We made a Python package available to reproduce our analysis on different data and tasks.
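
As an illustration of the two combined maps named in the abstract, the following is a minimal NumPy sketch, not the authors' package. It assumes the commonly used definitions: multi-class entropy as the entropy of the mean softmax output over the MC dropout samples, and multi-class mutual information as that entropy minus the mean entropy of the individual samples. The paper's exact definitions may differ. The function name `combined_uncertainty_maps`, the array layout `(T, C, ...)` and the synthetic input are assumptions made for this example.

```python
import numpy as np


def combined_uncertainty_maps(probs, eps=1e-12):
    """Compute two combined epistemic uncertainty maps from MC dropout samples.

    probs: array of shape (T, C, ...) holding T sampled softmax outputs
           over C classes for every voxel (hypothetical layout).
    Returns (entropy, mutual_information), each with the voxel shape (...).
    """
    probs = np.asarray(probs, dtype=float)

    # Mean predicted class probabilities over the T samples: shape (C, ...)
    mean_probs = probs.mean(axis=0)

    # Entropy of the mean prediction (assumed "multi-class entropy")
    entropy = -(mean_probs * np.log(mean_probs + eps)).sum(axis=0)

    # Mean over samples of the per-sample entropy
    expected_entropy = -(probs * np.log(probs + eps)).sum(axis=1).mean(axis=0)

    # Assumed "multi-class mutual information": disagreement between samples
    mutual_information = entropy - expected_entropy
    return entropy, mutual_information


# Synthetic stand-in for network outputs: 20 samples, 3 classes, 4x4 voxels.
rng = np.random.default_rng(0)
samples = np.moveaxis(rng.dirichlet(np.ones(3), size=(20, 4, 4)), -1, 1)
entropy_map, mi_map = combined_uncertainty_maps(samples)
```

Under these assumed definitions, each map holds one value per voxel, which matches the "one value per image and voxel" description of combined maps. Voxels where the sampled outputs disagree get a high mutual information, while voxels where every sample is equally unsure get a high entropy but a low mutual information.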

updated: Wed Sep 22 2021 12:48:19 GMT+0000 (UTC)

published: Wed Sep 22 2021 12:48:19 GMT+0000 (UTC)

arXiv
