Comparative Evaluation of 3D and 2D Deep Learning Techniques for Semantic Segmentation in CT Scans

Abhishek Shivdeo; Rohit Lokwani; Viraj Kulkarni; Amit Kharat; Aniruddha Pant

Image segmentation plays a pivotal role in several medical-imaging applications by delineating regions of interest. Deep learning-based approaches have been widely adopted for semantic segmentation of medical data. In recent years, 3D architectures have been employed alongside 2D deep learning architectures as predictive algorithms for 3D medical image data. In this paper, we propose a 3D stack-based deep learning technique for segmenting manifestations of consolidation and ground-glass opacities in 3D Computed Tomography (CT) scans. We compare this 3D technique with a traditional 2D deep learning technique in terms of segmentation results, contextual information retained, and inference time. We also define the area-plot, which represents the characteristic pattern observed in the slice-wise areas of the pathology regions predicted by these deep learning models. In our exhaustive evaluation, the 3D technique performs better than the 2D technique for the segmentation of CT scans: we obtain Dice scores of 79% and 73% for the 3D and 2D techniques, respectively. The 3D technique also yields a 5x reduction in inference time compared to the 2D technique. The results further show that the area-plots predicted by the 3D model are more similar to the ground truth than those predicted by the 2D model. Finally, we show how increasing the amount of contextual information retained during training can improve the 3D model's performance.
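The two quantities the abstract relies on, the Dice score and the slice-wise area-plot, can be illustrated with a minimal NumPy sketch. This is not the authors' code. It assumes binary masks stored as arrays of shape (slices, height, width) with the slice axis first, and it assumes the area of a slice is its count of foreground pixels; the abstract does not give the exact definition. The function names `dice_score` and `area_plot` are hypothetical.

```python
import numpy as np


def dice_score(pred, truth):
    """Dice coefficient 2*|A & B| / (|A| + |B|) between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        # Both masks are empty: treat as perfect agreement (a convention, assumed here).
        return 1.0
    intersection = np.logical_and(pred, truth).sum()
    return 2.0 * intersection / total


def area_plot(mask):
    """Foreground pixel count per slice, assuming axis 0 indexes the slices."""
    mask = np.asarray(mask, dtype=bool)
    return mask.reshape(mask.shape[0], -1).sum(axis=1)


# Toy volume: 4 slices of 8x8 pixels, with a 4x4 region in the two middle slices.
truth = np.zeros((4, 8, 8), dtype=bool)
truth[1:3, 2:6, 2:6] = True

# Toy prediction: the same region shifted one pixel to the right.
pred = np.zeros((4, 8, 8), dtype=bool)
pred[1:3, 2:6, 3:7] = True

score = dice_score(pred, truth)   # 2 * 24 / (32 + 32) = 0.75
truth_areas = area_plot(truth)    # [0, 16, 16, 0]
pred_areas = area_plot(pred)      # [0, 16, 16, 0]
```

In this toy case the two area-plots are identical even though the Dice score is only 0.75, which shows that the two measures capture different things: the area-plot tracks how much pathology is predicted in each slice, not where it lies within the slice.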

updated: Tue Jan 19 2021 13:23:43 GMT+0000 (UTC)

published: Tue Jan 19 2021 13:23:43 GMT+0000 (UTC)

arXiv
