CovSegNet: A Multi Encoder-Decoder Architecture for Improved Lesion Segmentation of COVID-19 Chest CT Scans

Tanvir Mahmud; Md Awsafur Rahman; Shaikh Anowarul Fattah; Sun-Yuan Kung

Automatic segmentation of lung lesions in chest CT scans is considered a pivotal step towards accurate diagnosis and severity measurement of COVID-19. The traditional U-shaped encoder-decoder architecture and its variants lose contextual information in their pooling/upsampling operations, leave increased semantic gaps between encoded and decoded feature maps, and are prone to vanishing gradients because gradients propagate sequentially, all of which results in sub-optimal performance. Moreover, operating on 3D CT volumes poses further limitations, since the exponential increase in computational complexity makes optimization difficult. In this paper, an automated COVID-19 lesion segmentation scheme is proposed that uses a highly efficient neural network architecture, namely CovSegNet, to overcome these limitations. Additionally, a two-phase training scheme is introduced in which a deeper 2D network generates an ROI-enhanced CT volume, followed by a shallower 3D network that further enhances it with more contextual information without increasing the computational burden. Along with the traditional vertical expansion of Unet, we introduce horizontal expansion with multi-stage encoder-decoder modules to achieve optimum performance. Additionally, multi-scale feature maps are integrated into the scale transition process to overcome the loss of contextual information. Moreover, a multi-scale fusion module with a pyramid fusion scheme is introduced to reduce the semantic gaps between subsequent encoder/decoder modules while facilitating parallel optimization for efficient gradient propagation. The scheme is evaluated on three publicly available datasets, where it largely outperforms other state-of-the-art approaches. The proposed scheme can be easily extended to achieve optimum segmentation performance in a wide variety of applications.
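
The two-phase scheme described above can be illustrated with a minimal, pure-Python sketch of the data flow. It is not the authors' implementation: `toy_2d_stage` and `toy_3d_stage` are hypothetical stand-ins for trained networks, and the way the ROI-enhanced volume is formed here (weighting each voxel by its 2D score) is an assumption, since the abstract does not specify it.

```python
from typing import Callable, List

Slice = List[List[float]]   # one 2D CT slice (rows x cols), intensities in [0, 1]
Volume = List[Slice]        # a 3D CT volume (depth x rows x cols)


def toy_2d_stage(ct_slice: Slice) -> Slice:
    """Hypothetical stand-in for the deeper 2D network.

    Returns a per-pixel ROI score in [0, 1]. Here it is a plain intensity
    threshold, not a trained model.
    """
    return [[1.0 if px > 0.5 else 0.0 for px in row] for row in ct_slice]


def toy_3d_stage(volume: Volume) -> Volume:
    """Hypothetical stand-in for the shallower 3D network.

    Averages each voxel with its neighbours along the depth axis, so every
    output value depends on adjacent slices (inter-slice context).
    """
    depth = len(volume)
    out: Volume = []
    for z in range(depth):
        lo, hi = max(0, z - 1), min(depth, z + 2)
        n = hi - lo
        out.append([
            [sum(volume[k][r][c] for k in range(lo, hi)) / n
             for c in range(len(volume[z][r]))]
            for r in range(len(volume[z]))
        ])
    return out


def two_phase_segment(
    volume: Volume,
    stage_2d: Callable[[Slice], Slice] = toy_2d_stage,
    stage_3d: Callable[[Volume], Volume] = toy_3d_stage,
    threshold: float = 0.5,
) -> List[List[List[int]]]:
    """Run the 2D stage slice by slice, then the 3D stage on the whole volume."""
    # Phase 1: the 2D stage sees one slice at a time.
    roi_scores = [stage_2d(ct_slice) for ct_slice in volume]

    # ROI enhancement (assumption): weight each voxel by its 2D ROI score.
    enhanced = [
        [[px * w for px, w in zip(row, score_row)]
         for row, score_row in zip(ct_slice, score_slice)]
        for ct_slice, score_slice in zip(volume, roi_scores)
    ]

    # Phase 2: the 3D stage refines the enhanced volume using depth context.
    refined = stage_3d(enhanced)

    # Binarise the refined scores into a lesion mask.
    return [[[1 if v > threshold else 0 for v in row] for row in s] for s in refined]


# Toy volume: 3 slices of 2x2 pixels. The top-left pixel is bright in every
# slice; the bottom-right pixel is bright in the first slice only.
volume = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.9, 0.1], [0.2, 0.1]],
    [[0.9, 0.1], [0.2, 0.1]],
]
mask = two_phase_segment(volume)
print(mask)
```

In this toy run, the 2D stage alone would flag the bottom-right pixel of the first slice, but the depth averaging in the 3D stage suppresses it because it does not persist across slices; only the top-left pixel remains in the final mask. The point is the split of work: per-slice processing first, then a volume-level pass that adds inter-slice context.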

updated: Wed Dec 02 2020 19:26:35 GMT+0000 (UTC)

published: Wed Dec 02 2020 19:26:35 GMT+0000 (UTC)

arXiv
