Mutual Attention-based Hybrid Dimensional Network for Multimodal Imaging Computer-aided Diagnosis

Yin Dai; Yifan Gao; Fayu Liu; Jun Fu

Recent work on multimodal 3D computer-aided diagnosis has shown that obtaining a competitive automatic diagnosis model remains nontrivial and challenging: 3D convolutional neural networks (CNNs) introduce more parameters, while medical images are scarce. Considering both the consistency of regions of interest across multimodal images and diagnostic accuracy, we propose a novel mutual attention-based hybrid dimensional network for MultiModal 3D medical image classification (MMNet). The hybrid dimensional network integrates a 2D CNN with 3D convolution modules to generate deeper and more informative feature maps and to reduce the training complexity of 3D fusion. In addition, an ImageNet pre-trained model can be used in the 2D CNN, which improves the performance of the model. The stereoscopic attention focuses on building rich contextual interdependencies of regions in 3D medical images. To improve the regional correlation of pathological tissues in multimodal medical images, we further design a mutual attention framework in the network that builds region-wise consistency in similar stereoscopic regions of different image modalities, providing an implicit way to direct the network to focus on pathological tissues. MMNet outperforms many previous solutions and achieves results competitive with the state of the art on three multimodal imaging datasets: the Parotid Gland Tumor (PGT) dataset, the MRNet dataset, and the PROSTATEx dataset. Its advantages are validated by extensive experiments.
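
The abstract does not give the formulation of the mutual attention framework, so the following is only a minimal sketch of the general idea of two modalities attending to each other's regions. It substitutes plain scaled dot-product cross-attention with a residual sum, which is an assumption and not necessarily what MMNet does. The function names (`cross_attend`, `mutual_attention`) and the toy feature vectors are hypothetical; each inner list stands for the feature vector of one stereoscopic region.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_attend(query_regions, context_regions):
    """For each query region, return a softmax-weighted average of the
    context regions (scaled dot-product attention, an assumed stand-in)."""
    scale = math.sqrt(len(query_regions[0]))
    dim = len(context_regions[0])
    attended = []
    for q in query_regions:
        weights = softmax([dot(q, k) / scale for k in context_regions])
        attended.append(
            [sum(w * k[d] for w, k in zip(weights, context_regions))
             for d in range(dim)]
        )
    return attended


def mutual_attention(feats_a, feats_b):
    """Each modality attends to the other; the attended features are
    added back to the originals as a residual (assumed design)."""
    a_from_b = cross_attend(feats_a, feats_b)
    b_from_a = cross_attend(feats_b, feats_a)
    out_a = [[x + y for x, y in zip(f, g)] for f, g in zip(feats_a, a_from_b)]
    out_b = [[x + y for x, y in zip(f, g)] for f, g in zip(feats_b, b_from_a)]
    return out_a, out_b


# Toy region features for two hypothetical modalities (3 regions, 2 channels).
feats_mod1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
feats_mod2 = [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]]
out_mod1, out_mod2 = mutual_attention(feats_mod1, feats_mod2)
```

In this sketch both outputs keep the shape of their inputs, so they could in principle be passed on to later layers unchanged. A real implementation would operate on learned 3D feature maps with trainable projections rather than on raw lists.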

updated: Mon Jan 24 2022 02:31:25 GMT+0000 (UTC)

published: Mon Jan 24 2022 02:31:25 GMT+0000 (UTC)

arXiv
