3D Axial-Attention for Lung Nodule Classification

Mundher Al-Shabi; Kelvin Shak; Maxine Tan

Purpose: In recent years, Non-Local based methods have been successfully applied to lung nodule classification. However, these methods offer only 2D attention, or limited 3D attention applied to low-resolution feature maps. Moreover, they still depend on a local filter such as convolution, because full 3D attention is expensive to compute and requires a large dataset, which might not be available. Methods: We propose to use 3D Axial-Attention, which requires a fraction of the computing power of a regular Non-Local network (i.e., self-attention). Unlike a regular Non-Local network, the 3D Axial-Attention network applies the attention operation to each axis separately. Additionally, we address the position-invariance problem of the Non-Local network by adding 3D positional encoding to shared embeddings. Results: We validated the proposed method on 442 benign nodules and 406 malignant nodules extracted from the public LIDC-IDRI dataset, following a rigorous experimental setup that uses only nodules annotated by at least three radiologists. Our results show that the 3D Axial-Attention model achieves state-of-the-art performance on all evaluation metrics, including AUC and accuracy. Conclusions: The proposed model provides full 3D attention, whereby every element (i.e., voxel) in the 3D volume effectively attends to every other element in the nodule. Thus, the 3D Axial-Attention network can be used in all layers without the need for local filters. The experimental results show the importance of full 3D attention for classifying lung nodules.
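The per-axis attention described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the single attention head, the residual connections, the sequential depth-height-width order, the additive per-axis positional terms, and all names (`attend_along_axis`, `axial_attention_3d`, `attention_pairs`, `make_params`) are assumptions made for illustration, and the 32x32x32 volume in the cost comparison is a hypothetical size.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def attend_along_axis(x, axis, wq, wk, wv):
    """Single-head self-attention restricted to one spatial axis.

    x has shape (D, H, W, C); wq, wk, wv have shape (C, C).
    """
    xm = np.moveaxis(x, axis, -2)            # (..., L, C), L = length of `axis`
    q, k, v = xm @ wq, xm @ wk, xm @ wv
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])  # (..., L, L)
    out = softmax(scores, axis=-1) @ v       # (..., L, C)
    return np.moveaxis(out, -2, axis)        # back to (D, H, W, C)


def axial_attention_3d(x, pos, weights):
    """Add per-axis positional terms, then attend along D, H and W in turn."""
    pos_d, pos_h, pos_w = pos
    # Assumed form of the 3D positional encoding: one vector per index on
    # each axis, broadcast over the other two axes and summed.
    x = (x + pos_d[:, None, None, :]
           + pos_h[None, :, None, :]
           + pos_w[None, None, :, :])
    for axis, (wq, wk, wv) in enumerate(weights):
        x = x + attend_along_axis(x, axis, wq, wk, wv)  # residual (assumption)
    return x


def attention_pairs(d, h, w):
    """Query-key pairs scored by full 3D attention vs. one pass over all axes."""
    n = d * h * w
    return n * n, n * (d + h + w)


def make_params(shape, channels, seed=0):
    """Random stand-ins for learned positional vectors and projections."""
    rng = np.random.default_rng(seed)
    pos = tuple(0.1 * rng.standard_normal((n, channels)) for n in shape)
    weights = [
        tuple(0.3 * rng.standard_normal((channels, channels)) for _ in range(3))
        for _ in shape
    ]
    return pos, weights


volume = np.random.default_rng(1).standard_normal((4, 5, 6, 8))
pos, weights = make_params(volume.shape[:3], channels=8)
out = axial_attention_3d(volume, pos, weights)
print(out.shape)                    # (4, 5, 6, 8)
print(attention_pairs(32, 32, 32))  # (1073741824, 3145728)
```

In this sketch each pass mixes information along one axis only, so after the three passes every output position has received information from every input position, while the number of scored query-key pairs grows with N(D + H + W) rather than N² for N = D·H·W.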

updated: Tue Jun 01 2021 10:43:26 GMT+0000 (UTC)

published: Mon Dec 28 2020 06:49:09 GMT+0000 (UTC)

arXiv
