Transformer Utilization in Medical Image Segmentation Networks

Saikat Roy; Gregor Koehler; Michael Baumgartner; Constantin Ulrich; Jens Petersen; Fabian Isensee; Klaus Maier-Hein

Owing to their success in the data-rich domain of natural images, Transformers have recently become popular in medical image segmentation. However, the pairing of Transformers with convolutional blocks in varying architectural permutations leaves their relative effectiveness open to interpretation. To quantify this effectiveness, we introduce Transformer Ablations that replace the Transformer blocks with plain linear operators. With experiments on 8 models on 2 medical image segmentation tasks, we explore: 1) the replaceable nature of Transformer-learnt representations; 2) that Transformer capacity alone cannot prevent representational replaceability and must work in tandem with effective design; 3) that the mere existence of explicit feature hierarchies in Transformer blocks is more beneficial than the accompanying self-attention modules; and 4) that major spatial downsampling before Transformer modules should be used with caution.
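The ablation idea in the abstract, swapping each Transformer block for a plain linear operator while leaving the rest of the network untouched, can be illustrated with a toy sketch. This is not the authors' implementation: the names `toy_conv`, `toy_self_attention`, `LinearOperator`, and `ablate_transformer_blocks` are hypothetical, the attention has no learned projections, and nothing is trained. It only shows the structural swap on a sequence of token vectors.

```python
import math
import random


def toy_conv(tokens):
    """Hypothetical stand-in for a convolutional block: average each token with its neighbours."""
    out = []
    for i in range(len(tokens)):
        window = tokens[max(0, i - 1): i + 2]
        out.append([sum(col) / len(window) for col in zip(*window)])
    return out


def toy_self_attention(tokens):
    """Hypothetical stand-in for a Transformer block: single-head attention, identity projections."""
    dim = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in tokens]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([
            sum(w / total * v[j] for w, v in zip(weights, tokens))
            for j in range(dim)
        ])
    return out


class LinearOperator:
    """Per-token linear map y = W x + b, the ablation replacement (untrained, random W)."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        self.weight = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(dim)]
        self.bias = [0.0] * dim

    def __call__(self, tokens):
        return [
            [sum(w * x for w, x in zip(row, tok)) + b
             for row, b in zip(self.weight, self.bias)]
            for tok in tokens
        ]


def ablate_transformer_blocks(blocks, dim):
    """Return a new block list with every attention block replaced by a linear operator."""
    return [LinearOperator(dim) if blk is toy_self_attention else blk for blk in blocks]


def run(blocks, tokens):
    """Apply the blocks in order to a list of token vectors."""
    for blk in blocks:
        tokens = blk(tokens)
    return tokens
```

In an actual study of this kind, the original and ablated networks would each be trained and their segmentation scores compared; a small gap would suggest that the Transformer-learnt representations are replaceable. The sketch above stops at the architectural swap and makes no claim about results.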

updated: Sun Apr 09 2023 12:35:22 GMT+0000 (UTC)

published: Sun Apr 09 2023 12:35:22 GMT+0000 (UTC)

arXiv
