3DSAM-adapter: Holistic Adaptation of SAM from 2D to 3D for Promptable Medical Image Segmentation

Shizhan Gong; Yuan Zhong; Wenao Ma; Jinpeng Li; Zhao Wang; Jingyang Zhang; Pheng-Ann Heng; Qi Dou

Although the Segment Anything Model (SAM) has achieved impressive results on general-purpose semantic segmentation, with strong generalization ability on everyday images, its demonstrated performance on medical image segmentation is less precise and less stable, especially on tumor segmentation tasks that involve objects of small size, irregular shape, and low contrast. Notably, the original SAM architecture is designed for 2D natural images and therefore cannot effectively extract 3D spatial information from volumetric medical data. In this paper, we propose a novel adaptation method for transferring SAM from 2D to 3D for promptable medical image segmentation. Through a holistically designed scheme for architecture modification, we adapt SAM to support volumetric inputs while retaining the majority of its pre-trained parameters for reuse. Fine-tuning is conducted in a parameter-efficient manner: most of the pre-trained parameters remain frozen, and only a few lightweight spatial adapters are introduced and tuned. Despite the domain gap between natural and medical data and the disparity in spatial arrangement between 2D and 3D, the transformer trained on natural images can effectively capture the spatial patterns present in volumetric medical images with only lightweight adaptations. We conduct experiments on four open-source tumor segmentation datasets. With a single click prompt, our model outperforms domain state-of-the-art medical image segmentation models on 3 out of 4 tasks, specifically by 8.25%, 29.87%, and 10.11% for kidney tumor, pancreas tumor, and colon cancer segmentation, respectively, and achieves similar performance on liver tumor segmentation. We also compare our adaptation method with existing popular adapters and observe significant performance improvements on most datasets.
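To give a rough sense of what "parameter-efficient" can mean here, the sketch below counts parameters for a frozen transformer encoder and for small adapters added to each block. Everything in it is an illustrative assumption rather than the authors' configuration: the abstract does not specify the adapter layout, so the sketch assumes a bottleneck adapter (down-projection, depth-wise 3D convolution, up-projection), a ViT-B-sized encoder (width 768, 12 blocks), a hypothetical bottleneck width of 384, and two adapters per block. It also ignores embeddings, the prompt encoder, and the mask decoder.

```python
def linear_params(d_in, d_out, bias=True):
    """Parameter count of a fully connected layer."""
    return d_in * d_out + (d_out if bias else 0)


def transformer_block_params(dim, mlp_ratio=4):
    """Parameters of one standard ViT block (attention + MLP + 2 LayerNorms)."""
    attn = linear_params(dim, 3 * dim) + linear_params(dim, dim)
    mlp = linear_params(dim, mlp_ratio * dim) + linear_params(mlp_ratio * dim, dim)
    norms = 2 * (2 * dim)  # scale and shift for each LayerNorm
    return attn + mlp + norms


def adapter_params(dim, bottleneck, kernel=3):
    """Parameters of an assumed bottleneck adapter with a depth-wise 3D conv."""
    down = linear_params(dim, bottleneck)
    # Depth-wise 3D conv: one kernel^3 filter plus one bias per channel.
    conv = bottleneck * kernel ** 3 + bottleneck
    up = linear_params(bottleneck, dim)
    return down + conv + up


def trainable_fraction(dim=768, depth=12, bottleneck=384, adapters_per_block=2):
    """Share of encoder parameters that are tuned when the blocks stay frozen."""
    frozen = depth * transformer_block_params(dim)
    tuned = depth * adapters_per_block * adapter_params(dim, bottleneck)
    return tuned / (frozen + tuned)


if __name__ == "__main__":
    print(f"frozen params per block:  {transformer_block_params(768):,}")
    print(f"tuned params per adapter: {adapter_params(768, 384):,}")
    print(f"trainable fraction:       {trainable_fraction():.1%}")
```

Under these assumed sizes, the depth-wise 3D convolution that supplies the spatial mixing across the third dimension is a very small part of each adapter; most of the tuned parameters sit in the two projections, so the bottleneck width is the main lever on the trainable fraction.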

updated: Fri Jun 23 2023 12:09:52 GMT+0000 (UTC)

published: Fri Jun 23 2023 12:09:52 GMT+0000 (UTC)

arXiv
