Generalist Vision Foundation Models for Medical Imaging: A Case Study of Segment Anything Model on Zero-Shot Medical Segmentation

Peilun Shi; Jianing Qiu; Sai Mu Dalike Abaxi; Hao Wei; Frank P. -W. Lo; Wu Yuan

We examine the recent Segment Anything Model (SAM) on medical images, and report both quantitative and qualitative zero-shot segmentation results on nine medical image segmentation benchmarks, covering various imaging modalities, such as optical coherence tomography (OCT), magnetic resonance imaging (MRI), and computed tomography (CT), as well as different applications including dermatology, ophthalmology, and radiology. Our experiments reveal that while SAM demonstrates stunning segmentation performance on images from the general domain, its zero-shot segmentation performance on out-of-distribution images, e.g., medical images, is still limited. Furthermore, SAM's zero-shot segmentation performance varies across different unseen medical domains. For example, it achieves a mean Dice score of 0.8704 when segmenting the under-Bruch's membrane layer in retinal OCT, whereas the score drops to 0.0688 when segmenting the retinal pigment epithelium. For certain structured targets, e.g., blood vessels, the zero-shot segmentation of SAM failed completely, whereas simple fine-tuning with a small amount of data could lead to remarkable improvements in segmentation quality. Our study indicates the versatility of generalist vision foundation models in solving specific tasks in medical imaging, and their great potential to achieve the desired performance through fine-tuning and eventually tackle the challenges of accessing large, diverse medical datasets and the complexity of medical domains.
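The abstract reports results as mean Dice scores. As a rough illustration of what that metric measures, the sketch below computes the standard Dice coefficient, 2|A ∩ B| / (|A| + |B|), for binary masks and averages it over several cases. This is not the authors' evaluation code: the function names `dice_score` and `mean_dice`, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are all assumptions made for this example.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A n B| / (|A| + |B|) for two binary masks.

    Masks are nested lists of 0/1 values with identical shape
    (a hypothetical format chosen to keep the sketch dependency-free).
    """
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0
    pred_area = 0
    target_area = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            pred_area += 1 if p else 0
            target_area += 1 if t else 0
            intersection += 1 if (p and t) else 0

    denominator = pred_area + target_area
    if denominator == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / denominator


def mean_dice(pairs):
    """Average Dice over an iterable of (prediction, ground_truth) pairs."""
    scores = [dice_score(pred, target) for pred, target in pairs]
    if not scores:
        raise ValueError("need at least one mask pair")
    return sum(scores) / len(scores)


# Toy masks: 2 overlapping pixels, 3 foreground pixels in each mask.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
score = dice_score(pred, target)  # 2 * 2 / (3 + 3)
```

Under this definition, a score near 0.87 means the predicted and reference regions largely coincide, while a score near 0.07 means they barely overlap.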

updated: Tue Apr 25 2023 08:07:59 GMT+0000 (UTC)

published: Tue Apr 25 2023 08:07:59 GMT+0000 (UTC)

arXiv
