Ariadne's Thread: Using Text Prompts to Improve Segmentation of Infected Areas from Chest X-ray images

Yi Zhong; Mengqiu Xu; Kongming Liang; Kaixin Chen; Ming Wu

Segmentation of the infected areas of the lung is essential for quantifying the severity of lung diseases such as pulmonary infections. Almost all existing medical image segmentation methods are uni-modal and rely on images alone. However, these image-only methods tend to produce inaccurate results unless trained with large amounts of annotated data. To overcome this challenge, we propose a language-driven segmentation method that uses text prompts to improve the segmentation result. Experiments on the QaTa-COV19 dataset indicate that our method improves the Dice score by at least 6.09% compared to the uni-modal methods. In addition, our extended study reveals the flexibility of multi-modal methods with respect to the information granularity of the text, and demonstrates that multi-modal methods have a significant advantage over image-only methods in the amount of training data required.
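For readers unfamiliar with the metric quoted above, the Dice score between a predicted mask and a ground-truth mask is conventionally defined as twice the overlap divided by the total number of foreground pixels in both masks. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `dice_score`, the nested-list mask format, and the choice to return 1.0 when both masks are empty are assumptions made for illustration.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks.

    Masks are nested lists of 0/1 values (an illustrative format,
    not the one used in the paper).
    """
    # Flatten both masks and coerce every entry to 0 or 1.
    pred_flat = [int(bool(v)) for row in pred for v in row]
    target_flat = [int(bool(v)) for row in target for v in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels marked as foreground in both masks.
    intersection = sum(p * t for p, t in zip(pred_flat, target_flat))
    total = sum(pred_flat) + sum(target_flat)

    # Assumption: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy 2x3 masks: 2 overlapping pixels, 3 foreground pixels in each mask.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
score = dice_score(pred, target)  # 2 * 2 / (3 + 3)
```

A score of 1.0 means the predicted infected region matches the annotation exactly, and 0.0 means the two regions do not overlap at all.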

updated: Sat Jul 08 2023 09:36:17 GMT+0000 (UTC)

published: Sat Jul 08 2023 09:36:17 GMT+0000 (UTC)

arXiv
