MedCLIP: Contrastive Learning from Unpaired Medical Images and Text

Zifeng Wang; Zhenbang Wu; Dinesh Agarwal; Jimeng Sun

Existing vision-text contrastive learning methods such as CLIP aim to match paired image and caption embeddings while pushing others apart, which improves representation transferability and supports zero-shot prediction. However, medical image-text datasets are orders of magnitude smaller than the general image-caption datasets collected from the internet. Moreover, previous methods encounter many false negatives: images and reports from separate patients may carry the same semantics yet are wrongly treated as negatives. In this paper, we decouple images and texts for multimodal contrastive learning, thereby scaling the usable training data combinatorially at low cost. We also propose replacing the InfoNCE loss with a semantic matching loss based on medical knowledge to eliminate false negatives in contrastive learning. We show that MedCLIP is a simple yet effective framework: it outperforms state-of-the-art methods on zero-shot prediction, supervised classification, and image-text retrieval. Surprisingly, we observe that with only 20K pre-training data, MedCLIP outperforms the state-of-the-art method (which uses around 200K data). Our code is available at https://github.com/RyanWangZf/MedCLIP.
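The abstract does not spell out the loss, so the following is only a minimal sketch of one way a semantic matching loss over decoupled images and texts could look. It assumes each image and each text carries a multi-hot vector of medical findings (hypothetical `img_labels` and `txt_labels`), builds soft targets from the cosine similarity of those vectors, and scores them against temperature-scaled embedding similarities with cross-entropy. The function names, the temperature value, and the softmax over targets are illustrative assumptions, not the authors' implementation.

```python
import math

def cosine(u, v):
    # Cosine similarity; returns 0.0 if either vector is all zeros.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def softmax(row, temperature=1.0):
    # Numerically stable softmax over one row of scores.
    scaled = [x / temperature for x in row]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def semantic_matching_loss(img_emb, txt_emb, img_labels, txt_labels,
                           temperature=0.07):
    # Image-to-text cross-entropy against soft targets, averaged over images.
    # Images and texts need not be paired or equal in number (assumption:
    # every image is compared with every text in the batch).
    loss = 0.0
    for e_img, l_img in zip(img_emb, img_labels):
        # Soft targets: texts with similar findings get similar weight,
        # so a report from another patient is not forced to be a negative.
        targets = softmax([cosine(l_img, l_txt) for l_txt in txt_labels])
        preds = softmax([cosine(e_img, e_txt) for e_txt in txt_emb],
                        temperature)
        loss -= sum(t * math.log(p) for t, p in zip(targets, preds))
    return loss / len(img_emb)

# Toy data: 2 images and 3 texts with 2 hypothetical finding classes.
img_labels = [[1, 0], [0, 1]]
txt_labels = [[1, 0], [0, 1], [1, 0]]
img_emb = [[1.0, 0.0], [0.0, 1.0]]
txt_emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]

# Decoupling yields every image-text combination as a candidate pair.
print("candidate pairs:", len(img_emb) * len(txt_emb))
print("loss:", semantic_matching_loss(img_emb, txt_emb, img_labels, txt_labels))
```

In this toy setup, the first and third texts share the first image's finding, so both receive equal target weight rather than one being treated as a negative, which is the false-negative issue the abstract describes.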

updated: Tue Oct 18 2022 21:06:29 GMT+0000 (UTC)

published: Tue Oct 18 2022 21:06:29 GMT+0000 (UTC)

arXiv
