LIMITR: Leveraging Local Information for Medical Image-Text Representation

Gefen Dawidowicz; Elad Hirsch; Ayellet Tal

Medical image analysis plays a critical role in the diagnosis and treatment of various medical conditions. This paper focuses on chest X-ray images and their corresponding radiology reports. It presents a new model that learns a joint representation of X-ray images and reports. The model is based on a novel alignment scheme between the visual data and the text that takes into account both local and global information. Furthermore, the model integrates two types of domain-specific information: lateral images and the consistent visual structure of chest images. The representation is shown to benefit three types of retrieval tasks: text-image retrieval, class-based retrieval, and phrase grounding.

updated: Tue Mar 21 2023 11:20:34 GMT+0000 (UTC)

published: Tue Mar 21 2023 11:20:34 GMT+0000 (UTC)

arXiv
