RGMIM: Region-Guided Masked Image Modeling for Learning Meaningful Representation from X-Ray Images

Guang Li; Ren Togo; Takahiro Ogawa; Miki Haseyama

Purpose: Self-supervised learning has been gaining attention in the medical field for its potential to improve computer-aided diagnosis. One popular self-supervised learning method is masked image modeling (MIM), which masks a subset of input pixels and trains the model to predict them. However, traditional MIM methods typically use a random masking strategy, which may not be ideal for medical images, where the region of interest for disease detection is often small. To address this issue, this work aims to improve MIM for medical images and to evaluate its effectiveness on an open X-ray image dataset. Methods: In this paper, we present a novel method called region-guided masked image modeling (RGMIM) for learning meaningful representations from X-ray images. Our method adopts a new masking strategy that uses organ mask information to identify valid regions for learning more meaningful representations. We compared the proposed method with five self-supervised learning techniques (MAE, SKD, Cross, BYOL, and SimSiam). We conducted quantitative evaluations on an open lung X-ray image dataset and studied the effect of the masking ratio hyperparameter. Results: When using the entire training set, RGMIM outperformed the other methods, achieving a lung disease detection accuracy of 0.962. In particular, RGMIM significantly improved performance over the other methods when little training data was available, such as 5% and 10% of the training set (846 and 1,693 images), and achieved a detection accuracy of 0.957 even when only 50% of the training set was used. Conclusions: RGMIM can mask more valid regions, which facilitates the learning of discriminative representations and, in turn, high-accuracy lung disease detection. RGMIM outperforms other state-of-the-art self-supervised learning methods in experiments, particularly when limited training data is used.
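
As a rough illustration of the region-guided masking idea described in the Methods, the following Python sketch selects patches to mask only from those covered by an organ mask. The abstract does not give the exact selection rule, so the function name `region_guided_mask`, the `min_coverage` threshold, the patch-grid layout, and the choice to cap the number of masked patches at `mask_ratio` times the total patch count are all assumptions made for illustration, not the authors' implementation.

```python
import random


def region_guided_mask(organ_mask, patch_size, mask_ratio, min_coverage=1.0, seed=None):
    """Choose patches to mask, restricted to patches inside the organ region.

    organ_mask   : 2D list of 0/1 values (1 = organ pixel).
    patch_size   : side length of a square patch, in pixels.
    mask_ratio   : upper bound on the fraction of all patches to mask.
    min_coverage : assumed rule: a patch is "valid" when at least this
                   fraction of its pixels are organ pixels.
    Returns a sorted list of (patch_row, patch_col) coordinates to mask.
    """
    height, width = len(organ_mask), len(organ_mask[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be a multiple of patch_size")
    rows, cols = height // patch_size, width // patch_size

    # Collect the patches that are sufficiently covered by the organ mask.
    valid = []
    for r in range(rows):
        for c in range(cols):
            organ_pixels = sum(
                organ_mask[y][x]
                for y in range(r * patch_size, (r + 1) * patch_size)
                for x in range(c * patch_size, (c + 1) * patch_size)
            )
            if organ_pixels >= min_coverage * patch_size ** 2:
                valid.append((r, c))

    # Assumed rule: never mask more than mask_ratio of all patches.
    # If there are too many valid patches, keep a random subset.
    budget = int(mask_ratio * rows * cols)
    if len(valid) > budget:
        valid = random.Random(seed).sample(valid, budget)
    return sorted(valid)


# Toy 8x8 "image" whose organ occupies pixel columns 2..5 (hypothetical data).
toy_mask = [[1 if 2 <= x < 6 else 0 for x in range(8)] for _ in range(8)]

# 2x2 patches give a 4x4 grid; patch columns 1 and 2 lie inside the organ.
few = region_guided_mask(toy_mask, patch_size=2, mask_ratio=0.25, seed=0)
many = region_guided_mask(toy_mask, patch_size=2, mask_ratio=0.75, seed=0)
```

In this toy case only 8 of the 16 patches are valid, so a masking ratio of 0.25 masks 4 of them at random, while a ratio of 0.75 masks all 8 and never touches background patches. A plain random strategy would instead draw from all 16 patches regardless of where the organ is.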

updated: Sun May 21 2023 14:36:59 GMT+0000 (UTC)

published: Tue Nov 01 2022 07:41:03 GMT+0000 (UTC)

arXiv
