BoMD: Bag of Multi-label Descriptors for Noisy Chest X-ray Classification

Yuanhong Chen; Fengbei Liu; Hu Wang; Chong Wang; Yu Tian; Yuyuan Liu; Gustavo Carneiro

Deep learning methods have shown outstanding classification accuracy in medical imaging problems, largely because of the availability of large-scale datasets manually annotated with clean labels. However, given the high cost of such manual annotation, new medical imaging classification problems may need to rely on machine-generated noisy labels extracted from radiology reports. Indeed, many chest X-ray (CXR) classifiers have already been modelled from datasets with noisy labels, but their training procedures are in general not robust to noisy-label samples, leading to sub-optimal models. Furthermore, CXR datasets are mostly multi-label, so current noisy-label learning methods designed for multi-class problems cannot be easily adapted. In this paper, we propose a new method designed for noisy multi-label CXR learning, which detects and smoothly re-labels noisy-label samples in the dataset; the re-labelled dataset is then used to train common multi-label classifiers. The proposed method optimises a bag of multi-label descriptors (BoMD) to promote their similarity with the semantic descriptors produced by BERT models from the multi-label image annotation. Our experiments on diverse noisy multi-label training sets and clean testing sets show that our model has state-of-the-art accuracy and robustness in many CXR multi-label classification benchmarks.
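The abstract gives no implementation details, so the following is only a rough illustration of the two ideas it names: comparing a bag of image descriptors with per-label semantic descriptors, and smoothly re-labelling a noisy multi-hot label vector. Everything here is an assumption made for illustration, including the function names, the max-over-bag scoring, the rescaling to [0, 1], and the mixing weight `alpha`. It is not the authors' algorithm, and small hand-written vectors stand in for BERT embeddings.

```python
import numpy as np

def cosine_similarity(a, b):
    """Pairwise cosine similarity between rows of a (n, d) and rows of b (m, d)."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def bag_label_scores(bag, class_embeddings):
    """Score each label by its best-matching descriptor in the bag.

    bag: (k, d) descriptors for one image (hypothetical shape).
    class_embeddings: (c, d) semantic descriptors, one per label
        (stand-ins for BERT embeddings of the label names).
    Returns a (c,) vector of scores in [0, 1].
    """
    sims = cosine_similarity(bag, class_embeddings)  # shape (k, c)
    return (sims.max(axis=0) + 1.0) / 2.0            # map [-1, 1] to [0, 1]

def smooth_relabel(noisy_labels, scores, alpha=0.5):
    """Convex mix of the given multi-hot labels and the descriptor-based scores.

    alpha is an illustrative hyperparameter: 0 keeps the noisy labels,
    1 replaces them entirely with the scores.
    """
    return (1.0 - alpha) * noisy_labels + alpha * scores

# Toy example: 2 descriptors in the bag, 3 labels, 3-dimensional embeddings.
bag = np.array([[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0]])
class_embeddings = np.array([[1.0, 0.0, 0.0],
                             [0.0, 0.0, 1.0],
                             [0.0, 1.0, 0.0]])
noisy_labels = np.array([1.0, 1.0, 0.0])

scores = bag_label_scores(bag, class_embeddings)
soft_labels = smooth_relabel(noisy_labels, scores, alpha=0.5)
print(soft_labels)
```

The resulting soft labels can be used as targets for a standard multi-label loss such as binary cross-entropy, which accepts values between 0 and 1.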

updated: Sun Jul 30 2023 07:03:15 GMT+0000 (UTC)

published: Thu Mar 03 2022 08:04:59 GMT+0000 (UTC)

arXiv
