Highly Efficient Representation and Active Learning Framework and Its Application to Imbalanced Medical Image Classification

Heng Hao; Hankyu Moon; Sima Didari; Jae Oh Woo; Patrick Bangert

We propose a highly data-efficient active learning framework for image classification. Our framework combines, in sequence, (1) unsupervised representation learning with a Convolutional Neural Network and (2) the Gaussian Process (GP) method, to achieve highly data- and label-efficient classification. Moreover, both elements are less sensitive to the prevalent and challenging class imbalance issue, thanks to (1) features learned without labels and (2) the Bayesian nature of the GP. The uncertainty estimates provided by the GP enable active learning: unlabeled samples are ranked by uncertainty, and those showing higher uncertainty are selectively labeled. We apply this combination to two severely imbalanced cases, COVID-19 chest X-ray classification and Nerthus colonoscopy classification. We demonstrate that at most roughly 10% (≲ 10%) of the labeled data is needed to reach the accuracy obtained by training on all available labels. We also applied our model architecture and proposed framework to a broader class of datasets, with the expected success.
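The uncertainty-based selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which uncertainty measure is used, so predictive entropy is assumed here as a stand-in, and the function names, the `budget` parameter, and the probability values are all hypothetical. In the framework, the probabilities would come from the GP classifier applied to the learned features.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of one predictive class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_labeling(pool_probs, budget):
    """Return indices of the `budget` most uncertain unlabeled samples.

    Assumption: uncertainty is measured by predictive entropy; the
    paper's actual GP-based uncertainty estimate may differ.
    """
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: predictive_entropy(pool_probs[i]),
        reverse=True,  # highest uncertainty first
    )
    return ranked[:budget]


# Hypothetical predictive probabilities for four unlabeled images
# in a two-class problem (illustrative values, not results).
pool_probs = [
    [0.99, 0.01],  # confident prediction
    [0.55, 0.45],  # uncertain
    [0.80, 0.20],  # moderately confident
    [0.50, 0.50],  # maximally uncertain
]

queried = select_for_labeling(pool_probs, budget=2)
print(queried)  # indices of the samples to send for labeling
```

In an active learning loop, the selected samples would be labeled, added to the training set, and the classifier refit before the next round of ranking.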

updated: Mon Jun 20 2022 21:40:50 GMT+0000 (UTC)

published: Thu Feb 25 2021 02:48:59 GMT+0000 (UTC)

arXiv
