One-Vote Veto: A Self-Training Strategy for Low-Shot Learning of a Task-Invariant Embedding to Diagnose Glaucoma

Rui Fan; Christopher Bowd; Nicole Brye; Mark Christopher; Robert N. Weinreb; David Kriegman; Linda Zangwill

Convolutional neural networks (CNNs) are a promising technique for automated glaucoma diagnosis from fundus images, which are routinely acquired as part of an ophthalmic exam. Nevertheless, CNNs typically require a large amount of well-labeled training data, which may not be available in many biomedical image classification applications, especially when diseases are rare and labeling by experts is costly. This paper makes two contributions to address this issue: (1) it introduces a new network architecture and training method for low-shot learning when labeled data are limited and imbalanced, and (2) it introduces a new semi-supervised learning strategy that uses additional unlabeled training data to achieve greater accuracy. Our multi-task twin neural network (MTTNN) can use any backbone CNN, and we demonstrate with ResNet-50 and MobileNet-v2 that its accuracy with limited training data approaches that of a finetuned backbone trained on a dataset 50 times larger. We also introduce One-Vote Veto (OVV) self-training, a semi-supervised learning strategy designed specifically for MTTNNs. By taking both self-predictions and contrastive predictions of the unlabeled training data into account, OVV self-training provides additional pseudo labels for finetuning a pretrained MTTNN. Extensive experiments on a large dataset of more than 50,000 fundus images acquired over 25 years demonstrate the effectiveness of low-shot learning with MTTNN and semi-supervised learning with OVV. Three additional, smaller clinical datasets of fundus images acquired under different conditions (cameras, instruments, locations, populations) are used to demonstrate the generalizability of the methods. Source code and pretrained models will be publicly available upon publication.
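
The abstract does not spell out how self-predictions and contrastive predictions are combined, so the following is only a minimal sketch of one plausible reading of the "one-vote veto" idea, not the authors' implementation. It assumes binary labels (0 = healthy, 1 = glaucoma), that each contrastive prediction has already been converted into an implied class label by comparing the unlabeled image with a labeled reference image, and that a pseudo label is accepted only when every vote agrees with the self-prediction. The function names `one_vote_veto` and `pseudo_label_batch` are hypothetical.

```python
from typing import Dict, List, Optional, Sequence


def one_vote_veto(self_pred: int, contrastive_preds: Sequence[int]) -> Optional[int]:
    """Return a pseudo label, or None if it is vetoed.

    Assumption (not from the paper): a single contrastive vote that
    disagrees with the self-prediction is enough to reject the sample.
    """
    if not contrastive_preds:
        # No reference comparisons available, so nothing confirms the label.
        return None
    if all(vote == self_pred for vote in contrastive_preds):
        return self_pred
    return None


def pseudo_label_batch(
    self_preds: Sequence[int],
    contrastive_votes: Sequence[Sequence[int]],
) -> Dict[int, int]:
    """Map the index of each accepted unlabeled sample to its pseudo label."""
    accepted: Dict[int, int] = {}
    for idx, (pred, votes) in enumerate(zip(self_preds, contrastive_votes)):
        label = one_vote_veto(pred, votes)
        if label is not None:
            accepted[idx] = label
    return accepted


if __name__ == "__main__":
    # Three unlabeled images; the second one receives a dissenting vote.
    self_preds: List[int] = [1, 0, 1]
    votes: List[List[int]] = [[1, 1, 1], [0, 1, 0], [1, 1]]
    print(pseudo_label_batch(self_preds, votes))
```

Under these assumptions, only samples with unanimous agreement would be added to the pseudo-labeled set used for finetuning; this trades coverage of the unlabeled data for a lower risk of training on wrong labels.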

updated: Wed Dec 09 2020 03:20:06 GMT+0000 (UTC)

published: Wed Dec 09 2020 03:20:06 GMT+0000 (UTC)

arXiv
