We explore the value of weak labels in learning transferable representations for medical images. Compared to hand-labeled datasets, weak or inexact labels can be acquired in large quantities at significantly lower cost and can provide useful training signals for data-hungry models such as deep neural networks. We consider weak labels in the form of pseudo-labels and propose a semi-weakly supervised contrastive learning (SWCL) framework for representation learning using semi-weakly annotated images. Specifically, we train a semi-supervised model to propagate labels from a small dataset consisting of diverse image-level annotations to a large unlabeled dataset. Using the propagated labels, we generate a patch-level dataset for pretraining and formulate a multi-label contrastive learning objective to capture position-specific features encoded in each patch. We empirically validate the transfer learning performance of SWCL on seven public retinal fundus datasets, covering three disease classification tasks and two anatomical structure segmentation tasks. Our experiment results suggest that, under very low data regime, large-scale ImageNet pretraining on improved architecture remains a very strong baseline, and recently proposed self-supervised methods falter in segmentation tasks, possibly due to the strong invariant constraint imposed. Our method surpasses all prior self-supervised methods and standard cross-entropy training, while closing the gaps with ImageNet pretraining.
updated: Wed Aug 04 2021 15:50:09 GMT+0000 (UTC)
published: Wed Aug 04 2021 15:50:09 GMT+0000 (UTC)