Hierarchical Knowledge Guided Learning for Real-world Retinal Diseases Recognition

Lie Ju; Zhen Yu; Lin Wang; Xin Zhao; Xin Wang; Paul Bonnington; Zongyuan Ge

In the real world, medical datasets often exhibit a long-tailed data distribution (i.e., a few classes account for the majority of the data, while most classes have only a limited number of samples), which results in a challenging long-tailed learning scenario. Some recently published datasets in ophthalmology AI consist of more than 40 kinds of retinal diseases with complex abnormalities and variable morbidity. Nevertheless, more than 30 of these conditions are rarely seen in global patient cohorts. From a modeling perspective, most deep learning models trained on these datasets may lack the ability to generalize to rare diseases for which only a few samples are available for training. In addition, a single retina may present more than one disease, resulting in a challenging label co-occurrence scenario, also known as multi-label, which can cause problems when some re-sampling strategies are applied during training. To address these two major challenges, this paper presents a novel method that enables a deep neural network to learn from a long-tailed fundus database for the recognition of various retinal diseases. First, we exploit prior knowledge in ophthalmology to improve the feature representation using hierarchy-aware pre-training. Second, we adopt an instance-wise class-balanced sampling strategy to address the label co-occurrence issue under the long-tailed medical dataset scenario. Third, we introduce a novel hybrid knowledge distillation to train a less biased representation and classifier. We conducted extensive experiments on four databases, including two public datasets and two in-house databases with more than one million fundus images. The experimental results demonstrate the superiority of the proposed method, whose recognition accuracy outperforms state-of-the-art competitors, especially for rare diseases.
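The label co-occurrence problem mentioned in the abstract can be illustrated with a small sketch. This is not the paper's instance-wise sampling strategy, which the abstract does not specify; it is plain class-balanced re-sampling applied to a hypothetical multi-label toy dataset, and the class names (`head`, `mid`, `rare`), the dataset sizes, and the helper functions are all assumptions made for illustration. The point is that when every rare-disease image also carries a common label, sampling classes uniformly still leaves the common label over-represented.

```python
import random
from collections import Counter


def class_balanced_sample(labels, num_draws, seed=0):
    """Pick a class uniformly at random, then pick one image that carries it."""
    rng = random.Random(seed)
    # Build an index from each class to the images labelled with it.
    by_class = {}
    for idx, label_set in enumerate(labels):
        for c in label_set:
            by_class.setdefault(c, []).append(idx)
    classes = sorted(by_class)
    draws = []
    for _ in range(num_draws):
        c = rng.choice(classes)
        draws.append(rng.choice(by_class[c]))
    return draws


def label_counts(labels, indices):
    """Count how often each label appears among the drawn images."""
    counts = Counter()
    for i in indices:
        counts.update(labels[i])
    return counts


# Hypothetical toy data: 90 images with only the common label, 8 with a
# mid-frequency label, and 2 rare-disease images that ALSO carry the common label.
toy_labels = [{"head"}] * 90 + [{"mid"}] * 8 + [{"rare", "head"}] * 2

draws = class_balanced_sample(toy_labels, num_draws=3000)
counts = label_counts(toy_labels, draws)

# Each class is chosen about equally often, yet "head" shows up far more often
# than "mid" because every "rare" draw brings a "head" label along with it.
for name in ("head", "mid", "rare"):
    print(name, counts[name])
```

In this toy setting the common label rides along with every rare-class draw, so balancing over classes does not balance the labels the model actually sees. This is the kind of interaction between re-sampling and multi-label data that the abstract identifies as a challenge.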

updated: Tue Mar 21 2023 06:07:43 GMT+0000 (UTC)

published: Wed Nov 17 2021 05:44:39 GMT+0000 (UTC)

arXiv
