Sparse-shot Learning with Exclusive Cross-Entropy for Extremely Many Localisations

Andreas Panteli; Jonas Teuwen; Hugo Horlings; Efstratios Gavves

Object localisation in regular images typically concerns objects such as people or cars. Such images usually contain relatively few objects per class, which makes annotating them manageable. Outside the setting of regular images, however, the situation is often different. In computational pathology, digitised tissue sections are extremely large images whose dimensions quickly exceed 250'000x250'000 pixels, and relevant objects, such as tumour cells or lymphocytes, can quickly number in the millions. Annotating them all is practically impossible, so sparsely annotating a few out of many is the only option. Unfortunately, learning from sparse annotations, or sparse-shot learning, clashes with standard supervised learning because whatever is not annotated is treated as a negative. Assigning negative labels to what are in fact true positives leads to confused gradients and biased learning. To address this, we present exclusive cross-entropy, which slows down the biased learning by examining the second-order loss derivatives in order to drop the loss terms that are likely biased. Experiments on nine datasets and two localisation tasks, detection with YOLLO and segmentation with Unet, show considerable improvements over cross-entropy and focal loss, while often reaching the best possible performance for the model with only 10-40% of the annotations.
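The bias described in the abstract can be illustrated with a small numerical sketch. The abstract does not give the exact exclusion criterion, so the code below is not the paper's exclusive cross-entropy. It assumes a plain sigmoid binary cross-entropy, shows its first and second derivatives with respect to the logit, and uses a hypothetical stand-in rule (drop negative-labelled terms whose predicted probability exceeds `drop_threshold`). The function names, the threshold and the toy data are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def bce_terms(logits, labels):
    """Per-element binary cross-entropy computed from logits."""
    p = sigmoid(logits)
    eps = 1e-12  # guards against log(0)
    return -(labels * np.log(p + eps) + (1.0 - labels) * np.log(1.0 - p + eps))


def bce_derivatives(logits, labels):
    """First and second derivative of per-element BCE w.r.t. the logit."""
    p = sigmoid(logits)
    return p - labels, p * (1.0 - p)


def masked_bce(logits, labels, drop_threshold=0.5):
    """Mean BCE after dropping suspicious negative-labelled terms.

    Hypothetical stand-in rule (NOT the paper's criterion): a term is
    dropped when it is labelled negative but the model predicts a
    probability above `drop_threshold`.
    """
    p = sigmoid(logits)
    keep = ~((labels == 0) & (p > drop_threshold))
    terms = bce_terms(logits, labels)
    return terms[keep].sum() / max(int(keep.sum()), 1), keep


# Toy example: the second object is a true positive that was left
# unannotated, so it carries a (wrong) negative label.
logits = np.array([3.0, 2.5, -2.0, -3.0])
labels = np.array([1.0, 0.0, 0.0, 0.0])

plain_loss = bce_terms(logits, labels).mean()
grad, hess = bce_derivatives(logits, labels)
masked_loss, keep = masked_bce(logits, labels)

print("plain BCE :", plain_loss)
print("masked BCE:", masked_loss)
print("kept terms:", keep)
print("gradients :", grad)
print("curvature :", hess)
```

In this toy case the mislabelled second element dominates the plain loss and receives a large gradient pushing its prediction towards negative, which is the biased update the abstract refers to. Excluding that term removes the pull. How the actual method uses the second-order derivatives to decide which terms to exclude is described in the paper itself, not here.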

updated: Mon Aug 23 2021 11:17:34 GMT+0000 (UTC)

published: Wed Apr 21 2021 09:09:54 GMT+0000 (UTC)

arXiv
