Deep Neural Networks Learn Meta-Structures from Noisy Labels in Semantic Segmentation

Yaoru Luo; Guole Liu; Wenjing Li; Yuanhao Guo; Ge Yang

How deep neural networks (DNNs) learn from noisy labels has been studied extensively in image classification but much less in image segmentation. So far, our understanding of the learning behavior of DNNs trained with noisy segmentation labels remains limited. In this study, we address this gap in both binary segmentation of biological microscopy images and multi-class segmentation of natural images. We generate extremely noisy labels by randomly sampling a small fraction (e.g., 10%) or flipping a large fraction (e.g., 90%) of the ground truth labels. When trained with these noisy labels, DNNs provide largely the same segmentation performance as those trained with the original ground truth. This indicates that, in supervised training for semantic segmentation, DNNs learn structures hidden in the labels rather than the pixel-level labels per se. We refer to these hidden structures in labels as meta-structures. When DNNs are trained with labels whose meta-structure has been perturbed in different ways, we find consistent degradation in their segmentation performance. In contrast, incorporating meta-structure information substantially improves the performance of an unsupervised segmentation model developed for binary semantic segmentation. We define meta-structures mathematically as spatial point distributions and show both theoretically and experimentally how this formulation explains key observed learning behavior of DNNs.
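The two label-corruption schemes mentioned in the abstract can be illustrated with a minimal sketch for the binary case. This is not the authors' code: the function names `sample_labels` and `flip_labels` are hypothetical, and the exact semantics are assumptions. Here "sampling" is assumed to keep a random fraction of the foreground pixels and relabel the rest as background, and "flipping" is assumed to invert a random fraction of all pixels.

```python
import random


def sample_labels(mask, keep_fraction, rng):
    """Keep a random `keep_fraction` of foreground pixels (assumed semantics);
    every other pixel becomes background."""
    foreground = [(r, c) for r, row in enumerate(mask)
                  for c, v in enumerate(row) if v == 1]
    n_keep = round(keep_fraction * len(foreground))
    kept = set(rng.sample(foreground, n_keep))
    return [[1 if (r, c) in kept else 0 for c in range(len(row))]
            for r, row in enumerate(mask)]


def flip_labels(mask, flip_fraction, rng):
    """Invert (0 <-> 1) a random `flip_fraction` of all pixels (assumed semantics)."""
    coords = [(r, c) for r, row in enumerate(mask) for c in range(len(row))]
    n_flip = round(flip_fraction * len(coords))
    to_flip = set(rng.sample(coords, n_flip))
    return [[1 - v if (r, c) in to_flip else v for c, v in enumerate(row)]
            for r, row in enumerate(mask)]


rng = random.Random(0)  # fixed seed so the toy example is reproducible

# Toy 10x10 ground truth: a 4x5 foreground rectangle (20 foreground pixels).
ground_truth = [[1 if 3 <= r < 7 and 2 <= c < 7 else 0 for c in range(10)]
                for r in range(10)]

sampled = sample_labels(ground_truth, 0.10, rng)  # keeps 2 of 20 foreground pixels
flipped = flip_labels(ground_truth, 0.90, rng)    # inverts 90 of 100 pixels
```

In both cases the corrupted mask disagrees with the ground truth at most pixels, yet it still depends on where the foreground lies, which is the kind of hidden spatial information the abstract calls a meta-structure.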

updated: Wed Dec 01 2021 01:12:36 GMT+0000 (UTC)

published: Mon Mar 22 2021 05:43:34 GMT+0000 (UTC)

arXiv
