Semi-supervised Deep Learning for Image Classification with Distribution Mismatch: A Survey

Saul Calderon-Ramirez; Shengxiang Yang; David Elizondo

Deep learning methodologies have been employed in several different fields, with outstanding success in image recognition applications such as material quality control, medical imaging, and autonomous driving. Deep learning models rely on an abundance of labelled observations for training. These models have millions of parameters to estimate, which increases the need for training observations. Labelled observations are frequently expensive to gather, which makes deep learning models a poor fit in such cases, as the model might over-fit the data. In a semi-supervised setting, unlabelled data is used to improve the accuracy and generalization of a model trained with a small labelled dataset. Nevertheless, in many situations different unlabelled data sources might be available. This raises the risk of a significant distribution mismatch between the labelled and unlabelled datasets. Such a mismatch can cause a considerable performance hit to typical semi-supervised deep learning frameworks, which often assume that both labelled and unlabelled datasets are drawn from similar distributions. Therefore, in this paper we study the latest approaches to semi-supervised deep learning for image recognition. Emphasis is placed on semi-supervised deep learning models designed to deal with a distribution mismatch between the labelled and unlabelled datasets. We address open challenges with the aim of encouraging the community to tackle them and to overcome the high data demand of traditional deep learning pipelines under real-world usage settings.

updated: Thu Mar 10 2022 15:54:29 GMT+0000 (UTC)

published: Tue Mar 01 2022 02:46:00 GMT+0000 (UTC)

arXiv
