Small Data Challenges in Big Data Era: A Survey of Recent Progress on Unsupervised and Semi-Supervised Methods

Guo-Jun Qi; Jiebo Luo

Representation learning with small labeled data has emerged in many problems, since the success of deep neural networks often relies on the availability of a huge amount of labeled data that is expensive to collect. To address this, many efforts have been made to train sophisticated models with few labeled data in an unsupervised or semi-supervised fashion. In this paper, we will review recent progress on these two major categories of methods. A wide spectrum of models will be organized into a big picture, where we will show how they interplay with each other to motivate explorations of new ideas. We will review the principles of learning transformation-equivariant, disentangled, self-supervised and semi-supervised representations, all of which underpin recent progress. Many implementations of unsupervised and semi-supervised generative models have been developed on the basis of these criteria, greatly expanding the territory of existing autoencoders, generative adversarial nets (GANs) and other deep networks by exploring the distribution of unlabeled data for more powerful representations. We will discuss emerging topics by revealing the intrinsic connections between unsupervised and semi-supervised learning, and propose future directions to bridge the algorithmic and theoretical gap between transformation equivariance for unsupervised learning and supervised invariance for supervised learning, and to unify unsupervised pretraining and supervised finetuning. We will also provide a broader outlook on future directions to unify transformation and instance equivariances for representation learning, connect unsupervised and semi-supervised augmentations, and explore the role of self-supervised regularization in many learning problems.

updated: Sat Jan 02 2021 16:11:08 GMT+0000 (UTC)

published: Wed Mar 27 2019 05:50:28 GMT+0000 (UTC)

arXiv
