Semi-Supervised Active Learning with Temporal Output Discrepancy

Siyu Huang; Tianyang Wang; Haoyi Xiong; Jun Huan; Dejing Dou

Although deep learning succeeds in a wide range of tasks, it depends heavily on large collections of annotated data, which are expensive and time-consuming to obtain. To lower the cost of data annotation, active learning has been proposed: it interactively queries an oracle to annotate a small proportion of informative samples in an unlabeled dataset. Inspired by the fact that samples with higher loss are usually more informative to the model than samples with lower loss, in this paper we present a novel deep active learning approach that queries the oracle for annotation when an unlabeled sample is believed to incur high loss. The core of our approach is a measure, Temporal Output Discrepancy (TOD), that estimates the sample loss by evaluating the discrepancy between the outputs given by models at different optimization steps. Our theoretical investigation shows that TOD lower-bounds the accumulated sample loss, and thus it can be used to select informative unlabeled samples. On the basis of TOD, we further develop an effective unlabeled data sampling strategy as well as an unsupervised learning criterion that enhances model performance by incorporating the unlabeled data. Owing to the simplicity of TOD, our active learning approach is efficient, flexible, and task-agnostic. Extensive experimental results demonstrate that our approach outperforms state-of-the-art active learning methods on image classification and semantic segmentation tasks.
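
The idea of scoring unlabeled samples by the discrepancy between model outputs at two optimization steps can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state which distance is used, so the Euclidean distance here is an assumption, and the names `temporal_output_discrepancy`, `select_for_annotation`, and `make_linear_model` are hypothetical. The toy linear "snapshots" stand in for a network saved at two different training steps.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]
Model = Callable[[Vector], List[float]]


def temporal_output_discrepancy(model_early: Model, model_late: Model, x: Vector) -> float:
    """Distance between the outputs of two model snapshots for one sample.

    Assumption: Euclidean distance; the abstract only says "discrepancy".
    """
    out_early = model_early(x)
    out_late = model_late(x)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(out_early, out_late)))


def select_for_annotation(model_early: Model, model_late: Model,
                          unlabeled: Sequence[Vector], budget: int) -> List[int]:
    """Return indices of the `budget` unlabeled samples with the largest discrepancy."""
    scores = [temporal_output_discrepancy(model_early, model_late, x) for x in unlabeled]
    ranked = sorted(range(len(unlabeled)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


def make_linear_model(weights: Sequence[Vector]) -> Model:
    """Toy stand-in for a network snapshot: output = W x (hypothetical helper)."""
    def model(x: Vector) -> List[float]:
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return model


# Two hypothetical snapshots of the same model at different optimization steps.
snapshot_early = make_linear_model([[1.0, 0.0], [0.0, 1.0]])
snapshot_late = make_linear_model([[1.0, 0.5], [0.0, 1.0]])

# Unlabeled pool; the two samples whose outputs changed most are sent to the oracle.
pool = [[1.0, 0.0], [0.0, 2.0], [3.0, 4.0]]
picked = select_for_annotation(snapshot_early, snapshot_late, pool, budget=2)
print(picked)
```

In this toy setup the first sample's output does not change between snapshots, so it is ranked last. In a real pipeline the two snapshots would be checkpoints of the same network, and the selected indices would be passed to the annotator.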

updated: Thu Jul 29 2021 16:25:56 GMT+0000 (UTC)

published: Thu Jul 29 2021 16:25:56 GMT+0000 (UTC)

arXiv
