The CLEAR Benchmark: Continual LEArning on Real-World Imagery

Zhiqiu Lin; Jia Shi; Deepak Pathak; Deva Ramanan

Continual learning (CL) is widely regarded as a crucial challenge for lifelong AI. However, existing CL benchmarks, e.g. Permuted-MNIST and Split-CIFAR, make use of artificial temporal variation and do not align with or generalize to the real world. In this paper, we introduce CLEAR, the first continual image classification benchmark dataset with a natural temporal evolution of visual concepts in the real world that spans a decade (2004-2014). We build CLEAR from existing large-scale image collections (YFCC100M) through a novel and scalable low-cost approach to visio-linguistic dataset curation. Our pipeline makes use of pretrained vision-language models (e.g. CLIP) to interactively build labeled datasets, which are further validated with crowd-sourcing to remove errors and even inappropriate images (hidden in the original YFCC100M). The major strength of CLEAR over prior CL benchmarks is the smooth temporal evolution of visual concepts with real-world imagery, including both high-quality labeled data and abundant unlabeled samples per time period for continual semi-supervised learning. We find that a simple unsupervised pre-training step can already boost state-of-the-art CL algorithms that only utilize fully-supervised data. Our analysis also reveals that mainstream CL evaluation protocols that train and test on iid data artificially inflate the performance of CL systems. To address this, we propose novel "streaming" protocols for CL that always test on the (near) future. Interestingly, streaming protocols (a) can simplify dataset curation since today's testset can be repurposed for tomorrow's trainset and (b) can produce more generalizable models with more accurate estimates of performance since all labeled data from each time period is used for both training and testing (unlike classic iid train-test splits).
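The "streaming" idea in the abstract (test on the near future, then reuse that test data for training) can be illustrated with a minimal sketch. The function name `streaming_splits`, the integer bucket indices, and the `cumulative` option are hypothetical and only serve this illustration; the abstract does not say whether training accumulates all past periods or uses only the latest one, so both variants are shown as assumptions.

```python
from typing import Iterator, List, Tuple


def streaming_splits(
    num_buckets: int, cumulative: bool = True
) -> Iterator[Tuple[List[int], int]]:
    """Yield (train_bucket_ids, test_bucket_id) pairs in time order.

    Buckets are assumed to be time periods indexed 0..num_buckets-1.
    At step t the model is evaluated on bucket t+1 (the near future).
    Bucket t+1 then becomes training data at the next step, so every
    bucket except the first is used for testing and every bucket
    except the last is used for training.
    """
    for t in range(num_buckets - 1):
        # Assumption: either train on all buckets seen so far,
        # or only on the most recent one.
        train_ids = list(range(t + 1)) if cumulative else [t]
        yield train_ids, t + 1


if __name__ == "__main__":
    for train_ids, test_id in streaming_splits(4):
        print(f"train on buckets {train_ids}, test on bucket {test_id}")
```

In contrast, a classic iid protocol would hold out part of each bucket and test on data from the same period as the training data, which is the setting the abstract describes as inflating measured performance.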

updated: Mon Jan 17 2022 09:09:09 GMT+0000 (UTC)

published: Mon Jan 17 2022 09:09:09 GMT+0000 (UTC)

arXiv
