Expanding Small-Scale Datasets with Guided Imagination

Yifan Zhang; Daquan Zhou; Bryan Hooi; Kai Wang; Jiashi Feng

The power of deep neural networks (DNNs) depends heavily on the quantity, quality, and diversity of their training data. In many real-world scenarios, however, collecting and annotating large-scale data is costly and time-consuming, which has severely hindered the application of DNNs. To address this challenge, we explore a new task, dataset expansion, which seeks to automatically create new labeled samples to expand a small dataset. To this end, we present a Guided Imagination Framework (GIF) that leverages recently developed large generative models (e.g., DALL-E2) and reconstruction models (e.g., MAE) to "imagine" and create informative new data from seed data. Specifically, GIF conducts imagination by optimizing the latent features of seed data in a semantically meaningful space; the optimized features are then fed into the generative model to produce photo-realistic images with new content. To guide the imagination towards samples that are useful for model training, we exploit the zero-shot recognition ability of CLIP and introduce three criteria that encourage informative sample generation: prediction consistency, entropy maximization, and diversity promotion. With these criteria as guidance, GIF works well for expanding datasets in different domains, leading to an average accuracy gain of 29.9% over six natural image datasets and 12.3% over three medical image datasets. The source code will be released at https://github.com/Vanint/DatasetExpansion.
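The three guidance criteria named in the abstract can be illustrated with a small sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: the function names (`prediction_consistent`, `entropy_gain`, `diversity`, `guidance_score`), the use of Shannon entropy over zero-shot class probabilities, the centroid-distance diversity measure, and the weight `w_div` are all hypothetical and are not taken from the GIF source code.

```python
import math


def argmax(values):
    """Index of the largest entry in a sequence."""
    return max(range(len(values)), key=values.__getitem__)


def entropy(probs):
    """Shannon entropy (natural log) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def prediction_consistent(seed_probs, new_probs):
    """Assumed consistency check: the new sample keeps the seed's predicted class."""
    return argmax(seed_probs) == argmax(new_probs)


def entropy_gain(seed_probs, new_probs):
    """Assumed informativeness measure: how much less confident the prediction became."""
    return entropy(new_probs) - entropy(seed_probs)


def diversity(features):
    """Assumed diversity measure: mean Euclidean distance to the feature centroid."""
    n = len(features)
    dim = len(features[0])
    centroid = [sum(f[d] for f in features) / n for d in range(dim)]
    return sum(math.dist(f, centroid) for f in features) / n


def guidance_score(seed_probs, new_probs, features, w_div=1.0):
    """Combine the three criteria; reject samples whose predicted class changed."""
    if not prediction_consistent(seed_probs, new_probs):
        return float("-inf")
    return entropy_gain(seed_probs, new_probs) + w_div * diversity(features)


if __name__ == "__main__":
    seed = [0.90, 0.05, 0.05]      # hypothetical zero-shot probabilities for a seed image
    kept = [0.60, 0.30, 0.10]      # same predicted class, less confident
    flipped = [0.20, 0.70, 0.10]   # predicted class changed
    feats = [[0.0, 0.0], [2.0, 0.0]]  # hypothetical latent features of two generated samples
    print(guidance_score(seed, kept, feats))
    print(guidance_score(seed, flipped, feats))
```

In this sketch a candidate whose predicted class flips is rejected outright, while among class-preserving candidates the score rewards higher prediction entropy and a wider spread of latent features. In the framework described above, such a score would guide the optimization of latent features rather than filter finished samples.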

updated: Thu Dec 08 2022 17:35:16 GMT+0000 (UTC)

published: Fri Nov 25 2022 09:38:22 GMT+0000 (UTC)

arXiv
