Mining Data Impressions from Deep Models as Substitute for the Unavailable Training Data

Gaurav Kumar Nayak; Konda Reddy Mopuri; Saksham Jain; Anirban Chakraborty

Pretrained deep models hold their learnt knowledge in the form of model parameters. These parameters act as "memory" for the trained models and help them generalize well to unseen data. However, in the absence of training data, the utility of a trained model is limited to either inference or better initialization for a target task. In this paper, we go further and extract synthetic data by leveraging the learnt model parameters. We dub these samples "Data Impressions"; they act as a proxy for the training data and can be used to realize a variety of tasks. They are useful in scenarios where only the pretrained models are available and the training data is not shared (e.g., due to privacy or sensitivity concerns). We show the applicability of data impressions in solving several computer vision tasks, such as unsupervised domain adaptation, continual learning, and knowledge distillation. We also study the adversarial robustness of lightweight models trained via knowledge distillation using these data impressions. Further, we demonstrate the efficacy of data impressions in generating data-free Universal Adversarial Perturbations (UAPs) with better fooling rates. Extensive experiments on benchmark datasets demonstrate that data impressions achieve competitive performance in the absence of the original training data.
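
The abstract does not spell out how the synthetic samples are extracted, so the following is only a toy illustration of the general idea of recovering an input from a frozen model by optimizing the input rather than the weights. It is not the authors' procedure. Everything here is an assumption made for illustration: the tiny linear-softmax "pretrained model" (`W`, `b`), the soft target vector, and the helper names `softmax`, `logits`, and `synthesize_input`.

```python
import math
import random


def softmax(z):
    """Numerically stable softmax over a list of floats."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def logits(W, b, x):
    """Logits of a (hypothetical) frozen linear classifier: W @ x + b."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]


def synthesize_input(W, b, target, steps=1000, lr=0.5, seed=0):
    """Optimize a random input so the frozen model's softmax output matches `target`.

    Only the input x is updated; the model parameters W and b stay fixed.
    """
    rng = random.Random(seed)
    x = [rng.uniform(-1.0, 1.0) for _ in range(len(W[0]))]
    for _ in range(steps):
        p = softmax(logits(W, b, x))
        # Gradient of the cross-entropy between target and p w.r.t. the logits.
        g = [pi - ti for pi, ti in zip(p, target)]
        # Chain rule through the linear layer: dL/dx = W^T g.
        for j in range(len(x)):
            x[j] -= lr * sum(g[i] * W[i][j] for i in range(len(W)))
    return x


# Hypothetical frozen model: 3 classes, 4 input features.
W = [[1.0, 0.0, 0.0, 0.5],
     [0.0, 1.0, 0.0, -0.5],
     [0.0, 0.0, 1.0, 0.2]]
b = [0.0, 0.0, 0.0]
target = [0.7, 0.2, 0.1]  # assumed soft output the synthetic sample should produce

x_synth = synthesize_input(W, b, target)
p_synth = softmax(logits(W, b, x_synth))
```

In this sketch, `x_synth` is a synthetic input whose model output is close to `target`; a set of such inputs, generated for many targets, could then stand in for real samples in a downstream task such as distillation.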

updated: Mon Aug 30 2021 13:53:43 GMT+0000 (UTC)

published: Fri Jan 15 2021 11:37:29 GMT+0000 (UTC)

arXiv
