GAN Cocktail: mixing GANs without dataset access

Omri Avrahami; Dani Lischinski; Ohad Fried

Today's generative models are capable of synthesizing high-fidelity images, but each model specializes in a specific target domain. This raises the need for model merging: combining two or more pretrained generative models into a single unified one. In this work we tackle the problem of model merging under two constraints that often come up in the real world: (1) no access to the original training data, and (2) no increase in the size of the neural network. To the best of our knowledge, model merging under these constraints has not been studied thus far. We propose a novel, two-stage solution. In the first stage, we transform the weights of all the models to the same parameter space by a technique we term model rooting. In the second stage, we merge the rooted models by averaging their weights and fine-tuning them for each specific domain, using only data generated by the original trained models. We demonstrate that our approach is superior to baseline methods and to existing transfer learning techniques, and investigate several applications.
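As a rough illustration of the weight-averaging step in the second stage, the following minimal sketch takes the element-wise mean of parameters across models that already share one parameter layout, which is the precondition the abstract attributes to model rooting. It is not the authors' implementation: the `Weights` type, the function name `average_weights`, and the flat name-to-list representation are assumptions made for this example, and both the rooting stage and the subsequent fine-tuning on generated data are omitted.

```python
from typing import Dict, List

# Hypothetical flat representation of a model's parameters:
# parameter name -> list of floats. Real GAN weights would be tensors.
Weights = Dict[str, List[float]]


def average_weights(models: List[Weights]) -> Weights:
    """Element-wise mean of parameters across models sharing one layout."""
    if not models:
        raise ValueError("need at least one model")
    reference = models[0]
    for m in models[1:]:
        # Averaging is only meaningful when every model has the same
        # parameter names and sizes (assumed here to hold after rooting).
        if m.keys() != reference.keys():
            raise ValueError("models do not share parameter names")
        for name in reference:
            if len(m[name]) != len(reference[name]):
                raise ValueError(f"size mismatch for {name!r}")
    n = len(models)
    return {
        name: [
            sum(m[name][i] for m in models) / n
            for i in range(len(reference[name]))
        ]
        for name in reference
    }


if __name__ == "__main__":
    # Toy parameters standing in for two rooted models.
    model_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
    model_b = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}
    print(average_weights([model_a, model_b]))
```

According to the abstract, the averaged weights serve only as a starting point; the merged model is then fine-tuned using data generated by the original trained models.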

updated: Mon Jun 07 2021 17:59:04 GMT+0000 (UTC)

published: Mon Jun 07 2021 17:59:04 GMT+0000 (UTC)

arXiv
