Exploring Continual Learning of Diffusion Models

Michał Zając; Kamil Deja; Anna Kuzina; Jakub M. Tomczak; Tomasz Trzciński; Florian Shkurti; Piotr Miłoś

Diffusion models have achieved remarkable success in generating high-quality images thanks to novel training procedures applied to unprecedented amounts of data. However, training a diffusion model from scratch is computationally expensive. This highlights the need to investigate the possibility of training these models iteratively, reusing computation while the data distribution changes. In this study, we take the first step in this direction and evaluate the continual learning (CL) properties of diffusion models. We begin by benchmarking the most common CL methods applied to Denoising Diffusion Probabilistic Models (DDPMs), where we note the strong performance of experience replay with a reduced rehearsal coefficient. Furthermore, we provide insights into the dynamics of forgetting, which exhibit diverse behavior across diffusion timesteps. We also uncover certain pitfalls of using the bits-per-dimension metric for evaluating CL.

updated: Mon Mar 27 2023 15:52:14 GMT+0000 (UTC)

published: Mon Mar 27 2023 15:52:14 GMT+0000 (UTC)

arXiv
