Censored Sampling of Diffusion Models Using 3 Minutes of Human Feedback

TaeHo Yoon; Kibeom Myoung; Keon Lee; Jaewoong Cho; Albert No; Ernest K. Ryu

Diffusion models have recently shown remarkable success in high-quality image generation. However, a pre-trained diffusion model sometimes exhibits partial misalignment: it can generate good images, but it occasionally outputs undesirable ones. In that case, we only need to prevent the generation of the bad images, a task we call censoring. In this work, we present censored generation with a pre-trained diffusion model using a reward model trained on minimal human feedback. We show that censoring can be accomplished with extreme human feedback efficiency and that labels generated with a mere few minutes of human feedback are sufficient. Code is available at https://github.com/tetrzim/diffusion-human-feedback.
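As a rough illustration of what "censoring with a reward model trained on a few human labels" can look like, here is a minimal toy sketch. It is not the authors' implementation (see the repository above for that). The abstract does not say how the reward model steers generation, so this sketch swaps in plain rejection sampling. The 1-D generator `sample_pretrained`, the simulated labels, and every function name below are hypothetical stand-ins chosen for illustration.

```python
import math
import random


def sample_pretrained(rng):
    """Toy stand-in for a pre-trained generator (assumption, not a diffusion model):
    mostly good samples near +2, occasionally undesirable ones near -2."""
    center = 2.0 if rng.random() < 0.8 else -2.0
    return rng.gauss(center, 0.5)


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_reward(xs, labels, steps=500, lr=0.1):
    """Fit r(x) = sigmoid(w * x + b) by gradient descent on the logistic loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, labels):
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return lambda x: sigmoid(w * x + b)


def censored_sample(rng, reward, threshold=0.5, max_tries=1000):
    """Rejection sampling: redraw until the reward model accepts the sample."""
    for _ in range(max_tries):
        x = sample_pretrained(rng)
        if reward(x) >= threshold:
            return x
    raise RuntimeError("no acceptable sample found")


rng = random.Random(0)

# A small labeled set plays the role of the brief human feedback session.
# Labels are simulated here: 1.0 = acceptable, 0.0 = undesirable.
train_xs = [sample_pretrained(rng) for _ in range(50)]
train_labels = [1.0 if x > 0 else 0.0 for x in train_xs]

reward = train_reward(train_xs, train_labels)
censored = [censored_sample(rng, reward) for _ in range(200)]
bad_fraction = sum(x <= 0 for x in censored) / len(censored)
print(f"fraction of undesirable samples after censoring: {bad_fraction:.3f}")
```

The generator itself is never retrained in this sketch; only a small reward model is fitted, which is why a handful of labels can be enough in such a toy setting.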

updated: Thu Jul 06 2023 04:45:14 GMT+0000 (UTC)

published: Thu Jul 06 2023 04:45:14 GMT+0000 (UTC)

arXiv
