Measuring the Success of Diffusion Models at Imitating Human Artists

Stephen Casper; Zifan Guo; Shreya Mogulothu; Zachary Marinov; Chinmay Deshpande; Rui-Jie Yew; Zheng Dai; Dylan Hadfield-Menell

Modern diffusion models have set the state of the art in AI image generation. Their success is due, in part, to training on Internet-scale data, which often includes copyrighted work. This prompts questions about the extent to which these models learn from, imitate, or copy the work of human artists. This work suggests that tying copyright liability to the capabilities of the model may be useful given the evolving ecosystem of generative models. Specifically, much of the legal analysis of copyright and generative systems focuses on the use of protected data for training. As a result, the connections between data, training, and the system are often obscured. In our approach, we consider simple image classification techniques to measure a model's ability to imitate specific artists. Specifically, we use Contrastive Language-Image Pretrained (CLIP) encoders to classify images in a zero-shot fashion. Our process first prompts a model to imitate a specific artist. Then, we test whether CLIP can be used to reclassify the artist (or the artist's work) from the imitation. If these tests match the imitation back to the original artist, this suggests the model can imitate that artist's expression. Our approach is simple and quantitative. Furthermore, it uses standard techniques and does not require additional training. We demonstrate our approach with an audit of Stable Diffusion's capacity to imitate 70 professional digital artists with copyrighted work online. When Stable Diffusion is prompted to imitate an artist from this set, we find that the artist can be identified from the imitation with an average accuracy of 81.0%. Finally, we also show that a sample of the artist's work can be matched to these imitation images with a high degree of statistical reliability. Overall, these results suggest that Stable Diffusion is broadly successful at imitating individual human artists.
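The classification step described in the abstract can be illustrated with a minimal sketch. It assumes that image and text embeddings have already been computed by a CLIP-style encoder and are passed in as plain lists of floats; the function names, the artist labels, and the toy vectors are hypothetical and are not taken from the paper. The sketch only shows the general idea of zero-shot classification by cosine similarity, followed by an accuracy count over prompted imitations.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))

def zero_shot_classify(image_emb, label_embs):
    """Return the label whose text embedding is most similar to the image embedding."""
    return max(label_embs, key=lambda name: cosine(image_emb, label_embs[name]))

def identification_accuracy(imitations, label_embs):
    """Fraction of imitations classified back to the artist named in the prompt.

    imitations: list of (prompted_artist, image_embedding) pairs.
    """
    hits = sum(
        1 for artist, emb in imitations
        if zero_shot_classify(emb, label_embs) == artist
    )
    return hits / len(imitations)

# Toy stand-ins for text embeddings of artist names (assumed, not real CLIP output).
label_embs = {
    "artist_a": [1.0, 0.0, 0.0],
    "artist_b": [0.0, 1.0, 0.0],
    "artist_c": [0.0, 0.0, 1.0],
}

# Toy stand-ins for embeddings of generated imitations, tagged with the prompted artist.
imitations = [
    ("artist_a", [0.9, 0.1, 0.0]),
    ("artist_b", [0.2, 0.8, 0.1]),
    ("artist_c", [0.6, 0.1, 0.5]),  # lands closest to artist_a, so it counts as a miss
    ("artist_c", [0.1, 0.2, 0.9]),
]

accuracy = identification_accuracy(imitations, label_embs)
```

In a real audit the embeddings would come from a pretrained CLIP model applied to generated images and to text labels naming each artist. The toy accuracy here is a property of the made-up vectors only and says nothing about the paper's reported figure.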

updated: Sat Jul 08 2023 18:31:25 GMT+0000 (UTC)

published: Sat Jul 08 2023 18:31:25 GMT+0000 (UTC)

arXiv
