Fine-grained Image Editing by Pixel-wise Guidance Using Diffusion Models

Naoki Matsunaga; Masato Ishii; Akio Hayakawa; Kenji Suzuki; Takuya Narihira

Our goal is to develop fine-grained real-image editing methods suitable for real-world applications. In this paper, we first summarize four requirements for such methods and then propose a novel diffusion-based image editing framework with pixel-wise guidance that satisfies them. Specifically, we train pixel classifiers on a small amount of annotated data and use them to infer the segmentation map of a target image. Users then manipulate the map to specify how the image should be edited. We use a pre-trained diffusion model with pixel-wise guidance to generate edited images that match the user's intention. Combining the proposed guidance effectively with other techniques enables highly controllable editing while preserving the region outside the edited area, thereby meeting our requirements. Experimental results demonstrate that our proposal outperforms the GAN-based method in both editing quality and speed.
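The abstract does not give the guidance formula, so the following is only a toy sketch of the general idea of pixel-wise guidance, not the authors' implementation. It assumes a hypothetical linear pixel classifier applied to raw pixel values (the names `weights`, `bias`, `pixelwise_guidance_grad`, and `guided_edit` are all illustrative), replaces the diffusion sampler with plain gradient steps, and preserves the outside of the edited area by copying the original pixels back after every step.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax over the class axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def pixelwise_guidance_grad(x, weights, bias, target_map):
    """Gradient of the summed per-pixel cross-entropy with respect to x.

    x:          (H, W, C) image
    weights:    (C, K) toy linear pixel classifier (assumption)
    bias:       (K,)
    target_map: (H, W) integer labels, i.e. the user-edited segmentation map
    """
    logits = x @ weights + bias                    # (H, W, K)
    probs = softmax(logits)
    onehot = np.eye(weights.shape[1])[target_map]  # (H, W, K)
    # d(CE)/d(logits) = probs - onehot; chain rule through the linear layer.
    return (probs - onehot) @ weights.T            # (H, W, C)

def guided_edit(x0, weights, bias, target_map, edit_mask,
                steps=50, step_size=0.5):
    """Move pixels inside edit_mask toward the target labels.

    A real diffusion sampler would interleave a denoising step here;
    this sketch applies only the guidance gradient (assumption).
    """
    x = x0.copy()
    for _ in range(steps):
        x = x - step_size * pixelwise_guidance_grad(x, weights, bias, target_map)
        # Keep everything outside the edited area identical to the input.
        x = np.where(edit_mask[..., None], x, x0)
    return x
```

In this sketch the user-edited segmentation map plays the role of `target_map`, and `edit_mask` marks the pixels whose labels were changed; how the actual method combines guidance with the diffusion model's denoising steps is not specified in the abstract.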

updated: Wed May 31 2023 06:34:32 GMT+0000 (UTC)

published: Mon Dec 05 2022 04:39:08 GMT+0000 (UTC)

arXiv

