Stable Attribute Group Editing for Reliable Few-shot Image Generation

Guanqi Ding; Xinzhe Han; Shuhui Wang; Xin Jin; Dandan Tu; Qingming Huang

Few-shot image generation aims to generate data of an unseen category based on only a few samples. Beyond basic content generation, a range of downstream applications, such as low-data detection and few-shot classification, are expected to benefit from this task. To achieve this goal, the generated images should guarantee category retention for classification, in addition to visual quality and diversity. In our preliminary work, we presented an "editing-based" framework, Attribute Group Editing (AGE), for reliable few-shot image generation, which largely improves generation performance. Nevertheless, AGE's performance on downstream classification is not as satisfactory as expected. This paper investigates the class inconsistency problem and proposes Stable Attribute Group Editing (SAGE) for more stable class-relevant image generation. SAGE makes use of all given few-shot images and estimates a class center embedding based on the category-relevant attribute dictionary. Meanwhile, according to the projection weights on the category-relevant attribute dictionary, we can select category-irrelevant attributes from similar seen categories. Consequently, SAGE injects the whole distribution of the novel class into StyleGAN's latent space, thus largely maintaining the category retention and stability of the generated images. Going one step further, we find that class inconsistency is a common problem in GAN-generated images for downstream classification. Even though the generated images look photo-realistic and require no category-relevant editing, they are usually of limited help for downstream classification. We systematically discuss this issue from both the generative model and classification model perspectives, and propose to boost the downstream classification performance of SAGE by enhancing the pixel and frequency components.
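
The abstract describes the SAGE pipeline only in words, so the following is a minimal toy sketch of the three steps it names (class center estimation, projection onto an attribute dictionary, and borrowing category-irrelevant attributes from a similar seen category). Everything concrete here is an assumption made for illustration and is not taken from the paper: the class center is approximated as a plain mean of few-shot latent codes, projection weights are dot products with dictionary atoms, similarity is cosine similarity, and all function names, category names, and toy vectors are hypothetical. No StyleGAN model is involved; short lists stand in for latent codes.

```python
import math
import random


def mean_vector(vectors):
    """Element-wise mean of equal-length vectors (assumed class-center estimate)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def projection_weights(embedding, dictionary):
    """Dot product of an embedding with each dictionary atom (assumed projection)."""
    return [dot(embedding, atom) for atom in dictionary]


def cosine(a, b):
    denom = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / denom if denom else 0.0


def most_similar_seen(novel_weights, seen_weights, top_n=1):
    """Rank seen categories by cosine similarity of their projection weights."""
    ranked = sorted(
        seen_weights,
        key=lambda name: cosine(novel_weights, seen_weights[name]),
        reverse=True,
    )
    return ranked[:top_n]


def generate_latent(center, irrelevant_edits, rng):
    """Class center plus one category-irrelevant edit from a similar seen category."""
    edit = rng.choice(irrelevant_edits)
    return [c + e for c, e in zip(center, edit)]


# Hypothetical toy data: three few-shot latent codes of one novel class.
few_shot = [[1.0, 0.0, 0.2], [0.8, 0.2, 0.0], [1.2, -0.2, 0.1]]
# Hypothetical category-relevant attribute dictionary with two atoms.
dictionary = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
# Hypothetical projection weights and category-irrelevant edits of seen classes.
seen_weights = {"cat_a": [0.9, 0.1], "cat_b": [0.0, 1.0]}
seen_edits = {
    "cat_a": [[0.0, 0.0, 0.5], [0.0, 0.0, -0.5]],
    "cat_b": [[0.0, 0.0, 0.1]],
}

center = mean_vector(few_shot)
weights = projection_weights(center, dictionary)
nearest = most_similar_seen(weights, seen_weights)[0]
latent = generate_latent(center, seen_edits[nearest], random.Random(0))
```

In this toy setup the generated latent code stays at the class center along the category-relevant directions and varies only along the borrowed direction, which is the intuition behind the category retention claimed in the abstract. The paper's actual estimation and selection procedures may differ.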

updated: Wed Feb 01 2023 01:51:47 GMT+0000 (UTC)

published: Wed Feb 01 2023 01:51:47 GMT+0000 (UTC)

arXiv
