MOGAN: Morphologic-structure-aware Generative Learning from a Single Image

Jinshu Chen; Qihui Xu; Qi Kang; MengChu Zhou

In most interactive image generation tasks, users specify regions of interest (ROI), and the generated results are expected to vary adequately in appearance while preserving the correct and reasonable structures of the original image. Such tasks become more challenging when only limited data is available. Recently proposed generative models can be trained on a single image, but they focus on the monolithic features of the sample and ignore the semantic information of the individual objects within it. As a result, in ROI-based generation tasks they may produce inappropriate samples with excessive randomness that fail to preserve the correct structures of the relevant objects. To address this issue, this work introduces MOGAN, a MOrphologic-structure-aware Generative Adversarial Network that produces random samples with diverse appearances and reliable structures from only one image. To train on the ROI, we propose using data obtained by augmenting the original image, together with a novel module that transforms the augmented data into knowledge covering both structure and appearance, thereby improving the model's understanding of the sample. To learn the remaining areas outside the ROI, we employ binary masks so that their generation is isolated from the ROI. Finally, we arrange these learning processes as parallel and hierarchical branches. Compared with other single-image GAN schemes, our approach focuses on internal features, including the maintenance of rational structures and the variation of appearance. Experiments confirm that our model performs better on ROI-based image generation tasks than competing methods.
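The abstract does not say how the binary masks are applied. As a rough illustration only, the sketch below shows one common way a binary mask can keep two generated regions separate: per-pixel compositing, where mask value 1 takes the pixel from an ROI sample and 0 takes it from a background sample. The function name `composite_with_mask`, the toy 2x3 grids, and the compositing rule itself are assumptions made for this example and are not taken from the paper.

```python
def composite_with_mask(roi_sample, background_sample, mask):
    """Blend two same-sized 2D grids using a binary mask (1 = ROI, 0 = elsewhere).

    Hypothetical helper for illustration; not the MOGAN implementation.
    """
    if not (len(roi_sample) == len(background_sample) == len(mask)):
        raise ValueError("all inputs must have the same number of rows")
    output = []
    for roi_row, bg_row, mask_row in zip(roi_sample, background_sample, mask):
        if not (len(roi_row) == len(bg_row) == len(mask_row)):
            raise ValueError("all inputs must have the same row length")
        # m is 0 or 1, so each output pixel comes from exactly one source.
        output.append([
            m * r + (1 - m) * b
            for r, b, m in zip(roi_row, bg_row, mask_row)
        ])
    return output


# Toy data: a constant "ROI" sample, a varied "background" sample, and a mask.
roi_sample = [[9, 9, 9], [9, 9, 9]]
background_sample = [[1, 2, 3], [4, 5, 6]]
mask = [[0, 1, 1], [0, 0, 1]]

blended = composite_with_mask(roi_sample, background_sample, mask)
print(blended)  # [[1, 9, 9], [4, 5, 9]]
```

Because the mask is strictly binary, no pixel mixes values from both sources, which is the sense in which the two regions stay isolated in this sketch.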

updated: Sun Jul 25 2021 06:54:23 GMT+0000 (UTC)

published: Thu Mar 04 2021 12:45:23 GMT+0000 (UTC)

arXiv
