DoodleFormer: Creative Sketch Drawing with Transformers

Ankan Kumar Bhunia; Salman Khan; Hisham Cholakkal; Rao Muhammad Anwer; Fahad Shahbaz Khan; Jorma Laaksonen; Michael Felsberg

Creative sketching or doodling is an expressive activity, where imaginative and previously unseen depictions of everyday visual objects are drawn. Creative sketch image generation is a challenging vision problem, where the task is to generate diverse yet realistic creative sketches possessing unseen compositions of visual-world objects. Here, we propose a novel coarse-to-fine two-stage framework, DoodleFormer, that decomposes the creative sketch generation problem into the creation of a coarse sketch composition followed by the incorporation of fine details in the sketch. We introduce graph-aware transformer encoders that effectively capture global dynamic as well as local static structural relations among different body parts. To ensure diversity of the generated creative sketches, we introduce a probabilistic coarse sketch decoder that explicitly models the variations of each sketch body part to be drawn. Experiments are performed on two creative sketch datasets: Creative Birds and Creative Creatures. Our qualitative, quantitative and human-based evaluations show that DoodleFormer outperforms the state-of-the-art on both datasets, yielding realistic and diverse creative sketches. On Creative Creatures, DoodleFormer achieves an absolute gain of 25 in terms of Fréchet inception distance (FID) over the state-of-the-art. We also demonstrate the effectiveness of DoodleFormer for the related applications of text-to-creative-sketch generation and sketch completion.
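
For readers unfamiliar with the metric quoted above, the standard definition of the Fréchet inception distance is sketched below. The abstract does not spell it out, so the notation is an assumption: $\mu_r, \Sigma_r$ and $\mu_g, \Sigma_g$ denote the mean and covariance of Inception-network features computed over real and generated sketches, respectively.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Under this definition a lower FID means the two feature distributions are closer, so the reported absolute gain of 25 would presumably correspond to a reduction in FID relative to the prior state of the art.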

updated: Sat Jul 09 2022 06:21:04 GMT+0000 (UTC)

published: Mon Dec 06 2021 18:59:59 GMT+0000 (UTC)

arXiv
