Font Completion and Manipulation by Cycling Between Multi-Modality Representations

Ye Yuan; Wuyang Chen; Zhaowen Wang; Matthew Fisher; Zhifei Zhang; Zhangyang Wang; Hailin Jin

Generating font glyphs of consistent style from one or a few reference glyphs, i.e., font completion, is an important task in typographical design. Because the problem is better defined than general image style transfer, it has received interest from both the vision and machine learning communities. Existing approaches treat it as a direct image-to-image translation task. In this work, we explore generating font glyphs as 2D graphic objects with a graph as an intermediate representation, so that more intrinsic graphic properties of font styles can be captured. Specifically, we formulate a cross-modality cycled image-to-image model structure with a graph constructor between an image encoder and an image renderer. The novel graph constructor, which is trained to help the translation task, maps a glyph's latent code to a graph representation that matches expert knowledge. Our model generates better results than both the image-to-image baseline and previous state-of-the-art methods for glyph completion. Furthermore, the graph representation output by our model provides an intuitive interface for users to do local editing and manipulation. Our proposed cross-modality cycled representation learning has the potential to be applied to other domains with prior knowledge from different data modalities. Our code is available at https://github.com/VITA-Group/Font_Completion_Graph.
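
To make the idea of a graph as an editable intermediate representation more concrete, here is a toy sketch of the last stage only: a glyph stored as nodes and edges, a rasterizer standing in for the image renderer, and a local edit that moves a single node. Everything in it (`GlyphGraph`, `render`, `move_node`, the 16x16 grid, integer pixel coordinates, the straight-line strokes) is a hypothetical illustration and is not taken from the authors' repository. The learned image encoder and graph constructor described in the abstract are omitted entirely.

```python
from dataclasses import dataclass


@dataclass
class GlyphGraph:
    """Hypothetical glyph graph: integer pixel nodes joined by straight edges."""
    nodes: list  # [(x, y), ...] integer pixel coordinates
    edges: list  # [(i, j), ...] index pairs into nodes


def render(graph, size=16):
    """Rasterize every edge as a line segment on a size x size binary grid."""
    image = [[0] * size for _ in range(size)]
    for i, j in graph.edges:
        (x0, y0), (x1, y1) = graph.nodes[i], graph.nodes[j]
        # Sample one point per pixel along the longer axis of the segment.
        steps = max(abs(x1 - x0), abs(y1 - y0), 1)
        for t in range(steps + 1):
            x = round(x0 + (x1 - x0) * t / steps)
            y = round(y0 + (y1 - y0) * t / steps)
            if 0 <= x < size and 0 <= y < size:
                image[y][x] = 1
    return image


def move_node(graph, index, dx, dy):
    """Local edit: shift one node and return a new graph; other nodes are unchanged."""
    nodes = list(graph.nodes)
    x, y = nodes[index]
    nodes[index] = (x + dx, y + dy)
    return GlyphGraph(nodes, list(graph.edges))


# A crude letter "L": a vertical stem plus a horizontal foot.
letter_l = GlyphGraph(nodes=[(4, 2), (4, 12), (11, 12)], edges=[(0, 1), (1, 2)])
edited = move_node(letter_l, 2, 0, -3)  # lift the free end of the foot by 3 pixels

before, after = render(letter_l), render(edited)
```

Moving one node changes only the stroke attached to it, while the stem renders identically in `before` and `after`. This is the kind of local manipulation that is awkward to express directly on pixels.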

updated: Mon Aug 30 2021 02:43:29 GMT+0000 (UTC)

published: Mon Aug 30 2021 02:43:29 GMT+0000 (UTC)

arXiv
