FontTransformer: Few-shot High-resolution Chinese Glyph Image Synthesis via Stacked Transformers

Yitian Liu; Zhouhui Lian

Automatically generating high-quality Chinese fonts from a few online training samples is challenging, especially when the number of samples is very small. Existing few-shot font generation methods can only synthesize low-resolution glyph images, which often have incorrect topological structures and/or incomplete strokes. To address this problem, this paper proposes FontTransformer, a novel few-shot learning model for high-resolution Chinese glyph image synthesis using stacked Transformers. The key idea is to apply a parallel Transformer to avoid the accumulation of prediction errors and a serial Transformer to enhance the quality of the synthesized strokes. We also design a novel encoding scheme that feeds more glyph information and prior knowledge to the model, which further enables the generation of high-resolution, visually pleasing glyph images. Both qualitative and quantitative experimental results demonstrate the superiority of our method over existing approaches in the few-shot Chinese font synthesis task.

updated: Thu Oct 13 2022 02:53:19 GMT+0000 (UTC)

published: Wed Oct 12 2022 15:09:22 GMT+0000 (UTC)

arXiv
