Learning Generative Structure Prior for Blind Text Image Super-resolution

Xiaoming Li; Wangmeng Zuo; Chen Change Loy

Blind text image super-resolution (SR) is challenging as one needs to cope with diverse font styles and unknown degradation. To address the problem, existing methods perform character recognition in parallel to regularize the SR task, either through a loss constraint or intermediate feature condition. Nonetheless, the high-level prior could still fail when encountering severe degradation. The problem is further compounded given characters of complex structures, e.g., Chinese characters that combine multiple pictographic or ideographic symbols into a single character. In this work, we present a novel prior that focuses more on the character structure. In particular, we learn to encapsulate rich and diverse structures in a StyleGAN and exploit such generative structure priors for restoration. To restrict the generative space of StyleGAN so that it obeys the structure of characters yet remains flexible in handling different font styles, we store the discrete features for each character in a codebook. The code subsequently drives the StyleGAN to generate high-resolution structural details to aid text SR. Compared to priors based on character recognition, the proposed structure prior exerts stronger character-specific guidance to restore faithful and precise strokes of a designated character. Extensive experiments on synthetic and real datasets demonstrate the compelling performance of the proposed generative structure prior in facilitating robust text SR.
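The codebook idea in the abstract can be illustrated with a minimal sketch: each character has one stored code, and that code is combined with a separate font-style vector to form the generator input. Everything named here is hypothetical and not taken from the paper: the class `CharacterCodebook`, the function `build_generator_input`, the code dimension, the random placeholder codes (a trained model would learn them), and the use of simple concatenation. The StyleGAN itself is omitted.

```python
import random


class CharacterCodebook:
    """Hypothetical lookup table holding one fixed-length code per character."""

    def __init__(self, charset, code_dim, seed=0):
        rng = random.Random(seed)
        # Map each character to a row index in the table.
        self.index = {ch: i for i, ch in enumerate(charset)}
        # Placeholder codes drawn at random; in a trained model these would be learned.
        self.codes = [
            [rng.gauss(0.0, 1.0) for _ in range(code_dim)] for _ in charset
        ]

    def lookup(self, ch):
        """Return the stored structure code for character `ch`."""
        return self.codes[self.index[ch]]


def build_generator_input(codebook, ch, style):
    """Concatenate the character's structure code with a font-style vector.

    Concatenation is an assumption made for illustration only.
    """
    return codebook.lookup(ch) + list(style)


codebook = CharacterCodebook(charset="ABC", code_dim=4)
style_a = [0.1, 0.2]   # hypothetical style vector for one font
style_b = [0.9, -0.3]  # hypothetical style vector for another font

z1 = build_generator_input(codebook, "B", style_a)
z2 = build_generator_input(codebook, "B", style_b)
```

In this sketch the first four entries of `z1` and `z2` are identical because both come from the same codebook entry, while the trailing style entries differ. That mirrors the stated goal of fixing the character structure while leaving the font style free to vary.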

updated: Sun Mar 26 2023 13:54:28 GMT+0000 (UTC)

published: Sun Mar 26 2023 13:54:28 GMT+0000 (UTC)

arXiv
