Cascaded Local Implicit Transformer for Arbitrary-Scale Super-Resolution

Hao-Wei Chen; Yu-Syuan Xu; Min-Fong Hong; Yi-Min Tsai; Hsien-Kai Kuo; Chun-Yi Lee

Implicit neural representations have recently shown promise for representing images at arbitrary resolutions. In this paper, we present the Local Implicit Transformer (LIT), which integrates an attention mechanism and a frequency encoding technique into a local implicit image function. We design a cross-scale local attention block to aggregate local features effectively. To further improve representational power, we propose a Cascaded LIT (CLIT) that exploits multi-scale features, along with a cumulative training strategy that gradually increases the upsampling scale during training. We conduct extensive experiments to validate the effectiveness of these components and to analyze various training strategies. Qualitative and quantitative results show that LIT and CLIT achieve favorable results and outperform prior work on arbitrary-scale super-resolution tasks.
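To make the idea of combining attention with frequency encoding in a local implicit image function more concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It queries a feature map at a continuous coordinate by attending over a small neighborhood, with a sin/cos encoding of the relative offset acting as a positional bias. All names (`frequency_encode`, `local_attention_query`, `radius`, `num_bands`) are hypothetical, and the bias (the mean of the encoding) is only a stand-in for the learned projections a real model would presumably use.

```python
import math

def frequency_encode(offset, num_bands=4):
    """Encode a relative offset (dx, dy) with sin/cos at octave-spaced frequencies."""
    feats = []
    for k in range(num_bands):
        freq = (2 ** k) * math.pi
        for d in offset:
            feats.append(math.sin(freq * d))
            feats.append(math.cos(freq * d))
    return feats

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def clamp(v, lo, hi):
    return max(lo, min(v, hi))

def local_attention_query(feature_map, x, y, radius=1):
    """Aggregate features near the continuous coordinate (x, y), in pixel units.

    feature_map is an H x W nested list of feature vectors (lists of floats).
    The query is the nearest feature; keys and values are its neighbors.
    """
    h, w = len(feature_map), len(feature_map[0])
    cx = clamp(int(math.floor(x + 0.5)), 0, w - 1)
    cy = clamp(int(math.floor(y + 0.5)), 0, h - 1)
    query = feature_map[cy][cx]
    dim = len(query)

    scores, values = [], []
    for j in range(cy - radius, cy + radius + 1):
        for i in range(cx - radius, cx + radius + 1):
            # Replicate border features when the window leaves the map.
            key = feature_map[clamp(j, 0, h - 1)][clamp(i, 0, w - 1)]
            # Assumption: a fixed scalar bias replaces a learned projection
            # of the frequency-encoded relative position.
            enc = frequency_encode((x - i, y - j))
            bias = sum(enc) / len(enc)
            similarity = sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            scores.append(similarity + bias)
            values.append(key)

    weights = softmax(scores)
    # Attention-weighted sum of the neighboring feature vectors.
    return [sum(wt * v[c] for wt, v in zip(weights, values)) for c in range(dim)]

# Usage: query a toy 3x3 map of 2-channel features at a sub-pixel location.
toy_map = [[[float(r), float(c)] for c in range(3)] for r in range(3)]
print(local_attention_query(toy_map, 1.25, 0.75))
```

Because the attention weights sum to one, the output is a convex combination of the neighboring features; in a full model, a decoder would presumably map this aggregated feature to an RGB value at the queried coordinate.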

updated: Wed Mar 29 2023 07:41:56 GMT+0000 (UTC)

published: Wed Mar 29 2023 07:41:56 GMT+0000 (UTC)

arXiv
