Enriched CNN-Transformer Feature Aggregation Networks for Super-Resolution

Jinsu Yoo; Taehoon Kim; Sihaeng Lee; Seung Hwan Kim; Honglak Lee; Tae Hyun Kim

Recent transformer-based super-resolution (SR) methods have achieved promising results compared with conventional CNN-based methods. However, these approaches suffer from an inherent shortsightedness caused by relying solely on standard self-attention-based reasoning. In this paper, we introduce an effective hybrid SR network that aggregates enriched features, including local features from CNNs and long-range multi-scale dependencies captured by transformers. Specifically, our network comprises transformer and convolutional branches, which synergistically complement each other's representations during the restoration procedure. Furthermore, we propose a cross-scale token attention module, which allows the transformer branch to efficiently exploit the informative relationships among tokens across different scales. Our proposed method achieves state-of-the-art SR results on numerous benchmark datasets.
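To make the idea of attention across token scales concrete, the following is a minimal sketch and not the authors' module. It assumes that coarse tokens are formed by averaging groups of consecutive fine tokens, and that each fine token then attends over the coarse tokens with plain scaled dot-product attention. The function names `pool_tokens` and `cross_scale_attention` are hypothetical, and learned query/key/value projections are deliberately left out.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def pool_tokens(tokens, group):
    """Hypothetical helper: build coarser tokens by averaging each run of
    `group` consecutive tokens (an assumption, not taken from the paper)."""
    pooled = []
    for i in range(0, len(tokens) - group + 1, group):
        chunk = tokens[i:i + group]
        pooled.append([sum(col) / group for col in zip(*chunk)])
    return pooled


def cross_scale_attention(small_tokens, large_tokens):
    """Each small-scale token acts as a query over all large-scale tokens,
    which serve as both keys and values (no learned projections here)."""
    d = len(small_tokens[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for q in small_tokens:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k))
                  for k in large_tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, large_tokens))
                    for j in range(d)])
    return out


# Four fine tokens of dimension 2, and two coarse tokens derived from them.
fine = [[1.0, 1.0], [3.0, 3.0], [5.0, 7.0], [7.0, 9.0]]
coarse = pool_tokens(fine, 2)
mixed = cross_scale_attention(fine, coarse)
```

Each row of `mixed` is a weighted average of the coarse tokens, so every fine token receives information summarizing a wider spatial extent than its own. A real module would add learned projections, multiple heads, and attention in both directions between scales.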

updated: Thu Oct 20 2022 06:29:16 GMT+0000 (UTC)

published: Tue Mar 15 2022 06:52:25 GMT+0000 (UTC)

arXiv
