Uformer: A General U-Shaped Transformer for Image Restoration

Zhendong Wang; Xiaodong Cun; Jianmin Bao; Wengang Zhou; Jianzhuang Liu; Houqiang Li

In this paper, we present Uformer, an effective and efficient Transformer-based architecture for image restoration, in which we build a hierarchical encoder-decoder network using the Transformer block. Uformer has two core designs. First, we introduce a novel locally-enhanced window (LeWin) Transformer block, which performs non-overlapping window-based self-attention instead of global self-attention. It significantly reduces the computational complexity on high-resolution feature maps while capturing local context. Second, we propose a learnable multi-scale restoration modulator in the form of a multi-scale spatial bias to adjust features in multiple layers of the Uformer decoder. Our modulator demonstrates superior capability for restoring details in various image restoration tasks while introducing only marginal extra parameters and computational cost. Powered by these two designs, Uformer has a high capability for capturing both local and global dependencies for image restoration. To evaluate our approach, we conduct extensive experiments on several image restoration tasks, including image denoising, motion deblurring, defocus deblurring, and deraining. Without bells and whistles, Uformer achieves superior or comparable performance compared with state-of-the-art algorithms. The code and models are available at https://github.com/ZhendongWang6/Uformer.
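
To make the two designs in the abstract concrete, here is a minimal pure-Python sketch; it is not the authors' implementation (see the repository above for that). It splits a feature map into non-overlapping windows, counts query-key pairs as a rough proxy for attention cost, and adds a bias to one window. All function names are hypothetical, the window size of 8 is an illustrative choice, and treating the modulator as a per-window additive bias is an assumption based on the abstract's phrase "spatial bias".

```python
def window_partition(feature_map, win):
    """Split an H x W grid (list of rows) into non-overlapping win x win windows."""
    h, w = len(feature_map), len(feature_map[0])
    if h % win or w % win:
        raise ValueError("H and W must be divisible by the window size")
    windows = []
    for top in range(0, h, win):
        for left in range(0, w, win):
            rows = feature_map[top:top + win]
            windows.append([row[left:left + win] for row in rows])
    return windows


def attention_pairs_global(h, w):
    """Query-key pairs when every position attends to every position."""
    n = h * w
    return n * n


def attention_pairs_windowed(h, w, win):
    """Query-key pairs when attention is restricted to each win x win window."""
    num_windows = (h // win) * (w // win)
    tokens_per_window = win * win
    return num_windows * tokens_per_window ** 2


def add_modulator(window, modulator):
    """Assumed form of the modulator: an elementwise bias added to a window."""
    return [[x + b for x, b in zip(row, bias_row)]
            for row, bias_row in zip(window, modulator)]


# Toy 4 x 4 feature map with values 0..15, split into 2 x 2 windows.
grid = [[r * 4 + c for c in range(4)] for r in range(4)]
windows = window_partition(grid, 2)
first_window = windows[0]

# Pair counts for a 256 x 256 feature map with 8 x 8 windows.
global_pairs = attention_pairs_global(256, 256)
windowed_pairs = attention_pairs_windowed(256, 256, 8)
reduction = global_pairs // windowed_pairs

# Bias applied to the first window (values are made up for illustration).
modulated = add_modulator(first_window, [[10, 10], [20, 20]])
```

Under these assumptions, the pair count drops by a factor of (H·W)/M² for window size M, which is the sense in which window-based attention reduces cost on high-resolution feature maps.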

updated: Thu Nov 25 2021 10:19:05 GMT+0000 (UTC)

published: Sun Jun 06 2021 12:33:22 GMT+0000 (UTC)

arXiv
