Cross Aggregation Transformer for Image Restoration

Zheng Chen; Yulun Zhang; Jinjin Gu; Yongbing Zhang; Linghe Kong; Xin Yuan

Recently, the Transformer architecture has been introduced into image restoration to replace convolutional neural networks (CNNs), with surprising results. Because Transformers with global attention have high computational complexity, some methods use local square windows to limit the scope of self-attention. However, these methods lack direct interaction among different windows, which limits the establishment of long-range dependencies. To address this issue, we propose a new image restoration model, the Cross Aggregation Transformer (CAT). The core of CAT is Rectangle-Window Self-Attention (Rwin-SA), which applies horizontal and vertical rectangle-window attention in different heads in parallel to expand the attention area and aggregate features across different windows. We also introduce the Axial-Shift operation to enable interaction between different windows. Furthermore, we propose the Locality Complementary Module to complement the self-attention mechanism; it incorporates the inductive bias of CNNs (e.g., translation invariance and locality) into the Transformer, enabling global-local coupling. Extensive experiments demonstrate that CAT outperforms recent state-of-the-art methods on several image restoration applications. The code and models are available at https://github.com/zhengchen1999/CAT.
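
The idea of running horizontal and vertical rectangle-window attention in separate heads can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository above for that). It assumes a single feature map of shape (H, W, C), two "heads" formed by splitting the channels in half, identity query/key/value projections, and hypothetical window sizes `short` and `long`. The function names are illustrative only, and the Axial-Shift operation and the Locality Complementary Module are omitted.

```python
import numpy as np

def window_partition(x, sh, sw):
    """Split an (H, W, C) map into non-overlapping sh x sw windows.
    Returns an array of shape (num_windows, sh * sw, C)."""
    H, W, C = x.shape
    assert H % sh == 0 and W % sw == 0, "H and W must be divisible by the window size"
    x = x.reshape(H // sh, sh, W // sw, sw, C)
    x = x.transpose(0, 2, 1, 3, 4)  # (rows of windows, cols of windows, sh, sw, C)
    return x.reshape(-1, sh * sw, C)

def window_merge(windows, sh, sw, H, W):
    """Inverse of window_partition: rebuild the (H, W, C) map."""
    C = windows.shape[-1]
    x = windows.reshape(H // sh, W // sw, sh, sw, C)
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(H, W, C)

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def window_attention(x, sh, sw):
    """Self-attention restricted to each sh x sw window.
    Assumption: Q = K = V = input tokens (no learned projections)."""
    H, W, C = x.shape
    tokens = window_partition(x, sh, sw)                       # (n, L, C)
    scores = tokens @ tokens.transpose(0, 2, 1) / np.sqrt(C)   # (n, L, L)
    out = softmax(scores) @ tokens                             # (n, L, C)
    return window_merge(out, sh, sw, H, W)

def rwin_attention_sketch(x, short=2, long=4):
    """Half of the channels attend within horizontal (short x long) windows,
    the other half within vertical (long x short) windows; results are concatenated."""
    half = x.shape[-1] // 2
    horizontal = window_attention(x[..., :half], short, long)
    vertical = window_attention(x[..., half:], long, short)
    return np.concatenate([horizontal, vertical], axis=-1)

x = np.random.default_rng(0).normal(size=(8, 8, 4))
y = rwin_attention_sketch(x)  # same shape as x
```

In this sketch each output pixel combines information from a wide-and-short window in one half of the channels and a tall-and-narrow window in the other half, so the combined attention area is cross-shaped rather than square.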

updated: Thu Mar 23 2023 11:14:35 GMT+0000 (UTC)

published: Thu Nov 24 2022 15:09:33 GMT+0000 (UTC)

arXiv
