Efficient Mixed Transformer for Single Image Super-Resolution

Ling Zheng; Jinchen Zhu; Jinpeng Shi; Shizhuang Weng

Recently, Transformer-based methods have achieved impressive results in single image super-resolution (SISR). However, their lack of a locality mechanism and their high complexity limit their application to super-resolution (SR). To address these problems, we propose a new method, the Efficient Mixed Transformer (EMT). Specifically, we propose the Mixed Transformer Block (MTB), which consists of multiple consecutive transformer layers, in some of which Self-Attention (SA) is replaced by the Pixel Mixer (PM). PM enhances local knowledge aggregation through pixel shifting operations. At the same time, PM introduces no additional complexity, since it has no parameters and requires no floating-point operations. Moreover, we employ striped windows for SA (SWSA) to model global dependencies efficiently by exploiting image anisotropy. Experimental results show that EMT outperforms existing methods on benchmark datasets and achieves state-of-the-art performance. The code is available at https://github.com/Fried-Rice-Lab/EMT.git.
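To make the two ideas in the abstract concrete, the following is a minimal, framework-free sketch of a parameter-free pixel-shifting step and of a non-square (striped) window partition. It is not the authors' implementation. The abstract only states that PM uses "pixel shifting operations" and that SWSA uses "striped windows", so everything else here is an assumption made for illustration: the split into five channel groups, the one-pixel circular (wrap-around) shift, the shift directions, and the names `roll2d`, `pixel_mixer`, `striped_windows`, and `SHIFTS`.

```python
# Illustrative sketch only; the grouping, shift size, and wrap-around
# behaviour are assumptions, not details taken from the abstract.

# Assumed (dy, dx) offset per channel group: left, right, up, down, unchanged.
SHIFTS = [(0, -1), (0, 1), (-1, 0), (1, 0), (0, 0)]


def roll2d(plane, dy, dx):
    """Circularly shift an H x W list of lists by dy rows and dx columns."""
    h, w = len(plane), len(plane[0])
    return [[plane[(y - dy) % h][(x - dx) % w] for x in range(w)]
            for y in range(h)]


def pixel_mixer(feature_map):
    """Shift each channel group in a different direction.

    feature_map is a list of C planes, each an H x W list of lists.
    Only indexing is used: there are no learned parameters and no
    floating-point arithmetic.
    """
    channels = len(feature_map)
    group_size = -(-channels // len(SHIFTS))  # ceiling division
    mixed = []
    for index, plane in enumerate(feature_map):
        dy, dx = SHIFTS[index // group_size]
        mixed.append(roll2d(plane, dy, dx))
    return mixed


def striped_windows(plane, win_h, win_w):
    """Partition a plane into non-overlapping win_h x win_w windows.

    Choosing win_h != win_w gives stripe-shaped windows; attention would
    then be computed inside each window (not shown here).
    """
    h, w = len(plane), len(plane[0])
    if h % win_h or w % win_w:
        raise ValueError("plane size must be divisible by the window size")
    return [[row[x0:x0 + win_w] for row in plane[y0:y0 + win_h]]
            for y0 in range(0, h, win_h)
            for x0 in range(0, w, win_w)]


if __name__ == "__main__":
    plane = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    mixed = pixel_mixer([plane] * 5)
    print(mixed[0])  # first group, shifted one pixel to the left

    grid = [[4 * y + x for x in range(4)] for y in range(4)]
    print(striped_windows(grid, 1, 4))  # four horizontal stripes
```

After the shift, each spatial position holds values that originally belonged to its neighbours in different channel groups, so a subsequent channel-mixing layer can combine local context without any extra cost in the shifting step itself. In a tensor library the same effect would typically be obtained with that library's roll operation.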

updated: Mon May 22 2023 09:07:22 GMT+0000 (UTC)

published: Fri May 19 2023 03:19:38 GMT+0000 (UTC)

arXiv

