Fast and High-Quality Image Denoising via Malleable Convolutions

Yifan Jiang; Bart Wronski; Ben Mildenhall; Jon Barron; Zhangyang Wang; Tianfan Xue

Many image processing networks apply a single set of static convolutional kernels across the entire input image, which is sub-optimal for natural images because they often consist of heterogeneous visual patterns. Recent work in classification, segmentation, and image restoration has demonstrated that dynamic kernels outperform static kernels at modeling local image statistics. However, these methods often adopt per-pixel convolution kernels, which introduce high memory and computation costs. To achieve spatially varying processing without significant overhead, we present Malleable Convolution (MalleConv), an efficient variant of dynamic convolution. The weights of MalleConv are dynamically produced by an efficient predictor network capable of generating content-dependent outputs at specific spatial locations. Unlike previous works, MalleConv generates a much smaller set of spatially varying kernels from the input, which enlarges the network's receptive field and significantly reduces computational and memory costs. These kernels are then applied to a full-resolution feature map through an efficient slice-and-conv operator with minimal memory overhead. We further build an efficient denoising network using MalleConv, which we call MalleNet. It achieves high-quality results without a very deep architecture, e.g., running 8.91x faster than the best-performing denoising algorithm (SwinIR) while maintaining similar performance. We also show that a single MalleConv added to a standard convolution-based backbone can significantly reduce the computational cost or boost image quality at a similar cost. Project page: https://yifanjiang.net/MalleConv.html
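
To make the idea of applying a small set of spatially varying kernels to a full-resolution map more concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function name `slice_and_conv`, the single-channel input, the nearest-cell kernel lookup, and the edge padding are all assumptions chosen for brevity, and the kernel grid is passed in directly rather than produced by a predictor network.

```python
import numpy as np

def slice_and_conv(image, kernel_grid):
    """Apply a coarse grid of k x k kernels to a full-resolution image.

    image:       (H, W) float array.
    kernel_grid: (gh, gw, k, k) float array, one kernel per coarse cell,
                 with k odd.
    Each pixel uses the kernel of the coarse cell it falls in. This
    nearest-cell lookup is an assumption made for this sketch.
    """
    H, W = image.shape
    gh, gw, k, _ = kernel_grid.shape
    pad = k // 2
    padded = np.pad(image, pad, mode="edge")

    # Map every pixel row and column to its coarse grid cell.
    rows = np.arange(H) * gh // H
    cols = np.arange(W) * gw // W

    out = np.zeros((H, W), dtype=float)
    for dy in range(k):
        for dx in range(k):
            # Per-pixel weight for this kernel tap, read from the coarse grid.
            w = kernel_grid[rows[:, None], cols[None, :], dy, dx]
            out += w * padded[dy:dy + H, dx:dx + W]
    return out

# Usage: a 2 x 2 grid of 3 x 3 kernels applied to a 4 x 4 image.
image = np.arange(16, dtype=float).reshape(4, 4)
grid = np.zeros((2, 2, 3, 3))
grid[:, :, 1, 1] = 1.0  # every cell holds an identity kernel
result = slice_and_conv(image, grid)  # equals `image` for identity kernels
```

In this sketch only `gh * gw` kernels are stored, rather than one kernel per pixel (`H * W`), which illustrates where the memory saving described in the abstract could come from.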

updated: Tue Jan 04 2022 04:05:01 GMT+0000 (UTC)

published: Sun Jan 02 2022 18:35:20 GMT+0000 (UTC)

arXiv
