DiffIR: Efficient Diffusion Model for Image Restoration

Bin Xia; Yulun Zhang; Shiyin Wang; Yitong Wang; Xinglong Wu; Yapeng Tian; Wenming Yang; Luc Van Gool

Diffusion models (DMs) have achieved SOTA performance by modeling the image synthesis process as a sequential application of a denoising network. However, unlike image synthesis, image restoration (IR) is strongly constrained to generate results consistent with the ground truth. Thus, for IR, it is inefficient for traditional DMs to run massive iterations on a large model to estimate whole images or feature maps. To address this issue, we propose an efficient DM for IR (DiffIR), which consists of a compact IR prior extraction network (CPEN), a dynamic IR transformer (DIRformer), and a denoising network. Specifically, DiffIR has two training stages: pretraining and DM training. In pretraining, we input ground-truth images into CPEN_S1 to capture a compact IR prior representation (IPR) that guides DIRformer. In the second stage, we train the DM to directly estimate the same IPR as the pretrained CPEN_S1 using only low-quality (LQ) images. We observe that, since the IPR is only a compact vector, DiffIR can use fewer iterations than traditional DMs to obtain accurate estimations and generate more stable and realistic results. Since the iterations are few, DiffIR can jointly optimize CPEN_S2, DIRformer, and the denoising network, which further reduces the influence of estimation error. We conduct extensive experiments on several IR tasks and achieve SOTA performance at lower computational cost. Code is available at https://github.com/Zj-BinXia/DiffIR.
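To make the second stage more concrete, the following is a minimal sketch of running a diffusion process over a compact vector rather than a whole image. It is not the authors' implementation: the 16-dimensional toy IPR, the 4-step linear schedule, the deterministic reverse update, and every function name are assumptions chosen for illustration. In particular, `oracle_eps` is a hypothetical stand-in for the trained denoising network; it cheats by looking at the target IPR, whereas a real denoiser would see only features derived from the LQ image.

```python
import math
import random

def make_schedule(num_steps, beta_start=0.1, beta_end=0.99):
    """Toy linear beta schedule; returns per-step alphas and their running products."""
    alphas, alpha_bars, running = [], [], 1.0
    for i in range(num_steps):
        frac = i / max(num_steps - 1, 1)
        alpha = 1.0 - (beta_start + (beta_end - beta_start) * frac)
        running *= alpha
        alphas.append(alpha)
        alpha_bars.append(running)
    return alphas, alpha_bars

def diffuse(z0, alpha_bar, rng):
    """Closed-form forward process: z_T = sqrt(abar) * z0 + sqrt(1 - abar) * noise."""
    return [math.sqrt(alpha_bar) * z + math.sqrt(1.0 - alpha_bar) * rng.gauss(0.0, 1.0)
            for z in z0]

def oracle_eps(z_t, z0, alpha_bar):
    """Hypothetical stand-in for the denoising network (assumption: it sees z0)."""
    return [(zt - math.sqrt(alpha_bar) * z) / math.sqrt(1.0 - alpha_bar)
            for zt, z in zip(z_t, z0)]

def reverse(z_T, z0, alphas, alpha_bars):
    """Deterministic reverse pass over the compact vector, one update per step."""
    z = list(z_T)
    for t in range(len(alphas) - 1, -1, -1):
        eps = oracle_eps(z, z0, alpha_bars[t])
        scale = (1.0 - alphas[t]) / math.sqrt(1.0 - alpha_bars[t])
        z = [(zi - scale * e) / math.sqrt(alphas[t]) for zi, e in zip(z, eps)]
    return z

rng = random.Random(0)
ipr_target = [rng.uniform(-1.0, 1.0) for _ in range(16)]  # toy stand-in for the stage-one IPR
alphas, alpha_bars = make_schedule(num_steps=4)           # few iterations, arbitrary toy value
z_T = diffuse(ipr_target, alpha_bars[-1], rng)
ipr_estimate = reverse(z_T, ipr_target, alphas, alpha_bars)
max_err = max(abs(a - b) for a, b in zip(ipr_estimate, ipr_target))
```

Because each reverse step here touches only a short vector, the whole loop is cheap. That is the intuition behind the abstract's remark that few iterations make it practical to optimize the denoiser jointly with the other networks.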

updated: Wed Aug 16 2023 14:36:41 GMT+0000 (UTC)

published: Thu Mar 16 2023 16:47:14 GMT+0000 (UTC)

arXiv
