Frequency Separation for Real-World Super-Resolution

Manuel Fritsche; Shuhang Gu; Radu Timofte

Most of the recent literature on image super-resolution (SR) assumes either the availability of training data in the form of paired low-resolution (LR) and high-resolution (HR) images or knowledge of the downgrading operator (usually bicubic downscaling). While the proposed methods perform well on standard benchmarks, they often fail to produce convincing results in real-world settings. This is because real-world images can be subject to corruptions such as sensor noise, which are severely altered by bicubic downscaling. As a result, the models never see a real-world image during training, which limits their generalization capabilities. Moreover, it is cumbersome to collect paired LR and HR images in the same source domain. To address this problem, we propose DSGAN to introduce natural image characteristics into bicubically downscaled images. It can be trained in an unsupervised fashion on HR images, thereby generating LR images with the same characteristics as the original images. We then use the generated data to train an SR model, which greatly improves its performance on real-world images. Furthermore, we propose to separate the low and high image frequencies and treat them differently during training. Since the low frequencies are preserved by downsampling operations, we only require adversarial training to modify the high frequencies. This idea is applied to our DSGAN model as well as the SR model. We demonstrate the effectiveness of our method in several experiments through quantitative and qualitative analysis. Our solution is the winner of the AIM Challenge on Real World SR at ICCV 2019.
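
The frequency separation described in the abstract can be illustrated with a minimal sketch. The abstract does not say which filter is used, so the box (mean) filter, the kernel size of 5, and the function names `box_low_pass` and `split_frequencies` are assumptions made here for illustration only, not the authors' implementation. The idea shown is simply that an image is split into a blurred low-frequency part and a high-frequency residual, and that the residual is the part an adversarial loss would operate on.

```python
import numpy as np


def box_low_pass(image: np.ndarray, kernel_size: int = 5) -> np.ndarray:
    """Blur a 2-D image with a mean filter (assumed filter; odd kernel_size)."""
    pad = kernel_size // 2
    # Replicate border pixels so the output keeps the input shape.
    padded = np.pad(image.astype(np.float64), pad, mode="edge")
    height, width = image.shape
    out = np.zeros((height, width), dtype=np.float64)
    # Sum every shifted copy of the image covered by the kernel window.
    for dy in range(kernel_size):
        for dx in range(kernel_size):
            out += padded[dy:dy + height, dx:dx + width]
    return out / (kernel_size * kernel_size)


def split_frequencies(image: np.ndarray, kernel_size: int = 5):
    """Return (low, high) such that low + high reproduces the image."""
    low = box_low_pass(image, kernel_size)
    high = image.astype(np.float64) - low  # residual holds the fine detail
    return low, high
```

Under these assumptions, a training loop would pass only `high` to a discriminator, while `low` stays close to the downscaled input because downsampling largely preserves it. How the low-frequency part is supervised is not stated in the abstract.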

updated: Mon Nov 18 2019 17:08:28 GMT+0000 (UTC)

published: Mon Nov 18 2019 17:08:28 GMT+0000 (UTC)

arXiv
