Learning GAN-based Foveated Reconstruction to Recover Perceptually Important Image Features

Luca Surace; Marek Wernikowski; Cara Tursun; Karol Myszkowski; Radosław Mantiuk; Piotr Didyk

A foveated image can be fully reconstructed from a sparse set of samples distributed according to the retinal sensitivity of the human visual system, which decreases rapidly with increasing eccentricity. Generative Adversarial Networks (GANs) have recently been shown to be a promising solution for this task, as they can successfully hallucinate missing image information. As with other supervised learning approaches, the definition of the loss function and the training strategy heavily influence the quality of the output. In this work, we consider the problem of efficiently guiding the training of foveated reconstruction techniques so that they are more aware of the capabilities and limitations of the human visual system and can therefore reconstruct visually important image features. Our primary goal is to make the training procedure less sensitive to distortions that humans cannot detect and to focus on penalizing perceptually important artifacts. Given the nature of GAN-based solutions, we focus on the sensitivity of human vision to hallucination when the input samples have different densities. We propose psychophysical experiments, a dataset, and a procedure for training foveated image reconstruction. The proposed strategy gives the generator network flexibility by penalizing only perceptually important deviations in the output. As a result, the method emphasizes the recovery of perceptually important image features. We evaluated our strategy and compared it with alternative solutions using a newly trained objective metric, a recent foveated video quality metric, and user experiments. Our evaluations revealed significant improvements in perceived image reconstruction quality compared with the standard GAN-based training approach.
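To make the idea of eccentricity-dependent sparse sampling concrete, the following is a minimal sketch of how an input mask for foveated reconstruction might be generated. The function name `foveated_sampling_mask`, its parameters, and the hyperbolic density falloff are all illustrative assumptions; the abstract does not specify the sampling scheme actually used, and pixel distance from the gaze point serves here only as a stand-in for eccentricity in visual degrees.

```python
import numpy as np

def foveated_sampling_mask(height, width, gaze_xy, peak_density=1.0,
                           falloff=0.02, min_density=0.01, seed=0):
    """Return a boolean mask of retained pixels, denser near the gaze point.

    The density model (hyperbolic falloff with pixel distance) is an
    illustrative assumption, not the sampling scheme used in the paper.
    """
    rng = np.random.default_rng(seed)
    ys, xs = np.mgrid[0:height, 0:width]
    gx, gy = gaze_xy
    # Pixel distance from the gaze point, used as a proxy for eccentricity.
    ecc = np.hypot(xs - gx, ys - gy)
    # Sampling probability decreases with eccentricity and is clamped
    # so that the far periphery still receives a few samples.
    density = np.clip(peak_density / (1.0 + falloff * ecc), min_density, 1.0)
    # Keep each pixel independently with probability equal to its density.
    return rng.random((height, width)) < density

# Hypothetical usage: a 256x256 frame with the gaze at the image centre.
mask = foveated_sampling_mask(256, 256, gaze_xy=(128, 128))
```

A reconstruction network of the kind described above would receive only the pixels where `mask` is true and would have to hallucinate the rest, with most of the missing information lying in the periphery.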

updated: Mon Apr 17 2023 16:42:28 GMT+0000 (UTC)

published: Sat Aug 07 2021 18:39:49 GMT+0000 (UTC)

arXiv
