Cross-Resolution Flow Propagation for Foveated Video Super-Resolution

Eugene Lee; Lien-Feng Hsu; Evan Chen; Chen-Yi Lee

The demand for high-resolution video content has grown over the years. However, the delivery of high-resolution video is constrained either by the computational resources required for rendering or by the network bandwidth available for remote transmission. To remedy this limitation, we leverage the eye trackers found alongside existing augmented and virtual reality headsets. We propose applying video super-resolution (VSR) techniques to fuse low-resolution context with regional high-resolution context, enabling resource-constrained consumption of high-resolution content without a perceivable drop in quality. Eye trackers provide the gaze direction of a user, which aids the extraction of the regional high-resolution context. As only pixels that fall within the gaze region can be resolved by the human eye, a large amount of the delivered content is redundant: viewers cannot perceive differences in quality outside the observed region. To generate a visually pleasing frame from the fusion of a high-resolution region and a low-resolution region, we study the capability of a deep neural network to transfer the context of the observed region to other (low-resolution) regions of the current and future frames. We label this task Foveated Video Super-Resolution (FVSR), as we need to super-resolve the low-resolution regions of current and future frames through the fusion of pixels from the gaze region. We propose Cross-Resolution Flow Propagation (CRFP) for FVSR. We train and evaluate CRFP on the REDS dataset on the task of 8x FVSR, i.e., a combination of 8x VSR and the fusion of the foveated region. Departing from the conventional evaluation of per-frame quality using SSIM or PSNR, we propose the evaluation of the past foveated region, measuring the capability of a model to leverage the noise present in eye trackers during FVSR. Code is available at https://github.com/eugenelet/CRFP.
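To make the FVSR input setting concrete, the following is a minimal sketch of how the two inputs described in the abstract (an 8x low-resolution frame and a high-resolution patch around the gaze point) might be built and naively combined. It is not CRFP and is not taken from the authors' repository: the function names, the block-average downsampling, the square patch shape, and the nearest-neighbour paste baseline are all assumptions made for illustration.

```python
import numpy as np

SCALE = 8  # 8x FVSR, as stated in the abstract


def downsample(frame: np.ndarray, scale: int = SCALE) -> np.ndarray:
    """Block-average an (H, W, C) frame. H and W must be multiples of scale.
    (Assumed degradation; the actual pipeline may use a different kernel.)"""
    h, w, c = frame.shape
    return frame.reshape(h // scale, scale, w // scale, scale, c).mean(axis=(1, 3))


def crop_fovea(frame: np.ndarray, gaze_xy: tuple, size: int):
    """Cut a size x size high-resolution patch centred on the gaze point,
    shifted inwards where it would otherwise cross the frame border."""
    h, w, _ = frame.shape
    x, y = gaze_xy
    top = int(np.clip(y - size // 2, 0, h - size))
    left = int(np.clip(x - size // 2, 0, w - size))
    return frame[top:top + size, left:left + size], (top, left)


def naive_fuse(low_res: np.ndarray, patch: np.ndarray, top_left: tuple,
               scale: int = SCALE) -> np.ndarray:
    """Hypothetical baseline: nearest-neighbour upsample the low-resolution
    frame, then paste the foveated patch on top. No learning is involved."""
    up = low_res.repeat(scale, axis=0).repeat(scale, axis=1)
    top, left = top_left
    up[top:top + patch.shape[0], left:left + patch.shape[1]] = patch
    return up


def psnr(a: np.ndarray, b: np.ndarray, peak: float = 1.0) -> float:
    """PSNR in dB for images with values in [0, peak]."""
    mse = np.mean((a - b) ** 2)
    return float("inf") if mse == 0 else float(10 * np.log10(peak ** 2 / mse))


# Usage on a synthetic frame with values in [0, 1].
rng = np.random.default_rng(0)
frame = rng.random((64, 128, 3))
low = downsample(frame)                                  # shape (8, 16, 3)
patch, top_left = crop_fovea(frame, gaze_xy=(100, 20), size=32)
fused = naive_fuse(low, patch, top_left)                 # shape (64, 128, 3)
print(psnr(fused, frame))
```

Restricting `psnr` to the patch location of an earlier gaze point, measured on a later frame, gives a rough analogue of evaluating a past foveated region. This naive baseline carries no high-resolution detail forward in time, which is the gap a propagation-based model is meant to close.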

updated: Tue Dec 27 2022 15:38:38 GMT+0000 (UTC)

published: Tue Dec 27 2022 15:38:38 GMT+0000 (UTC)

arXiv
