EffLiFe: Efficient Light Field Generation via Hierarchical Sparse Gradient Descent

Yijie Deng; Lei Han; Tianpeng Lin; Lin Li; Jinzhi Zhang; Lu Fang

With the rise of Extended Reality (XR) technology, there is a growing need for real-time light field generation from sparse view inputs. Existing methods fall into two groups: offline techniques, which generate high-quality novel views at the cost of long training or inference time, and online methods, which either lack generalizability or produce unsatisfactory results. We observe, however, that the intrinsic sparse manifold of Multi-plane Images (MPI) allows light field generation to be accelerated significantly while maintaining rendering quality. Based on this insight, we introduce EffLiFe, a novel light field optimization method that leverages the proposed Hierarchical Sparse Gradient Descent (HSGD) to produce high-quality light fields from sparse view images in real time. Technically, a coarse MPI of the scene is first generated using a 3D CNN and is then sparsely optimized over a few iterations by focusing only on important MPI gradients. Relying solely on optimization, however, can lead to artifacts at occlusion boundaries. We therefore propose an occlusion-aware iterative refinement module that removes visual artifacts in occluded regions by iteratively filtering the input. Extensive experiments demonstrate that our method achieves visual quality comparable to state-of-the-art offline methods while being 100x faster on average, and outperforms other online approaches by about 2 dB in PSNR.
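To make the two ideas in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the standard back-to-front "over" compositing commonly used to render an MPI, and it stands in for sparse gradient descent with a simple top-k gradient mask. The function names (`composite_mpi`, `sparse_gradient_step`) and the parameters `keep_ratio` and `lr` are hypothetical, and the hierarchical part of HSGD is not modeled here.

```python
import numpy as np


def composite_mpi(rgb, alpha):
    """Render an MPI by back-to-front 'over' compositing.

    rgb:   (D, H, W, 3) colour planes, plane 0 is the farthest (assumed layout).
    alpha: (D, H, W, 1) opacity planes in [0, 1].
    """
    out = np.zeros(rgb.shape[1:])
    for colour, a in zip(rgb, alpha):
        # Each nearer plane covers what has been accumulated so far.
        out = colour * a + out * (1.0 - a)
    return out


def sparse_gradient_step(params, grad, keep_ratio=0.1, lr=0.5):
    """Gradient step restricted to the largest-magnitude gradient entries.

    Illustrative stand-in for 'focusing only on important MPI gradients':
    only the top `keep_ratio` fraction of entries (by |grad|) is updated.
    """
    k = max(1, int(keep_ratio * grad.size))
    magnitudes = np.abs(grad).ravel()
    # k-th largest magnitude serves as the keep threshold.
    threshold = np.partition(magnitudes, -k)[-k]
    mask = np.abs(grad) >= threshold
    return params - lr * grad * mask, mask


if __name__ == "__main__":
    # Two 1x1 planes: opaque red far plane, half-transparent blue near plane.
    rgb = np.array([[[[1.0, 0.0, 0.0]]], [[[0.0, 0.0, 1.0]]]])
    alpha = np.array([[[[1.0]]], [[[0.5]]]])
    print(composite_mpi(rgb, alpha))

    params = np.zeros(4)
    grad = np.array([0.1, -2.0, 0.05, 1.0])
    print(sparse_gradient_step(params, grad, keep_ratio=0.5))
```

In this sketch the mask is recomputed at every step, so the set of updated entries can change between iterations. How EffLiFe actually selects and organizes the important gradients is described in the paper itself, not here.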

updated: Thu Jul 06 2023 14:31:01 GMT+0000 (UTC)

published: Thu Jul 06 2023 14:31:01 GMT+0000 (UTC)

arXiv
