DäRF: Boosting Radiance Fields from Sparse Inputs with Monocular Depth Adaptation

Jiuhn Song; Seonghoon Park; Honggyu An; Seokju Cho; Min-Seop Kwak; Sungjin Cho; Seungryong Kim

Neural radiance fields (NeRF) show powerful performance in novel view synthesis and 3D geometry reconstruction, but they suffer critical performance degradation when the number of known viewpoints is drastically reduced. Existing works attempt to overcome this problem by employing external priors, but their success is limited to certain types of scenes or datasets. Monocular depth estimation (MDE) networks, pretrained on large-scale RGB-D datasets and offering powerful generalization capability, could be key to solving this problem; however, using MDE in conjunction with NeRF comes with a new set of challenges due to the various ambiguity problems exhibited by monocular depths. In this light, we propose a novel framework, dubbed DäRF, that achieves robust NeRF reconstruction with a handful of real-world images by combining the strengths of NeRF and monocular depth estimation through online complementary training. Our framework imposes the MDE network's powerful geometry prior on the NeRF representation at both seen and unseen viewpoints to enhance its robustness and coherence. In addition, we overcome the ambiguity problems of monocular depths through patch-wise scale-shift fitting and geometry distillation, which adapts the MDE network to produce depths aligned accurately with NeRF geometry. Experiments show that our framework achieves state-of-the-art results both quantitatively and qualitatively, demonstrating consistent and reliable performance on both indoor and outdoor real-world datasets. The project page is available at https://ku-cvlab.github.io/DaRF/.
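
The abstract names patch-wise scale-shift fitting without giving its formulation. As a rough illustration only, the sketch below assumes the common closed-form least-squares alignment: for each patch, find a scale s and shift t that minimize the squared difference between s * d + t (monocular depth d) and the NeRF-rendered depth z. The function names `fit_scale_shift` and `align_patches` and the flat-list patch layout are hypothetical and are not taken from the paper or its code.

```python
from typing import List, Sequence, Tuple


def fit_scale_shift(mde: Sequence[float], nerf: Sequence[float]) -> Tuple[float, float]:
    """Return (s, t) minimizing sum((s * mde[i] + t - nerf[i]) ** 2) over one patch.

    Assumption: ordinary least squares; the paper may use a different objective.
    """
    n = len(mde)
    if n == 0 or n != len(nerf):
        raise ValueError("patches must be non-empty and of equal length")
    mean_d = sum(mde) / n
    mean_z = sum(nerf) / n
    var_d = sum((d - mean_d) ** 2 for d in mde)
    if var_d == 0.0:
        # Constant monocular depth: the scale is unidentifiable, so fit only the shift.
        return 1.0, mean_z - mean_d
    cov = sum((d - mean_d) * (z - mean_z) for d, z in zip(mde, nerf))
    s = cov / var_d
    t = mean_z - s * mean_d
    return s, t


def align_patches(
    mde_patches: Sequence[Sequence[float]],
    nerf_patches: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Fit a separate (s, t) per patch and return the aligned monocular depths."""
    aligned = []
    for d_patch, z_patch in zip(mde_patches, nerf_patches):
        s, t = fit_scale_shift(d_patch, z_patch)
        aligned.append([s * d + t for d in d_patch])
    return aligned
```

Fitting per patch rather than once per image lets each local region absorb its own scale and shift, which is one plausible reading of why the abstract describes the fitting as patch-wise.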

updated: Tue May 30 2023 16:46:41 GMT+0000 (UTC)

published: Tue May 30 2023 16:46:41 GMT+0000 (UTC)

arXiv
