NeuralMVS: Bridging Multi-View Stereo and Novel View Synthesis

Radu Alexandru Rosu; Sven Behnke

Multi-View Stereo (MVS) is a core task in 3D computer vision. With the surge of novel deep learning methods, learned MVS has surpassed the accuracy of classical approaches, but it still relies on building a memory-intensive dense cost volume. Novel View Synthesis (NVS) is a parallel line of research and has recently seen an increase in popularity with Neural Radiance Field (NeRF) models, which optimize a per-scene radiance field. However, NeRF methods do not generalize to novel scenes and are slow to train and test. We propose to bridge the gap between these two methodologies with a novel network that can recover 3D scene geometry as a distance function, together with high-resolution color images. Our method uses only a sparse set of images as input and can generalize well to novel scenes. Additionally, we propose a coarse-to-fine sphere tracing approach in order to significantly increase speed. We show on various datasets that our method reaches comparable accuracy to per-scene optimized methods while generalizing to novel scenes and running significantly faster.
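
The abstract refers to sphere tracing over a distance function. As a rough illustration of that general idea only, the sketch below marches a ray through an analytic signed distance function. It is not the authors' network and not their coarse-to-fine variant: the names `sphere_sdf` and `sphere_trace`, the test scene (a unit sphere centered at z = 3), and all parameter values are assumptions chosen for the example.

```python
import math

def sphere_sdf(p, center=(0.0, 0.0, 3.0), radius=1.0):
    """Signed distance from point p to a sphere (hypothetical test scene)."""
    return math.dist(p, center) - radius

def sphere_trace(origin, direction, sdf, max_steps=64, eps=1e-4, t_max=100.0):
    """March along a ray, advancing by the SDF value at each step.

    Returns the depth t along the ray where the surface is reached,
    or None if the ray leaves the scene or the step budget runs out.
    """
    # Normalize the direction so that t measures Euclidean distance.
    norm = math.sqrt(sum(c * c for c in direction))
    d = tuple(c / norm for c in direction)

    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * c for o, c in zip(origin, d))
        dist = sdf(p)
        if dist < eps:   # close enough to the surface: report a hit
            return t
        t += dist        # the SDF value is a safe step size
        if t > t_max:    # ray has left the region of interest
            return None
    return None

hit = sphere_trace((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), sphere_sdf)
miss = sphere_trace((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), sphere_sdf)
```

In this toy scene, a ray cast from the origin along +z reaches the surface at depth 2.0, while a ray along +y misses and returns `None`. In a learned setting, the analytic `sphere_sdf` would presumably be replaced by a network prediction, which is where reducing the number of evaluations per ray would matter for speed.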

updated: Mon Aug 09 2021 08:59:24 GMT+0000 (UTC)

published: Mon Aug 09 2021 08:59:24 GMT+0000 (UTC)

arXiv
