Self-Supervised Visibility Learning for Novel View Synthesis

Yujiao Shi; Hongdong Li; Xin Yu

We address the problem of novel view synthesis (NVS) from a few sparse source-view images. Conventional image-based rendering methods estimate scene geometry and synthesize novel views in two separate steps. However, errors in geometry estimation degrade NVS performance, because view synthesis depends heavily on the quality of the estimated scene geometry. In this paper, we propose an end-to-end NVS framework to eliminate this error propagation issue. Specifically, we construct a volume under the target view and design a source-view visibility estimation (SVE) module to determine the visibility of the target-view voxels in each source view. Next, we aggregate the visibility of all source views to obtain a consensus volume. Each voxel in the consensus volume indicates a surface existence probability. Then, we present a soft ray-casting (SRC) mechanism to find the frontmost surface in the target view (i.e., its depth). Our SRC traverses the consensus volume along viewing rays and estimates a depth probability distribution. We then warp and aggregate source-view pixels to synthesize a novel view based on the estimated source-view visibility and target-view depth. Finally, our network is trained in an end-to-end self-supervised fashion, thus significantly alleviating error accumulation in view synthesis. Experimental results demonstrate that our method generates novel views of higher quality than state-of-the-art methods.
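The pipeline described in the abstract (per-view visibility, consensus volume, soft ray-casting, visibility-weighted blending) can be illustrated with a small NumPy sketch. This is only a hand-written approximation under stated assumptions, not the authors' implementation: the abstract does not specify how visibility is aggregated or how SRC is computed, so the mean aggregation, the closed-form "first hit" weighting, and all function names and array shapes below are hypothetical. In the paper these components are part of a trained network.

```python
import numpy as np

def consensus_volume(visibility):
    """Aggregate per-source visibility into a consensus volume.

    visibility: (N, D, H, W) values in [0, 1] for N source views and
    D depth planes ordered near to far. Mean aggregation is an assumption.
    """
    return visibility.mean(axis=0)

def soft_ray_cast(consensus, depths):
    """Turn surface probabilities into a per-pixel depth distribution.

    Assumed closed form: the weight of plane d is the probability that a
    surface exists at d and that no surface was hit on any nearer plane.
    """
    # Probability that the ray is still unoccluded when it reaches plane d.
    free = np.cumprod(1.0 - consensus, axis=0)
    free = np.concatenate([np.ones_like(consensus[:1]), free[:-1]], axis=0)
    weights = consensus * free
    # Normalise along the ray so each pixel gets a proper distribution.
    weights = weights / np.clip(weights.sum(axis=0, keepdims=True), 1e-8, None)
    expected_depth = np.einsum("d,dhw->hw", depths, weights)
    return weights, expected_depth

def blend_sources(warped, visibility, depth_prob):
    """Blend warped source colours into the target view.

    warped: (N, D, H, W, 3) source colours warped onto each target depth
    plane. Each sample is weighted by its source-view visibility and by
    the target-view depth probability.
    """
    w = visibility * depth_prob[None]
    w = w / np.clip(w.sum(axis=(0, 1), keepdims=True), 1e-8, None)
    return (w[..., None] * warped).sum(axis=(0, 1))

# Toy example: 2 source views, 3 depth planes, a single pixel.
visibility = np.array([[0.0, 1.0, 0.0], [0.0, 1.0, 1.0]]).reshape(2, 3, 1, 1)
depths = np.array([1.0, 2.0, 3.0])
consensus = consensus_volume(visibility)
depth_prob, depth_map = soft_ray_cast(consensus, depths)
warped = np.full((2, 3, 1, 1, 3), 0.5)
image = blend_sources(warped, visibility, depth_prob)
```

In the toy example both views agree that the second plane is visible, so the depth distribution concentrates on that plane and the farther plane, seen by only one view, is treated as occluded.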

updated: Sun Apr 04 2021 06:13:53 GMT+0000 (UTC)

published: Mon Mar 29 2021 08:11:25 GMT+0000 (UTC)

arXiv
