End-to-end View Synthesis via NeRF Attention

Zelin Zhao; Jiaya Jia


In this paper, we present a simple seq2seq formulation for view synthesis where we take a set of ray points as input and output colors corresponding to the rays. Directly applying a standard transformer on this seq2seq formulation has two limitations. First, the standard attention cannot successfully fit the volumetric rendering procedure, and therefore high-frequency components are missing in the synthesized views. Second, applying global attention to all rays and pixels is extremely inefficient. Inspired by the neural radiance field (NeRF), we propose the NeRF attention (NeRFA) to address the above problems. On the one hand, NeRFA considers the volumetric rendering equation as a soft feature modulation procedure. In this way, the feature modulation enhances the transformers with the NeRF-like inductive bias. On the other hand, NeRFA performs multi-stage attention to reduce the computational overhead. Furthermore, the NeRFA model adopts the ray and pixel transformers to learn the interactions between rays and pixels. NeRFA demonstrates superior performance over NeRF and NerFormer on four datasets: DeepVoxels, Blender, LLFF, and CO3D. Besides, NeRFA establishes a new state-of-the-art under two settings: the single-scene view synthesis and the category-centric novel view synthesis. The code will be made publicly available.
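The phrase "soft feature modulation" can be made concrete with the standard NeRF volume rendering quadrature, in which each sample point i along a ray receives a weight w_i = T_i (1 - exp(-sigma_i * delta_i)), with T_i the transmittance accumulated before that point. The sketch below computes those weights and uses them to scale and pool per-point feature vectors. It is only an illustration of the general idea under that assumption, not the NeRFA implementation; the function names `render_weights` and `modulate_and_pool` and the toy inputs are hypothetical and do not come from the paper.

```python
import math


def render_weights(sigmas, deltas):
    """Standard NeRF quadrature weights: w_i = T_i * (1 - exp(-sigma_i * delta_i))."""
    weights = []
    transmittance = 1.0  # nothing occludes the first sample on the ray
    for sigma, delta in zip(sigmas, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this ray segment
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha  # light remaining for later samples
    return weights


def modulate_and_pool(features, weights):
    """Scale each per-point feature vector by its weight, then sum along the ray.

    Assumption for illustration: the weights act as a soft gate on features,
    in the same way they gate colors in ordinary volume rendering.
    """
    dim = len(features[0])
    pooled = [0.0] * dim
    for feat, w in zip(features, weights):
        for k in range(dim):
            pooled[k] += w * feat[k]
    return pooled


# Toy ray with three sample points (densities and spacings are made up).
sigmas = [0.0, 1.0, 5.0]
deltas = [0.5, 0.5, 0.5]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

weights = render_weights(sigmas, deltas)
pooled = modulate_and_pool(features, weights)
```

Because the weights telescope, their sum equals 1 - exp(-sum_i sigma_i * delta_i), so it never exceeds 1, and an empty region (sigma = 0) contributes nothing to the pooled feature.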

updated: Fri Jul 29 2022 15:26:16 GMT+0000 (UTC)

published: Fri Jul 29 2022 15:26:16 GMT+0000 (UTC)

arXiv

