Masked Wavelet Representation for Compact Neural Radiance Fields

Daniel Rho; Byeonghyeon Lee; Seungtae Nam; Joo Chan Lee; Jong Hwan Ko; Eunbyung Park

Neural radiance fields (NeRF) have demonstrated the potential of coordinate-based neural representations (neural fields, or implicit neural representations) in neural rendering. However, using a multi-layer perceptron (MLP) to represent a 3D scene or object requires enormous computational resources and time. Recent studies have reduced these computational inefficiencies by using additional data structures, such as grids or trees. Despite their promising performance, these explicit data structures require a substantial amount of memory. In this work, we present a method to reduce the size of such representations without compromising the advantages of the additional data structures. Specifically, we propose using the wavelet transform on grid-based neural fields. The grid-based neural field provides fast convergence, while the wavelet transform, whose efficiency has been demonstrated in high-performance standard codecs, improves the parameter efficiency of the grids. Furthermore, to achieve higher sparsity of the grid coefficients while maintaining reconstruction quality, we present a novel trainable masking approach. Experimental results demonstrate that non-spatial grid coefficients, such as wavelet coefficients, can attain a higher level of sparsity than spatial grid coefficients, resulting in a more compact representation. With our proposed mask and compression pipeline, we achieved state-of-the-art performance within a memory budget of 2 MB. Our code is available at https://github.com/daniel03c1/masked_wavelet_nerf.
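The idea of storing masked wavelet coefficients and recovering a spatial grid from them can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the one-level Haar wavelet, the function names (`haar_dwt2`, `haar_idwt2`, `masked_grid`, `sparsity`), and the rule of keeping a coefficient when its mask logit is positive are all assumptions made for illustration, and the training of the mask itself is not shown.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def haar_dwt2(x):
    """One-level orthonormal 2D Haar transform (height and width must be even)."""
    a = (x[0::2, :] + x[1::2, :]) / SQRT2   # low-pass along rows
    d = (x[0::2, :] - x[1::2, :]) / SQRT2   # high-pass along rows
    ll = (a[:, 0::2] + a[:, 1::2]) / SQRT2
    lh = (a[:, 0::2] - a[:, 1::2]) / SQRT2
    hl = (d[:, 0::2] + d[:, 1::2]) / SQRT2
    hh = (d[:, 0::2] - d[:, 1::2]) / SQRT2
    return np.block([[ll, lh], [hl, hh]])

def haar_idwt2(c):
    """Inverse of haar_dwt2: rebuild the spatial grid from the coefficients."""
    h, w = c.shape
    ll, lh = c[:h // 2, :w // 2], c[:h // 2, w // 2:]
    hl, hh = c[h // 2:, :w // 2], c[h // 2:, w // 2:]
    a = np.empty((h // 2, w))
    d = np.empty((h // 2, w))
    a[:, 0::2], a[:, 1::2] = (ll + lh) / SQRT2, (ll - lh) / SQRT2
    d[:, 0::2], d[:, 1::2] = (hl + hh) / SQRT2, (hl - hh) / SQRT2
    x = np.empty((h, w))
    x[0::2, :], x[1::2, :] = (a + d) / SQRT2, (a - d) / SQRT2
    return x

def masked_grid(coeffs, mask_logits):
    """Zero out coefficients whose (hypothetical) mask logit is not positive,
    then transform back to the spatial grid used for feature lookup."""
    mask = (mask_logits > 0).astype(coeffs.dtype)
    return haar_idwt2(coeffs * mask)

def sparsity(mask_logits):
    """Fraction of coefficients removed by the binary mask."""
    return float(np.mean(mask_logits <= 0))

# Usage: a smooth 8x8 grid; keep only the low-frequency (LL) quadrant.
grid = np.outer(np.linspace(0, 1, 8), np.linspace(0, 1, 8))
coeffs = haar_dwt2(grid)
logits = -np.ones_like(coeffs)
logits[:4, :4] = 1.0                      # keep LL, drop the other sub-bands
approx = masked_grid(coeffs, logits)
print("sparsity:", sparsity(logits))
print("max abs error:", np.abs(approx - grid).max())
```

In this toy setting only the coefficients that survive the mask would need to be stored, which is the intuition behind the compactness claim; the actual wavelet, mask parameterization, and compression pipeline are those of the released code, not this sketch.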

updated: Tue Mar 21 2023 10:23:40 GMT+0000 (UTC)

published: Sun Dec 18 2022 11:43:32 GMT+0000 (UTC)

arXiv
