Remote Sensing Novel View Synthesis with Implicit Multiplane Representations

Yongchang Wu; Zhengxia Zou; Zhenwei Shi

Novel view synthesis of remote sensing scenes is of great significance for scene visualization, human-computer interaction, and various downstream applications. Despite recent advances in computer graphics and photogrammetry, generating novel views remains challenging, particularly for remote sensing images, because of their high complexity, view sparsity, and limited variation in viewing perspective. In this paper, we propose a novel remote sensing view synthesis method that leverages recent advances in implicit neural representations. Because remote sensing images are captured from overhead and at large distances from the scene, we represent the 3D space by combining an implicit multiplane image (MPI) representation with deep neural networks. The 3D scene is reconstructed under a self-supervised optimization paradigm through a differentiable multiplane renderer with multi-view input constraints. Images from any novel view can then be rendered freely from the reconstructed model. As a by-product, the depth map corresponding to a given viewpoint can be generated along with the rendered output. We refer to our method as Implicit Multiplane Images (ImMPI). To further improve view synthesis under sparse-view inputs, we explore learning-based initialization of remote sensing 3D scenes and propose a neural-network-based prior extractor to accelerate the optimization process. In addition, we propose a new dataset for remote sensing novel view synthesis built from multi-view real-world Google Earth images. Extensive experiments demonstrate the superiority of ImMPI over previous state-of-the-art methods in terms of reconstruction accuracy, visual fidelity, and time efficiency. Ablation experiments also support the effectiveness of our methodology design. Our dataset and code can be found at https://github.com/wyc-Chang/ImMPI
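To make the multiplane rendering idea concrete, the following is a minimal sketch of standard front-to-back "over" compositing for a multiplane image, which also yields a depth map as a by-product. It is not taken from the ImMPI code: the function name `composite_mpi`, the array layouts, and the expected-depth formula are assumptions chosen for illustration, and the planes are assumed to be already warped into the target camera (the homography warping step is omitted).

```python
import numpy as np


def composite_mpi(colors, alphas, depths):
    """Front-to-back "over" compositing of a multiplane image (illustrative).

    colors: (D, H, W, 3) RGB per plane, ordered nearest plane first.
    alphas: (D, H, W) opacity per plane, values in [0, 1].
    depths: (D,) depth of each plane, nearest first.
    Returns the rendered image (H, W, 3) and a weighted depth map (H, W).
    """
    colors = np.asarray(colors, dtype=np.float64)
    alphas = np.asarray(alphas, dtype=np.float64)
    depths = np.asarray(depths, dtype=np.float64)

    # Transmittance of plane i: fraction of light not absorbed by the
    # planes in front of it, i.e. prod_{j < i} (1 - alpha_j).
    passed = np.cumprod(1.0 - alphas, axis=0)
    transmittance = np.concatenate(
        [np.ones_like(alphas[:1]), passed[:-1]], axis=0
    )

    # Contribution of each plane to the final pixel.
    weights = alphas * transmittance  # (D, H, W)

    image = np.sum(weights[..., None] * colors, axis=0)
    # Weighted sum of plane depths; equals the expected depth only when
    # the weights sum to 1 (for example, when the farthest plane is opaque).
    depth = np.sum(weights * depths[:, None, None], axis=0)
    return image, depth


# Usage: one pixel, a half-transparent red plane in front of an opaque blue one.
colors = np.array([[[[1.0, 0.0, 0.0]]], [[[0.0, 0.0, 1.0]]]])
alphas = np.array([[[0.5]], [[1.0]]])
image, depth = composite_mpi(colors, alphas, depths=[1.0, 3.0])
```

Every operation here is differentiable with respect to the plane colors and opacities, which is what allows an MPI to be optimized from multi-view photometric constraints in a self-supervised way.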

updated: Wed May 18 2022 13:03:55 GMT+0000 (UTC)

published: Wed May 18 2022 13:03:55 GMT+0000 (UTC)

arXiv

