Topologically-Aware Deformation Fields for Single-View 3D Reconstruction

Shivam Duggal; Deepak Pathak

We present a new framework for learning 3D object shapes and dense cross-object 3D correspondences from just an unaligned, category-specific image collection. The 3D shapes are generated implicitly as deformations of a category-specific signed distance field and are learned in an unsupervised manner, solely from unaligned image collections and without any 3D supervision. Image collections on the internet generally contain several intra-category geometric and topological variations (for example, different chairs can have different topologies), which makes the task of joint shape and correspondence estimation much more challenging. Because of this, prior works either focus on learning each 3D object shape individually, without modeling cross-instance correspondences, or perform joint shape and correspondence estimation only on categories with minimal intra-category topological variation. We overcome these restrictions by learning a topologically-aware implicit deformation field that maps a 3D point in the object space to a higher-dimensional point in the category-specific canonical space. At inference time, given a single image, we reconstruct the underlying 3D shape in two steps: we first implicitly deform each 3D point in the object space to the learned category-specific canonical space using the topologically-aware deformation field, and then reconstruct the 3D shape as a canonical signed distance field. Both the canonical shape and the deformation field are learned end-to-end in an inverse-graphics fashion, using a learned recurrent ray marcher (SRN) as a differentiable rendering module. Our approach, dubbed TARS, achieves state-of-the-art reconstruction fidelity on several datasets: ShapeNet, Pascal3D+, CUB, and Pix3D chairs. Result videos and code are available at https://shivamduggal4.github.io/tars-3D/
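The two-step inference described in the abstract (deform object-space points into a higher-dimensional canonical space, then evaluate a canonical signed distance field) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the network sizes, the random untrained weights, the per-image latent code, the names `deform` and `signed_distance`, and the choice to represent the canonical point as a displaced xyz plus extra coordinates are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM = 16  # hypothetical size of the per-image shape code
EXTRA_DIM = 5    # hypothetical number of extra canonical coordinates


def init_mlp(sizes):
    """Random weights for a small fully connected network (untrained, illustrative)."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def mlp(params, x):
    """Apply the network: ReLU on hidden layers, linear output layer."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x


deform_params = init_mlp([3 + LATENT_DIM, 64, 3 + EXTRA_DIM])
sdf_params = init_mlp([3 + EXTRA_DIM, 64, 1])


def deform(points, code):
    """Map object-space points (N, 3) to canonical points (N, 3 + EXTRA_DIM)."""
    codes = np.broadcast_to(code, (points.shape[0], code.shape[0]))
    out = mlp(deform_params, np.concatenate([points, codes], axis=1))
    # Assumption: the first three outputs displace xyz; the rest are extra coordinates.
    return np.concatenate([points + out[:, :3], out[:, 3:]], axis=1)


def signed_distance(points, code):
    """Evaluate the canonical SDF at the deformed points; returns shape (N,)."""
    return mlp(sdf_params, deform(points, code))[:, 0]


points = rng.uniform(-1.0, 1.0, (4, 3))  # query points in object space
code = rng.normal(size=LATENT_DIM)       # stand-in for an image-derived code
sdf_values = signed_distance(points, code)
```

In this sketch, instances of a category share `sdf_params` and differ only through `code`, which is one plausible way a shared canonical space could yield cross-object correspondences. Training through a differentiable renderer, as the abstract describes, is not shown.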

updated: Thu May 12 2022 17:59:59 GMT+0000 (UTC)

published: Thu May 12 2022 17:59:59 GMT+0000 (UTC)

arXiv
