MagicPony: Learning Articulated 3D Animals in the Wild

Shangzhe Wu; Ruining Li; Tomas Jakab; Christian Rupprecht; Andrea Vedaldi

We consider the problem of predicting the 3D shape, articulation, viewpoint, texture, and lighting of an articulated animal, such as a horse, from a single test image. We present a new method, dubbed MagicPony, that learns this predictor purely from in-the-wild single-view images of the object category, with minimal assumptions about the topology of deformation. At its core is an implicit-explicit representation of articulated shape and appearance that combines the strengths of neural fields and meshes. To help the model understand an object's shape and pose, we distil the knowledge captured by an off-the-shelf self-supervised vision transformer and fuse it into the 3D model. To overcome local optima in viewpoint estimation, we further introduce a new viewpoint sampling scheme that comes at no additional training cost. MagicPony outperforms prior work on this challenging task and demonstrates excellent generalisation in reconstructing art, despite being trained only on real images.

updated: Fri Mar 31 2023 01:39:56 GMT+0000 (UTC)

published: Tue Nov 22 2022 18:59:31 GMT+0000 (UTC)

arXiv
