Learning Skeletal Articulations with Neural Blend Shapes

Peizhuo Li; Kfir Aberman; Rana Hanocka; Libin Liu; Olga Sorkine-Hornung; Baoquan Chen

Animating a newly designed character using motion capture (mocap) data is a long-standing problem in computer animation. A key consideration is the skeletal structure, which should correspond to the available mocap data, and the shape deformation in the joint regions, which often requires a tailored, pose-specific refinement. In this work, we develop a neural technique for articulating 3D characters using enveloping with a pre-defined skeletal structure, which produces high-quality pose-dependent deformations. Our framework learns to rig and skin characters with the same articulation structure (e.g., bipeds or quadrupeds), and builds the desired skeleton hierarchy into the network architecture. Furthermore, we propose neural blend shapes, a set of corrective pose-dependent shapes that improve the deformation quality in the joint regions and address the notorious artifacts resulting from standard rigging and skinning. Our system estimates neural blend shapes for input meshes with arbitrary connectivity, as well as weighting coefficients that are conditioned on the input joint rotations. Unlike recent deep learning techniques that supervise the network with ground-truth rigging and skinning parameters, our approach does not assume that the training data has a specific underlying deformation model. Instead, during training, the network observes deformed shapes and learns to infer the corresponding rig, skin and blend shapes using indirect supervision. During inference, we demonstrate that our network generalizes to unseen characters with arbitrary mesh connectivity, including unrigged characters built by 3D artists. Because the output conforms to standard skeletal animation models, it can be used directly in standard animation software and game engines.
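
To make the deformation model in the abstract concrete, the following is a minimal 2D sketch of skeletal skinning (enveloping) combined with pose-dependent corrective blend shapes. It is not the paper's implementation: in the paper the skinning weights, blend shapes and their coefficients are predicted by neural networks, whereas here every value is hand-written. The names `skin`, `rotate_about` and `coeff_fn`, the toy three-vertex "arm", and the choice to add the corrective offsets in the rest pose before blending the bone transforms are all assumptions made for illustration.

```python
import math


def rotate_about(point, pivot, angle):
    """Rotate a 2D point about a pivot by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    dx, dy = point[0] - pivot[0], point[1] - pivot[1]
    return (pivot[0] + c * dx - s * dy, pivot[1] + s * dx + c * dy)


def skin(rest_vertices, skin_weights, pivots, angles, blend_shapes, coeff_fn):
    """Linear blend skinning with pose-dependent corrective offsets.

    rest_vertices: list of (x, y) rest-pose positions
    skin_weights:  skin_weights[v][j] = influence of bone j on vertex v
    pivots:        pivots[j] = rotation centre of bone j
    angles:        angles[j] = global rotation of bone j in radians
                   (forward kinematics is omitted in this sketch)
    blend_shapes:  blend_shapes[k][v] = (dx, dy) offset of shape k at vertex v
    coeff_fn:      hypothetical stand-in for the learned mapping from joint
                   rotations to one coefficient per blend shape
    """
    coeffs = coeff_fn(angles)
    posed = []
    for v, (x, y) in enumerate(rest_vertices):
        # 1. Add the weighted corrective offsets in the rest pose (assumption).
        for k, shape in enumerate(blend_shapes):
            x += coeffs[k] * shape[v][0]
            y += coeffs[k] * shape[v][1]
        # 2. Blend the per-bone rigid transforms of the corrected vertex.
        px = py = 0.0
        for j, w in enumerate(skin_weights[v]):
            rx, ry = rotate_about((x, y), pivots[j], angles[j])
            px += w * rx
            py += w * ry
        posed.append((px, py))
    return posed


# Toy two-bone arm: bone 0 is fixed at the origin, bone 1 pivots at the elbow.
rest = [(0.5, 0.1), (1.0, 0.1), (1.5, 0.1)]
weights = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
pivots = [(0.0, 0.0), (1.0, 0.0)]
# One corrective shape that pushes the elbow vertex outward.
shapes = [[(0.0, 0.0), (0.0, 0.05), (0.0, 0.0)]]


def bend_coeff(angles):
    """Made-up coefficient: grows linearly with the elbow bend."""
    return [abs(angles[1] - angles[0]) / math.pi]


rest_pose = skin(rest, weights, pivots, [0.0, 0.0], shapes, bend_coeff)
bent_pose = skin(rest, weights, pivots, [0.0, math.pi / 2], shapes, bend_coeff)
```

With zero rotations the coefficient is zero and the mesh stays in its rest pose. When the elbow bends, the vertex that is shared between the two bones is pulled toward the joint by the linear blend, which is the kind of joint-region artifact the corrective shape is meant to counteract.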

updated: Thu May 06 2021 05:58:13 GMT+0000 (UTC)

published: Thu May 06 2021 05:58:13 GMT+0000 (UTC)

arXiv
