Learning Visibility for Robust Dense Human Body Estimation

Chun-Han Yao; Jimei Yang; Duygu Ceylan; Yi Zhou; Yang Zhou; Ming-Hsuan Yang

Estimating 3D human pose and shape from 2D images is a crucial yet challenging task. While prior methods with model-based representations can perform reasonably well on whole-body images, they often fail when parts of the body are occluded or outside the frame. Moreover, their results usually do not faithfully capture human silhouettes because of the limited representational power of deformable models (e.g., representing only the naked body). An alternative approach is to estimate dense vertices of a predefined template body in image space. Such representations are effective at localizing vertices within an image but cannot handle out-of-frame body parts. In this work, we learn dense human body estimation that is robust to partial observations. We explicitly model the visibility of human joints and vertices along the x, y, and z axes separately. Visibility along the x and y axes helps distinguish out-of-frame cases, and visibility along the depth axis corresponds to occlusions (either self-occlusions or occlusions by other objects). We obtain pseudo ground truth for the visibility labels from dense UV correspondences and train a neural network to predict visibility along with 3D coordinates. We show that visibility can serve as 1) an additional signal to resolve depth-ordering ambiguities of self-occluded vertices and 2) a regularization term when fitting a human body model to the predictions. Extensive experiments on multiple 3D human datasets demonstrate that visibility modeling significantly improves the accuracy of human body estimation, especially for partial-body cases. Our project page with code is at: https://github.com/chhankyao/visdb.
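The x- and y-axis visibility described in the abstract can be illustrated with a minimal sketch. It is not taken from the authors' code: the function name `axis_visibility`, the pixel-coordinate convention (origin at the top-left corner, a vertex counted as in-frame when 0 <= x < width and 0 <= y < height), and the example values are all assumptions made for illustration. Depth-axis visibility, which the abstract says is derived from dense UV correspondences, is not covered here.

```python
from typing import List, Tuple


def axis_visibility(
    vertices: List[Tuple[float, float]],
    width: float,
    height: float,
) -> List[Tuple[bool, bool]]:
    """Return (visible_x, visible_y) for each 2D vertex.

    Hypothetical helper: a vertex is treated as visible along an axis
    when its coordinate on that axis lies inside the image frame.
    """
    labels = []
    for x, y in vertices:
        vis_x = 0.0 <= x < width   # inside the frame horizontally
        vis_y = 0.0 <= y < height  # inside the frame vertically
        labels.append((vis_x, vis_y))
    return labels


# Illustrative vertices in pixel coordinates for a 100 x 100 crop.
example = [(10.0, 20.0), (-5.0, 30.0), (50.0, 130.0)]
labels = axis_visibility(example, width=100.0, height=100.0)
```

Keeping the two axes separate means a vertex that is cut off only at the bottom of the frame is still marked as in-frame horizontally, which is the kind of per-axis distinction the abstract describes.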

updated: Tue Aug 23 2022 00:01:05 GMT+0000 (UTC)

published: Tue Aug 23 2022 00:01:05 GMT+0000 (UTC)

arXiv
