Multi-person 3D pose estimation from unlabelled data

Daniel Rodriguez-Criado; Pilar Bachiller; George Vogiatzis; Luis J. Manso

Its numerous applications make multi-human 3D pose estimation a remarkably impactful area of research. Nevertheless, in a multi-view system composed of several regular RGB cameras, 3D multi-pose estimation presents several challenges. First, each person must be uniquely identified across the different views so that the 2D information provided by the cameras can be separated per person. Second, the 3D pose estimation process from each person's multi-view 2D information must be robust to noise and potential occlusions in the scene. In this work, we address these two challenges with the help of deep learning. Specifically, we present a model based on Graph Neural Networks that predicts the cross-view correspondence of the people in the scene, along with a Multilayer Perceptron that takes the 2D points and yields the 3D pose of each person. Both models are trained in a self-supervised manner, thus avoiding the need for large datasets with 3D annotations.
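
As a rough illustration of the first challenge, the sketch below shows how pairwise cross-view correspondence scores could be turned into per-person groups of 2D detections. It is not the method of the paper: the abstract does not describe how the Graph Neural Network output is consumed, so the `group_detections` function, the `link_scores` format, and the 0.5 threshold are all assumptions made for this example, and the scores are hand-written rather than predicted by a model.

```python
from typing import Dict, List, Tuple

# A detection is identified by (camera index, detection index within that camera).
Detection = Tuple[int, int]


def group_detections(
    detections: List[Detection],
    link_scores: Dict[Tuple[int, int], float],
    threshold: float = 0.5,
) -> List[List[Detection]]:
    """Group per-view detections into people using pairwise link scores.

    `link_scores` maps a pair of indices into `detections` to a score in
    [0, 1]. This is a hypothetical stand-in for the output of a
    correspondence model; the format is assumed, not taken from the paper.
    """
    parent = list(range(len(detections)))

    def find(i: int) -> int:
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (i, j), score in link_scores.items():
        # Only merge confident links between detections from different cameras.
        if score >= threshold and detections[i][0] != detections[j][0]:
            parent[find(i)] = find(j)

    groups: Dict[int, List[Detection]] = {}
    for idx, det in enumerate(detections):
        groups.setdefault(find(idx), []).append(det)
    return sorted(groups.values())


# Toy scene: two people seen by cameras 0 and 1, one of them also by camera 2.
detections = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0)]
link_scores = {
    (0, 2): 0.9,  # cam0/det0 <-> cam1/det0
    (1, 3): 0.8,  # cam0/det1 <-> cam1/det1
    (2, 4): 0.7,  # cam1/det0 <-> cam2/det0
    (0, 3): 0.1,  # unlikely match, ignored
    (1, 4): 0.2,  # unlikely match, ignored
}
people = group_detections(detections, link_scores)
print(people)
```

Each resulting group collects the 2D detections of one person across views, which is the per-person input a 3D pose regressor would need. Note that this simple transitive merging can place two detections from the same camera in one group when a third view links them, so a practical system would need an additional consistency check.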

updated: Tue Apr 09 2024 17:52:49 GMT+0000 (UTC)

published: Fri Dec 16 2022 22:03:37 GMT+0000 (UTC)

arXiv
