Symbiotic Graph Neural Networks for 3D Skeleton-based Human Action Recognition and Motion Prediction

Maosen Li; Siheng Chen; Xu Chen; Ya Zhang; Yanfeng Wang; Qi Tian

3D skeleton-based action recognition and motion prediction are two essential problems in human activity understanding. Many previous works 1) studied the two tasks separately, neglecting their internal correlations, and 2) did not capture sufficient relations inside the body. To address these issues, we propose a symbiotic model that handles the two tasks jointly, and we propose two scales of graphs to explicitly capture relations among body-joints and body-parts. Together, these form symbiotic graph neural networks, which contain a backbone, an action-recognition head, and a motion-prediction head. The two heads are trained jointly and enhance each other. For the backbone, we propose multi-branch multi-scale graph convolution networks to extract spatial and temporal features. The multi-scale graph convolution networks are based on joint-scale and part-scale graphs. The joint-scale graphs contain actional graphs, which capture action-based relations, and structural graphs, which capture physical constraints. The part-scale graphs integrate body-joints to form specific parts, representing high-level relations. Moreover, dual bone-based graphs and networks are proposed to learn complementary features. We conduct extensive experiments on skeleton-based action recognition and motion prediction with four datasets: NTU-RGB+D, Kinetics, Human3.6M, and CMU Mocap. Experiments show that our symbiotic graph neural networks achieve better performance on both tasks than state-of-the-art methods.
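
The layout described in the abstract (a shared backbone over joint-scale and part-scale graphs, feeding a recognition head and a prediction head that are trained with one combined loss) can be illustrated with a small sketch. This is not the authors' implementation. The 5-joint chain skeleton, the part grouping, the layer sizes, the random weights, the additive fusion of the two scales, and the loss weight are all hypothetical choices made for illustration. The sketch works on a single frame, uses only a structural graph, and omits temporal modelling, actional graphs, and the bone-based branch.

```python
import math
import random

# Hypothetical toy skeleton: 5 joints in a chain, grouped into 2 body parts.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4)]
PARTS = [[0, 1, 2], [3, 4]]
NUM_JOINTS, IN_DIM, HID_DIM, NUM_CLASSES = 5, 3, 8, 4


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def relu(m):
    return [[max(v, 0.0) for v in row] for row in m]


def joint_scale_graph(num_joints, edges):
    """Row-normalized adjacency with self-loops (structural graph only)."""
    adj = [[1.0 if i == j else 0.0 for j in range(num_joints)] for i in range(num_joints)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1.0
    return [[v / sum(row) for v in row] for row in adj]


def part_scale_pooling(num_joints, parts):
    """(parts x joints) matrix averaging the joints that form each part."""
    return [[1.0 / len(p) if j in p else 0.0 for j in range(num_joints)] for p in parts]


def init_params(seed=0):
    """Random illustrative weights; a real model would learn these."""
    rnd = random.Random(seed)

    def mat(rows, cols):
        return [[rnd.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {"joint": mat(IN_DIM, HID_DIM), "part": mat(IN_DIM, HID_DIM),
            "cls": mat(HID_DIM, NUM_CLASSES), "reg": mat(HID_DIM, IN_DIM)}


def forward(pose, params):
    """Shared backbone followed by a recognition head and a prediction head."""
    adj = joint_scale_graph(NUM_JOINTS, EDGES)
    pool = part_scale_pooling(NUM_JOINTS, PARTS)
    h_joint = relu(matmul(matmul(adj, pose), params["joint"]))  # (joints, hidden)
    h_part = relu(matmul(matmul(pool, pose), params["part"]))   # (parts, hidden)
    # Fuse scales: add each part's feature back to the joints it contains.
    part_of = {j: k for k, p in enumerate(PARTS) for j in p}
    feat = [[a + b for a, b in zip(h_joint[j], h_part[part_of[j]])]
            for j in range(NUM_JOINTS)]
    # Recognition head: average over joints, then a linear classifier.
    pooled = [[sum(col) / NUM_JOINTS for col in zip(*feat)]]
    logits = matmul(pooled, params["cls"])[0]
    # Prediction head: per-joint displacement added to the current pose.
    delta = matmul(feat, params["reg"])
    next_pose = [[x + d for x, d in zip(pr, dr)] for pr, dr in zip(pose, delta)]
    return logits, next_pose


def joint_loss(logits, label, next_pose, target_pose, weight=1.0):
    """Cross-entropy for recognition plus weighted MSE for prediction."""
    top = max(logits)
    log_z = top + math.log(sum(math.exp(v - top) for v in logits))
    ce = log_z - logits[label]
    sq = [(a - b) ** 2 for pr, tr in zip(next_pose, target_pose) for a, b in zip(pr, tr)]
    return ce + weight * sum(sq) / len(sq)


# Usage: one made-up pose, an arbitrary class label, and the same pose as target.
pose = [[0.1 * (j + c) for c in range(IN_DIM)] for j in range(NUM_JOINTS)]
logits, next_pose = forward(pose, init_params())
loss = joint_loss(logits, 2, next_pose, pose)
```

Because both heads read the same backbone features, minimizing the summed loss sends gradients from both tasks into the shared weights, which is one simple way to realize the idea of two heads that enhance each other.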

updated: Sat Oct 05 2019 05:29:03 GMT+0000 (UTC)

published: Sat Oct 05 2019 05:29:03 GMT+0000 (UTC)

arXiv
