Learning Graph Convolutional Network for Skeleton-based Human Action Recognition by Neural Searching

Wei Peng; Xiaopeng Hong; Haoyu Chen; Guoying Zhao



Human action recognition from skeleton data, fueled by the Graph Convolutional Network (GCN), has attracted considerable attention owing to the GCN's strong capability for modeling non-Euclidean structured data. However, many existing GCN methods provide a pre-defined graph and keep it fixed throughout the entire network, which can lose implicit joint correlations. In addition, the mainstream spectral GCN is approximated by a first-order hop, so higher-order connections are not well captured. Exploring a better GCN architecture therefore requires substantial effort. To address these problems, we turn to Neural Architecture Search (NAS) and propose the first automatically designed GCN for skeleton-based action recognition. Specifically, we enrich the search space by providing multiple dynamic graph modules after fully exploring the spatial-temporal correlations between nodes. We also introduce multiple-hop modules, which we expect to break the limitation on representational capacity caused by the first-order approximation. Moreover, a sampling- and memory-efficient evolution strategy is proposed to search for an optimal architecture for this task. The resulting architecture demonstrates the effectiveness of the higher-order approximation and of the dynamic graph modeling mechanism with temporal interactions, which has barely been discussed before. To evaluate the performance of the searched model, we conduct extensive experiments on two very large-scale datasets, and the results show that our model achieves state-of-the-art results.
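The contrast between a first-order hop and multiple-hop aggregation can be illustrated with a small NumPy sketch. This is not the authors' module: the abstract does not specify how the multiple-hop modules are built, so the sum over powers of a normalized adjacency matrix, the function names `normalized_adjacency` and `multi_hop_aggregate`, and the toy four-joint chain are all assumptions made for illustration.

```python
import numpy as np


def normalized_adjacency(adj):
    """Symmetrically normalize A + I, a common choice in first-order GCN layers."""
    a_hat = adj + np.eye(adj.shape[0])       # add self-loops
    deg = a_hat.sum(axis=1)                  # node degrees (>= 1 thanks to self-loops)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def multi_hop_aggregate(adj, features, weights):
    """Hypothetical multi-hop layer: sum over k of (A_norm^k) @ X @ W_k.

    len(weights) - 1 is the highest hop order K; K = 1 corresponds to the
    usual first-order aggregation.
    """
    a_norm = normalized_adjacency(adj)
    hop = np.eye(adj.shape[0])               # A_norm^0
    out = np.zeros((features.shape[0], weights[0].shape[1]))
    for w in weights:
        out += hop @ features @ w
        hop = hop @ a_norm                   # advance to the next power
    return out


# Toy skeleton (assumed): a chain of four joints 0-1-2-3.
adj = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
features = np.eye(4)                          # one-hot joint features
one_hop = multi_hop_aggregate(adj, features, [np.eye(4)] * 2)    # K = 1
three_hop = multi_hop_aggregate(adj, features, [np.eye(4)] * 4)  # K = 3

# Joint 0 only receives information from joint 3 once three hops are allowed.
print(one_hop[0, 3] > 0, three_hop[0, 3] > 0)
```

With identity features and weights, entry `[i, j]` of the output is nonzero only if joint `j` is reachable from joint `i` within K hops, which is the sense in which a first-order layer leaves distant joints unconnected.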

updated: Mon Nov 11 2019 08:24:10 GMT+0000 (UTC)

published: Mon Nov 11 2019 08:24:10 GMT+0000 (UTC)

arXiv

