DPMix: Mixture of Depth and Point Cloud Video Experts for 4D Action Segmentation

Yue Zhang; Hehe Fan; Yi Yang; Mohan Kankanhalli

In this technical report, we present our findings from research conducted on the Human-Object Interaction 4D (HOI4D) dataset for the egocentric action segmentation task. As a relatively new research area, point cloud video methods might not be good at temporal modeling, especially for long point cloud videos (e.g., 150 frames). In contrast, traditional video understanding methods are well developed, and their effectiveness at temporal modeling has been widely verified on many large-scale video datasets. Therefore, we convert point cloud videos into depth videos and employ traditional video modeling methods to improve 4D action segmentation. Ensembling depth and point cloud video methods significantly improves accuracy. The proposed method, named Mixture of Depth and Point cloud video experts (DPMix), achieved first place in the 4D Action Segmentation Track of the HOI4D Challenge 2023.
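
The two steps named in the abstract, converting point clouds to depth frames and ensembling the two experts, can be illustrated with a minimal sketch. The abstract does not say how either step is implemented, so everything here is an assumption: the pinhole projection, the intrinsics parameters (`fx`, `fy`, `cx`, `cy`), the nearest-point rule for pixel collisions, and the weighted averaging of per-frame class probabilities. The function names `points_to_depth` and `ensemble_frame_labels` are hypothetical and are not taken from the authors' code.

```python
import math


def points_to_depth(points, fx, fy, cx, cy, width, height):
    """Project 3D points (camera coordinates, z forward) into a depth image.

    Assumed pinhole model. Pixels that receive no point stay 0.0; when
    several points land on one pixel, the nearest one (smallest z) wins.
    """
    depth = [[0.0] * width for _ in range(height)]
    for x, y, z in points:
        if z <= 0:
            continue  # point is behind the camera
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        if 0 <= u < width and 0 <= v < height:
            if depth[v][u] == 0.0 or z < depth[v][u]:
                depth[v][u] = z
    return depth


def softmax(logits):
    """Numerically stable softmax over one frame's class logits."""
    m = max(logits)
    exps = [math.exp(value - m) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_frame_labels(depth_logits, point_logits, depth_weight=0.5):
    """Mix per-frame class probabilities from two experts, then take argmax.

    Assumed ensembling rule: a weighted average of softmax outputs.
    Both inputs are lists of per-frame logit lists of equal shape.
    """
    labels = []
    for d, p in zip(depth_logits, point_logits):
        pd, pp = softmax(d), softmax(p)
        mixed = [depth_weight * a + (1.0 - depth_weight) * b
                 for a, b in zip(pd, pp)]
        labels.append(max(range(len(mixed)), key=mixed.__getitem__))
    return labels
```

Applying `points_to_depth` to each frame of a point cloud video would yield a depth video that a conventional video model can consume. `ensemble_frame_labels` then combines that model's per-frame logits with those of a point cloud video model to give one action label per frame.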

updated: Mon Jul 31 2023 16:14:24 GMT+0000 (UTC)

published: Mon Jul 31 2023 16:14:24 GMT+0000 (UTC)

arXiv
