TTPOINT: A Tensorized Point Cloud Network for Lightweight Action Recognition with Event Cameras

Hongwei Ren; Yue Zhou; Haotian Fu; Yulong Huang; Renjing Xu; Bojun Cheng


Event cameras have gained popularity in computer vision because of their data sparsity, high dynamic range, and low latency. As bio-inspired sensors, they generate sparse, asynchronous data that is inherently incompatible with traditional frame-based methods. Point-based methods, by contrast, avoid an additional modality transformation and adapt naturally to the sparsity of events, but they typically do not reach accuracy comparable to frame-based methods. We propose TTPOINT, a lightweight and generalized point cloud network that achieves competitive results in action recognition tasks, even against state-of-the-art (SOTA) frame-based methods, while using only 1.5 % of the computational resources. The model abstracts local and global geometry through a hierarchical structure. By leveraging tensor-train compressed feature extractors, TTPOINT is designed with minimal parameters and computational complexity. We also develop a straightforward downsampling algorithm that preserves spatio-temporal features. In experiments, TTPOINT is the SOTA method on three datasets and the SOTA among point cloud methods on all five datasets. Moreover, tensor-train decomposition compresses the parameter size by 55 % on all five datasets while leaving the accuracy of TTPOINT almost unaffected.
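To make the idea of tensor-train compression concrete, the following is a minimal sketch of the generic TT-SVD procedure in NumPy. It is not the authors' implementation: the abstract does not say how the decomposition is applied inside TTPOINT, and the function names `tt_svd` and `tt_reconstruct`, the `max_rank` parameter, and the tensor shape are all assumptions chosen for illustration. The sketch only shows how a tensor can be stored as a chain of small cores whose total size may be much smaller than the original.

```python
import numpy as np


def tt_svd(tensor, max_rank):
    """Split `tensor` into tensor-train cores, capping every TT rank at `max_rank`.

    Hypothetical helper for illustration; core k has shape
    (rank_{k-1}, dims[k], rank_k), with rank_0 = rank_last = 1.
    """
    dims = tensor.shape
    cores = []
    rank_prev = 1
    # Unfold so that the first mode indexes the rows.
    mat = tensor.reshape(rank_prev * dims[0], -1)
    for k in range(len(dims) - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        rank = min(max_rank, len(s))  # truncate to the allowed rank
        cores.append(u[:, :rank].reshape(rank_prev, dims[k], rank))
        # Carry the remainder forward and expose the next mode in the rows.
        mat = (s[:rank, None] * vt[:rank]).reshape(rank * dims[k + 1], -1)
        rank_prev = rank
    cores.append(mat.reshape(rank_prev, dims[-1], 1))
    return cores


def tt_reconstruct(cores):
    """Contract the chain of cores back into a full tensor."""
    result = cores[0]
    for core in cores[1:]:
        # Contract the trailing rank index with the next core's leading one.
        result = np.tensordot(result, core, axes=([-1], [0]))
    # Drop the boundary ranks, which are both 1.
    return result.squeeze(axis=(0, -1))


def tt_num_params(cores):
    """Total number of stored values across all cores."""
    return sum(core.size for core in cores)
```

With a sufficiently large `max_rank` the reconstruction matches the input up to floating-point error; lowering `max_rank` trades accuracy for fewer stored values. For a 4 × 4 × 4 × 4 tensor (256 values), capping the rank at 2 leaves cores holding 8 + 16 + 16 + 8 = 48 values. How closely a truncated decomposition preserves a trained network's accuracy depends on the weights themselves, so this sketch does not reproduce the 55 % figure reported in the abstract.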

updated: Sat Aug 19 2023 11:58:31 GMT+0000 (UTC)

published: Sat Aug 19 2023 11:58:31 GMT+0000 (UTC)

arXiv
