Faster Learning of Temporal Action Proposal via Sparse Multilevel Boundary Generator

Qing Song; Yang Zhou; Mengjie Hu; Chun Liu

Temporal action localization in videos remains a significant challenge in computer vision. Boundary-sensitive methods have been widely adopted, but they make incomplete use of intermediate and global information and rely on an inefficient proposal feature generator. To address these limitations, we propose a novel framework, Sparse Multilevel Boundary Generator (SMBG), which enhances the boundary-sensitive method with boundary classification and action completeness regression. SMBG features a multi-level boundary module that enables faster processing by gathering boundary information at different lengths. Additionally, we introduce a sparse extraction confidence head that distinguishes information inside and outside the action, further optimizing the proposal feature generator. To improve the synergy between multiple branches and balance positive and negative samples, we propose a global guidance loss. We evaluate our method on two popular benchmarks, ActivityNet-1.3 and THUMOS14, where it achieves state-of-the-art performance with faster inference (2.47x the speed of BSN++ and 2.12x that of DBG). These results demonstrate that SMBG provides a simpler and more efficient solution for generating temporal action proposals. Our proposed framework has the potential to improve both the accuracy and the speed of temporal action localization in video analysis. The code and models are available at https://github.com/zhouyang-001/SMBG-for-temporal-action-proposal.
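The abstract does not define the global guidance loss, so the following is only a rough sketch of the general problem it refers to: proposals are usually labeled positive or negative by their temporal IoU with a ground-truth action, and negatives tend to outnumber positives. The function names, the inverse-frequency weighting, and the example values are all illustrative assumptions, not the SMBG formulation.

```python
import math


def temporal_iou(proposal, ground_truth):
    """Intersection over union of two (start, end) intervals on the time axis."""
    p_start, p_end = proposal
    g_start, g_end = ground_truth
    intersection = max(0.0, min(p_end, g_end) - max(p_start, g_start))
    union = (p_end - p_start) + (g_end - g_start) - intersection
    return intersection / union if union > 0 else 0.0


def balanced_bce(scores, labels, eps=1e-6):
    """Binary cross-entropy with inverse-frequency class weights.

    Assumption (not from the paper): each class receives a total weight
    of n / 2, so rare positives are not drowned out by many negatives.
    """
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    w_pos = n / (2 * n_pos) if n_pos else 0.0
    w_neg = n / (2 * n_neg) if n_neg else 0.0
    total = 0.0
    for score, label in zip(scores, labels):
        score = min(max(score, eps), 1 - eps)  # clamp to avoid log(0)
        if label:
            total -= w_pos * math.log(score)
        else:
            total -= w_neg * math.log(1 - score)
    return total / n
```

For instance, `temporal_iou((0, 10), (5, 15))` overlaps for 5 of 15 time units, and a proposal would be labeled positive when this value exceeds a chosen threshold before the labels are passed to `balanced_bce`.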

updated: Mon Mar 06 2023 14:26:56 GMT+0000 (UTC)

published: Mon Mar 06 2023 14:26:56 GMT+0000 (UTC)

arXiv
