Prompt-Guided Zero-Shot Anomaly Action Recognition using Pretrained Deep Skeleton Features

Fumiaki Sato; Ryo Hachiuma; Taiki Sekii

This study investigates unsupervised anomaly action recognition, which identifies video-level abnormal human behavior events without using abnormal samples, and simultaneously addresses three limitations of conventional skeleton-based approaches: target domain-dependent DNN training, limited robustness against skeleton errors, and a lack of normal samples. We present a unified, user prompt-guided zero-shot learning framework that uses a target domain-independent skeleton feature extractor pretrained on a large-scale action recognition dataset. Specifically, during the training phase using normal samples, the method models the distribution of skeleton features of normal actions while keeping the DNN weights frozen, and in the inference phase it estimates the anomaly score using this distribution. Additionally, to increase robustness against skeleton errors, we introduce a DNN architecture inspired by a point cloud deep learning paradigm, which sparsely propagates features between joints. Furthermore, to prevent unobserved normal actions from being misidentified as abnormal actions, we incorporate into the anomaly score a similarity score between user prompt embeddings and skeleton features aligned in a common space, which indirectly supplements the normal actions. We conduct experiments on two publicly available datasets to test the effectiveness of the proposed method with respect to the abovementioned limitations.
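
The scoring idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and the abstract does not specify these details. The sketch assumes that the frozen extractor has already produced fixed-length feature vectors, that the normal-feature distribution is a diagonal Gaussian, that prompts describe abnormal actions (so a higher similarity raises the score), and that the two terms are combined by a weighted sum. All function names and the `weight` parameter are hypothetical.

```python
import math
from statistics import fmean, pvariance


def fit_normal_features(features, eps=1e-6):
    """Fit a per-dimension (diagonal) Gaussian to features of normal actions.

    `features` is a list of equal-length vectors, assumed to come from a
    frozen, pretrained skeleton feature extractor (not modeled here).
    """
    dims = list(zip(*features))
    mean = [fmean(d) for d in dims]
    var = [pvariance(d) + eps for d in dims]  # eps avoids division by zero
    return mean, var


def distribution_score(feature, mean, var):
    """Variance-normalized squared distance from the normal-feature mean."""
    return sum((x - m) ** 2 / v for x, m, v in zip(feature, mean, var))


def cosine_similarity(a, b):
    """Cosine similarity between two vectors in the assumed common space."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def anomaly_score(feature, mean, var, prompt_embeddings, weight=1.0):
    """Combine the distribution term with the best prompt similarity.

    Assumption: prompts describe abnormal actions, so similarity is added.
    If prompts described normal actions instead, the sign would flip.
    """
    dist = distribution_score(feature, mean, var)
    sim = max((cosine_similarity(feature, p) for p in prompt_embeddings), default=0.0)
    return dist + weight * sim
```

In this sketch, fitting only computes per-dimension statistics, so no network weights are updated. A feature far from the normal statistics, or close to a prompt embedding, receives a higher score than one near the normal mean.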

updated: Mon Mar 27 2023 12:59:33 GMT+0000 (UTC)

published: Mon Mar 27 2023 12:59:33 GMT+0000 (UTC)

arXiv
