Normalizing Flows for Human Pose Anomaly Detection

Or Hirschorn; Shai Avidan

Video anomaly detection is an ill-posed problem because it relies on many parameters such as appearance, pose, camera angle, background, and more. We distill the problem to anomaly detection of human pose, thus decreasing the risk of nuisance parameters such as appearance affecting the result. Focusing on pose alone also has the side benefit of reducing bias against distinct minority groups. Our model works directly on human pose graph sequences and is exceptionally lightweight (~1K parameters), capable of running on any machine able to run the pose estimation with negligible additional resources. We leverage the highly compact pose representation in a normalizing flows framework, which we extend to tackle the unique characteristics of spatio-temporal pose data and show its advantages in this use case. The algorithm is quite general and can handle training data of only normal examples as well as a supervised setting that consists of labeled normal and abnormal examples. We report state-of-the-art results on two anomaly detection benchmarks: the unsupervised ShanghaiTech dataset and the recent supervised UBnormal dataset.
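
To make the likelihood-based idea concrete, here is a minimal sketch of anomaly scoring with a normalizing flow. It is not the authors' model: it assumes a single elementwise affine flow fitted to flattened pose vectors, whereas the abstract describes a flow extended for spatio-temporal pose graph sequences. The names `fit_affine_flow`, `log_likelihood`, and `anomaly_score`, and the toy 4-dimensional "poses", are hypothetical and chosen only for illustration.

```python
import math

def fit_affine_flow(train):
    """Fit a per-dimension shift and scale so that z = (x - mu) / sigma
    is approximately standard normal on the (normal-only) training data."""
    n, d = len(train), len(train[0])
    mu = [sum(x[j] for x in train) / n for j in range(d)]
    sigma = [
        # Clamp the scale to avoid division by zero on constant dimensions.
        max(math.sqrt(sum((x[j] - mu[j]) ** 2 for x in train) / n), 1e-6)
        for j in range(d)
    ]
    return mu, sigma

def log_likelihood(x, mu, sigma):
    """Change of variables: log p(x) = log N(z; 0, I) + log|det dz/dx|."""
    z = [(xj - m) / s for xj, m, s in zip(x, mu, sigma)]
    log_base = sum(-0.5 * zj * zj - 0.5 * math.log(2 * math.pi) for zj in z)
    log_det = -sum(math.log(s) for s in sigma)  # dz/dx is diag(1 / sigma)
    return log_base + log_det

def anomaly_score(x, mu, sigma):
    """Higher score means lower likelihood under the fitted flow."""
    return -log_likelihood(x, mu, sigma)

# Toy stand-ins for flattened pose vectors (assumption: 4 numbers per pose).
normal_poses = [
    [0.0, 1.0, 0.5, 2.0],
    [0.1, 1.1, 0.4, 2.1],
    [-0.1, 0.9, 0.6, 1.9],
    [0.0, 1.0, 0.5, 2.0],
]
mu, sigma = fit_affine_flow(normal_poses)

typical = [0.05, 1.0, 0.5, 2.0]
unusual = [1.5, -0.5, 2.0, 0.0]
print(anomaly_score(typical, mu, sigma) < anomaly_score(unusual, mu, sigma))
```

In this sketch a pose far from the training distribution receives a lower likelihood and therefore a higher anomaly score; a learned, multi-layer flow would replace the fixed affine map but keep the same scoring rule.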

updated: Wed Aug 16 2023 17:54:55 GMT+0000 (UTC)

published: Sun Nov 20 2022 11:02:50 GMT+0000 (UTC)

arXiv
