Hybrid Classifiers for Spatio-temporal Real-time Abnormal Behaviors Detection, Tracking, and Recognition in Massive Hajj Crowds

Tarik Alafif; Anas Hadi; Manal Allahyani; Bander Alzahrani; Areej Alhothali; Reem Alotaibi; Ahmed Barnawi

Individual abnormal behaviors vary with crowd size, context, and scene. Detecting, tracking, and recognizing individuals with abnormal behaviors in large-scale crowds is challenging because of partial occlusions, blurring, large numbers of abnormal behaviors, and varying camera views. In this paper, our contribution is twofold. First, we introduce an annotated and labeled large-scale crowd abnormal behaviors Hajj dataset (HAJJv2). Second, we propose two hybrid methods that combine Convolutional Neural Networks (CNNs) and Random Forests (RFs) to detect and recognize spatio-temporal abnormal behaviors in small-scale and large-scale crowd videos. In small-scale crowd videos, a pre-trained ResNet-50 CNN model is fine-tuned to verify whether each frame is normal or abnormal in the spatial domain. If anomalous behaviors are observed, a motion-based individual detection method based on the magnitudes and orientations of Horn-Schunck optical flow is used to locate and track individuals with abnormal behaviors. A Kalman filter is employed in large-scale crowd videos to predict and track the detected individuals in subsequent frames. Then, statistical features (means, variances, and standard deviations) are computed and fed to the RF to classify individuals with abnormal behaviors in the temporal domain. In large-scale crowds, we fine-tune the ResNet-50 model using the YOLOv2 object detection technique to detect individuals with abnormal behaviors in the spatial domain.
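As a rough illustration of the temporal-domain step, the sketch below computes the mean, variance, and standard deviation of optical-flow magnitudes and orientations for one tracked individual. The abstract does not state exactly which quantities the statistics are taken over, so the choice of per-track flow magnitudes and orientations, the use of population (rather than sample) variance, and the names `flow_features` and `track` are all assumptions made for illustration, not the authors' implementation.

```python
import math
import statistics


def flow_features(flow_vectors):
    """Summarize one track as a 6-element feature vector.

    flow_vectors: list of (u, v) optical-flow components, one per frame,
    for a single tracked individual (hypothetical input format).
    Returns [mean, variance, std] of the magnitudes followed by
    [mean, variance, std] of the orientations (radians).
    """
    magnitudes = [math.hypot(u, v) for u, v in flow_vectors]
    # Orientations are averaged arithmetically here; this ignores
    # wrap-around at +/- pi, which a real system may need to handle.
    orientations = [math.atan2(v, u) for u, v in flow_vectors]

    features = []
    for series in (magnitudes, orientations):
        features.append(statistics.fmean(series))
        features.append(statistics.pvariance(series))  # population variance (assumption)
        features.append(statistics.pstdev(series))     # population std (assumption)
    return features


# Toy track: motion in a constant direction with alternating speed.
track = [(3.0, 4.0), (6.0, 8.0), (3.0, 4.0), (6.0, 8.0)]
features = flow_features(track)
```

A vector like `features` could then be passed, together with a behavior label, to a random forest classifier; which RF implementation and hyperparameters the paper uses is not stated in this abstract.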

updated: Mon Jul 25 2022 06:52:55 GMT+0000 (UTC)

published: Mon Jul 25 2022 06:52:55 GMT+0000 (UTC)

arXiv
