Unsupervised Segmentation of Action Segments in Egocentric Videos using Gaze

I. Hipiny; H. Ujir; J. L. Minoi; S. F. Samson Juan; M. A. Khairuddin; M. S. Sunar

Unsupervised segmentation of action segments in egocentric videos is a desirable feature in tasks such as activity recognition and content-based video retrieval. Reducing the search space to a finite set of action segments facilitates faster and less noisy matching. However, there exists a substantial gap in machine understanding of natural temporal cuts during a continuous human activity. This work reports on a novel gaze-based approach for segmenting action segments in videos captured using an egocentric camera. Gaze is used to locate the region-of-interest inside a frame. By tracking two simple motion-based parameters inside successive regions-of-interest, we discover a finite set of temporal cuts. We present several results using combinations of the two parameters on the BRISGAZE-ACTIONS dataset, which contains egocentric videos depicting several daily-living activities. The quality of the temporal cuts is further improved by implementing two entropy measures.
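The abstract does not name the two motion-based parameters, so the following is only a rough sketch of the general idea, not the authors' method. It assumes two hypothetical parameters (the mean intensity change between successive gaze-centred regions, and the displacement of the gaze point itself) and hypothetical thresholds, and it marks a temporal cut wherever either parameter exceeds its threshold. All function names and default values are illustrative.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[Sequence[float]]  # grayscale image as rows of pixel intensities
Point = Tuple[int, int]            # gaze position as (x, y) in pixel coordinates


def crop_roi(frame: Frame, gaze: Point, half: int) -> List[List[float]]:
    """Return the square patch centred on the gaze point, clipped to the frame.

    Assumes the gaze point lies inside the frame, so the patch is never empty.
    """
    x, y = gaze
    height, width = len(frame), len(frame[0])
    top, bottom = max(0, y - half), min(height, y + half + 1)
    left, right = max(0, x - half), min(width, x + half + 1)
    return [list(row[left:right]) for row in frame[top:bottom]]


def roi_change(a: List[List[float]], b: List[List[float]]) -> float:
    """Hypothetical parameter 1: mean absolute intensity difference
    over the overlapping area of two patches."""
    rows = min(len(a), len(b))
    cols = min(len(a[0]), len(b[0]))
    total = sum(abs(a[r][c] - b[r][c]) for r in range(rows) for c in range(cols))
    return total / (rows * cols)


def gaze_shift(p: Point, q: Point) -> float:
    """Hypothetical parameter 2: Euclidean distance between two gaze points."""
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5


def temporal_cuts(
    frames: Sequence[Frame],
    gazes: Sequence[Point],
    half: int = 1,                # assumed ROI half-width in pixels
    change_thresh: float = 50.0,  # assumed threshold on roi_change
    shift_thresh: float = 5.0,    # assumed threshold on gaze_shift
) -> List[int]:
    """Return indices of frames at which a new segment is assumed to start."""
    cuts = []
    prev = crop_roi(frames[0], gazes[0], half)
    for i in range(1, len(frames)):
        cur = crop_roi(frames[i], gazes[i], half)
        changed = roi_change(prev, cur) > change_thresh
        shifted = gaze_shift(gazes[i - 1], gazes[i]) > shift_thresh
        if changed or shifted:
            cuts.append(i)
        prev = cur
    return cuts
```

Combining the two tests with `or` is one of several possible combinations; using `and`, or either parameter alone, would give different cut sets, which is the kind of comparison the abstract alludes to.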

updated: Wed Jun 23 2021 10:46:10 GMT+0000 (UTC)

published: Sat Sep 30 2017 12:19:41 GMT+0000 (UTC)

arXiv
