Real-time Multi-person Eyeblink Detection in the Wild for Untrimmed Video

Wenzheng Zeng; Yang Xiao; Sicheng Wei; Jinfang Gan; Xintao Zhang; Zhiguo Cao; Zhiwen Fang; Joey Tianyi Zhou

Real-time eyeblink detection in the wild is broadly useful for fatigue detection, face anti-spoofing, emotion analysis, and similar applications. Existing research generally focuses on the single-person case in trimmed video. However, the multi-person scenario in untrimmed video is also important for practical applications and has not yet received much attention. To address this, we shed light on this research field for the first time, with essential contributions on dataset, theory, and practice. In particular, we propose a large-scale dataset termed MPEblink, which contains 686 untrimmed videos with 8748 eyeblink events under multi-person conditions. The samples are captured from unconstrained films to reveal "in the wild" characteristics. We also propose a real-time multi-person eyeblink detection method. Unlike existing counterparts, our method runs in a one-stage spatio-temporal way with end-to-end learning capacity. Specifically, it simultaneously addresses the sub-tasks of face detection, face tracking, and human instance-level eyeblink detection. This paradigm has two main advantages: (1) eyeblink features can be enhanced by the face's global context (e.g., head pose and illumination condition) through joint optimization and interaction, and (2) addressing these sub-tasks in parallel rather than sequentially saves considerable time, helping to meet the real-time running requirement. Experiments on MPEblink verify the essential challenges of real-time multi-person eyeblink detection in the wild for untrimmed video. Our method also outperforms existing approaches by large margins while running at a high inference speed.
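To make "instance-level eyeblink detection in untrimmed video" concrete, the sketch below shows one plausible shape for the output: for each tracked face, a list of blink events given as start and end frame indices. It is only an illustration and not the method described in the abstract. It assumes that some upstream model has already produced a per-frame blink score for each tracked person, and the names `scores_to_events`, `group_by_instance`, `threshold`, and `min_len` are hypothetical.

```python
from typing import Dict, List, Tuple

Event = Tuple[int, int]  # (start_frame, end_frame), both inclusive


def scores_to_events(scores: List[float],
                     threshold: float = 0.5,
                     min_len: int = 2) -> List[Event]:
    """Turn per-frame blink scores for ONE tracked face into blink events.

    Assumption (not from the paper): a blink is a run of at least
    `min_len` consecutive frames whose score is >= `threshold`.
    """
    events: List[Event] = []
    start = None  # index where the current above-threshold run began
    for i, s in enumerate(scores):
        if s >= threshold:
            if start is None:
                start = i
        elif start is not None:
            # The run ended at frame i - 1; keep it only if long enough.
            if i - start >= min_len:
                events.append((start, i - 1))
            start = None
    # Close a run that extends to the last frame of the video.
    if start is not None and len(scores) - start >= min_len:
        events.append((start, len(scores) - 1))
    return events


def group_by_instance(per_instance_scores: Dict[int, List[float]],
                      threshold: float = 0.5,
                      min_len: int = 2) -> Dict[int, List[Event]]:
    """Apply the event extraction independently to every tracked face ID."""
    return {
        track_id: scores_to_events(scores, threshold, min_len)
        for track_id, scores in per_instance_scores.items()
    }
```

Because the video is untrimmed, a single track can contain any number of events, including none, which is why the result maps each hypothetical track ID to a list rather than to a single yes/no label.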

updated: Mon Aug 21 2023 14:18:55 GMT+0000 (UTC)

published: Tue Mar 28 2023 15:35:25 GMT+0000 (UTC)

arXiv
