Speech, Head, and Eye-based Cues for Continuous Affect Prediction

Jonny O'Dwyer

Continuous affect prediction involves the discrete time-continuous regression of affect dimensions, most often arousal and valence. Researchers in this area are now embracing multimodal model input, which motivates the investigation of previously unexplored affective cues. Speech-based cues have traditionally received the most attention for affect prediction. Non-verbal inputs, however, have significant potential to increase the performance of affective computing systems and, in addition, allow affect modelling in the absence of speech. Among non-verbal inputs, eye and head-based cues have received little attention for continuous affect prediction. The eyes are involved in emotion displays and perception, while head-based cues have been shown to contribute to emotion conveyance and perception. Additionally, these cues can be estimated non-invasively from video using modern computer vision tools. This work addresses this gap by comprehensively investigating head and eye-based features and their combination with speech for continuous affect prediction. Hand-crafted, automatically generated, and CNN-learned features from these modalities will be investigated. The highest performing feature sets and feature set combinations will show how effective these features are for predicting an individual's affective state.

updated: Thu Jan 23 2020 15:59:13 GMT+0000 (UTC)

published: Tue Jul 23 2019 14:46:51 GMT+0000 (UTC)

arXiv
