Learning Unseen Emotions from Gestures via Semantically-Conditioned Zero-Shot Perception with Adversarial Autoencoders

Abhishek Banerjee; Uttaran Bhattacharya; Aniket Bera

We present a novel generalized zero-shot algorithm to recognize perceived emotions from gestures. Our task is to map gestures to novel emotion categories not encountered in training. We introduce an adversarial, autoencoder-based representation learning method that correlates 3D motion-captured gesture sequences with vectorized representations of natural-language perceived emotion terms, obtained from word2vec embeddings. The language-semantic embedding provides a representation of the emotion label space, and we leverage this underlying distribution to map the gesture sequences to the appropriate categorical emotion labels. We train our method using a combination of gestures annotated with known emotion terms and gestures not annotated with any emotions. We evaluate our method on the MPI Emotional Body Expressions Database (EBEDB) and obtain an accuracy of 58.43%. This improves on the performance of current state-of-the-art algorithms for generalized zero-shot learning by an absolute 25-27%.
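The final step described in the abstract, mapping a gesture to a categorical emotion label through the semantic embedding space, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a trained encoder (not shown) has already projected a gesture sequence into the same space as the word embeddings, and that the label is chosen by cosine similarity, which the abstract does not specify. The function names, the emotion terms, and the 3-dimensional toy vectors standing in for word2vec embeddings are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_emotion(gesture_embedding, label_embeddings):
    """Return the label whose embedding is most similar to the gesture embedding.

    In the generalized zero-shot setting, label_embeddings holds both the
    emotion terms seen in training and the unseen ones.
    """
    return max(
        label_embeddings,
        key=lambda label: cosine_similarity(gesture_embedding, label_embeddings[label]),
    )


# Toy stand-ins for word2vec vectors (assumption: real ones have hundreds of dimensions).
label_embeddings = {
    "joy": [0.9, 0.1, 0.0],     # hypothetical seen label
    "anger": [0.0, 0.9, 0.1],   # hypothetical seen label
    "relief": [0.7, 0.0, 0.6],  # hypothetical label with no annotated training gestures
}

# Hypothetical encoder output for one gesture sequence.
gesture_embedding = [0.6, 0.05, 0.5]

predicted = predict_emotion(gesture_embedding, label_embeddings)
print(predicted)
```

Because the candidate set contains seen and unseen terms alike, a gesture can be assigned a label that never appeared with annotated gestures during training, which is the property the generalized zero-shot setting requires.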

updated: Thu Dec 02 2021 08:16:02 GMT+0000 (UTC)

published: Fri Sep 18 2020 15:59:44 GMT+0000 (UTC)

arXiv
