Multi-scale multi-modal micro-expression recognition algorithm based on transformer

Fengping Wang; Jie Li; Chun Qi; Lin Wang; Pan Wang

A micro-expression is a spontaneous, unconscious facial muscle movement that can reveal the true emotions people attempt to hide. Hand-crafted methods have made good progress, and deep learning is gaining prominence. However, because micro-expressions are short in duration and are expressed at different scales across facial regions, existing algorithms cannot extract multi-modal, multi-scale facial region features while also taking contextual information into account to learn the underlying features. To address these problems, this paper proposes a multi-modal multi-scale algorithm based on a transformer network, which aims to fully learn the local multi-grained features of micro-expressions through two modalities: motion features and texture features. To obtain local facial features at different scales, we learn patch features at different scales for both modalities, fuse multi-layer multi-head attention weights to obtain effective features by weighting the patch features, and combine this with cross-modal contrastive learning for model optimization. We conducted comprehensive experiments on three spontaneous datasets. The results show that the proposed algorithm reaches an accuracy of up to 78.73% in the single-database evaluation on SMIC and an F1 score of up to 0.9071 on CASME II in the combined-database evaluation, which is at the leading level.
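Two ideas named in the abstract, patch features at several scales and cross-modal contrastive learning, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `split_patches` and `info_nce`, the patch sizes, the temperature, and the choice of a symmetric InfoNCE loss are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def split_patches(image, patch):
    """Split an (H, W) array into non-overlapping patch x patch tiles.

    Returns one flattened tile per row. Hypothetical helper, not from the paper.
    """
    h, w = image.shape
    if h % patch or w % patch:
        raise ValueError("image size must be divisible by patch size")
    tiles = image.reshape(h // patch, patch, w // patch, patch).swapaxes(1, 2)
    return tiles.reshape(-1, patch * patch)


def info_nce(motion, texture, temperature=0.1):
    """Symmetric InfoNCE loss between two sets of embeddings.

    Row i of `motion` and row i of `texture` are assumed to describe the same
    sample in two modalities. The loss form is an assumption, not the paper's.
    """
    m = motion / np.linalg.norm(motion, axis=1, keepdims=True)
    t = texture / np.linalg.norm(texture, axis=1, keepdims=True)
    logits = m @ t.T / temperature  # cosine similarities, scaled

    def cross_entropy_diag(z):
        # Numerically stable log-softmax; the matching pair sits on the diagonal.
        z = z - z.max(axis=1, keepdims=True)
        log_prob = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits.T))


# Usage: tile a placeholder 64x64 "face" at three assumed scales.
face = np.arange(64 * 64, dtype=float).reshape(64, 64)
multi_scale = {p: split_patches(face, p) for p in (8, 16, 32)}
# multi_scale[8].shape == (64, 64); [16] -> (16, 256); [32] -> (4, 1024)
```

Smaller patches give many fine-grained tokens and larger patches give fewer, coarser ones. The contrastive term is low when each sample's motion and texture embeddings are closer to each other than to those of other samples.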

updated: Wed Jan 11 2023 03:04:42 GMT+0000 (UTC)

published: Sun Jan 08 2023 03:45:23 GMT+0000 (UTC)

arXiv
