Progressive Spatio-Temporal Bilinear Network with Monte Carlo Dropout for Landmark-based Facial Expression Recognition with Uncertainty Estimation

Negar Heidari; Alexandros Iosifidis

Deep neural networks are widely used for feature learning in facial expression recognition systems. However, small datasets and large intra-class variability can lead to overfitting. In this paper, we propose a method that learns an optimized, compact network topology for real-time facial expression recognition using localized facial landmark features. Our method employs a spatio-temporal bilinear layer as its backbone to effectively capture the motion of facial landmarks during the execution of a facial expression. It also uses Monte Carlo Dropout to capture the model's uncertainty, which is important for analyzing and handling uncertain cases. We evaluate the method on three widely used datasets, where its performance is comparable to that of video-based state-of-the-art methods at much lower complexity.
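The Monte Carlo Dropout idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe the network itself, so this is not the authors' model: it assumes a toy linear classifier over a flat feature vector, and every name in it (`stochastic_forward`, `mc_dropout_predict`, `drop_prob`, `num_samples`) is hypothetical. Dropout stays active at inference time, the class probabilities are averaged over several stochastic forward passes, and the entropy of the averaged distribution serves as one common uncertainty score.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def stochastic_forward(features, weights, biases, drop_prob, rng):
    """One forward pass of a toy linear classifier with dropout kept active.

    Assumption: dropout is applied to the input features (inverted dropout),
    which stands in for dropout inside a real network.
    """
    keep = 1.0 - drop_prob
    dropped = [x / keep if rng.random() < keep else 0.0 for x in features]
    logits = [
        sum(w * x for w, x in zip(row, dropped)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)


def mc_dropout_predict(features, weights, biases,
                       drop_prob=0.5, num_samples=100, seed=0):
    """Average class probabilities over num_samples stochastic passes.

    Returns (predicted class index, mean probabilities, predictive entropy).
    """
    rng = random.Random(seed)
    num_classes = len(weights)
    mean_probs = [0.0] * num_classes
    for _ in range(num_samples):
        probs = stochastic_forward(features, weights, biases, drop_prob, rng)
        for k in range(num_classes):
            mean_probs[k] += probs[k] / num_samples
    # Entropy of the averaged distribution: higher means more uncertain.
    entropy = -sum(p * math.log(p) for p in mean_probs if p > 0.0)
    predicted = max(range(num_classes), key=mean_probs.__getitem__)
    return predicted, mean_probs, entropy


if __name__ == "__main__":
    # Made-up weights for 3 classes and 4 features, for illustration only.
    toy_weights = [[0.9, -0.2, 0.1, 0.0],
                   [-0.3, 0.8, 0.0, 0.2],
                   [0.1, 0.1, 0.7, -0.4]]
    toy_biases = [0.0, 0.0, 0.0]
    label, probs, uncertainty = mc_dropout_predict(
        [1.0, 0.5, -0.2, 0.3], toy_weights, toy_biases)
    print(label, [round(p, 3) for p in probs], round(uncertainty, 3))
```

In such a setup, a sample whose entropy exceeds a chosen threshold could be flagged for separate handling rather than trusted outright. The threshold is an application-level choice that the abstract does not specify.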

updated: Tue Jun 08 2021 13:40:30 GMT+0000 (UTC)

published: Tue Jun 08 2021 13:40:30 GMT+0000 (UTC)

arXiv
