Facial Action Unit Detection and Intensity Estimation from Self-supervised Representation

Bowen Ma; Rudong An; Wei Zhang; Yu Ding; Zeng Zhao; Rongsheng Zhang; Tangjie Lv; Changjie Fan; Zhipeng Hu

Facial action unit (FAU) analysis, which covers tasks such as detection and intensity estimation, is a fine-grained, local measurement of facial expression behavior. Its annotation is known to be time-consuming, labor-intensive, and error-prone. A long-standing challenge of FAU analysis therefore arises from the scarcity of manually annotated data, which largely limits the generalization ability of trained models. Many previous works have tried to alleviate this issue with semi-supervised or weakly supervised methods and with extra auxiliary information. However, these methods still require domain knowledge and have not yet removed the heavy dependency on data annotation. This paper introduces MAE-Face, a robust facial representation model for AU analysis. Using masked autoencoding as its self-supervised pre-training approach, MAE-Face first learns a high-capacity model from a collection of face images that is feasible to gather, without additional data annotations. After being fine-tuned on AU datasets, MAE-Face shows convincing performance on both AU detection and AU intensity estimation, achieving a new state of the art on nearly all evaluation results. Further investigation shows that MAE-Face achieves decent performance even when fine-tuned on only 1% of the AU training set, which demonstrates its robustness and generalization ability.
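The abstract names masked autoencoding as the pre-training approach but does not spell out the mechanics. The sketch below illustrates only the general idea: hide a random subset of image patches, and score reconstruction on the hidden patches alone. It is not the authors' implementation. The function names, the 75% mask ratio, and the 14 x 14 patch grid are all assumptions chosen for illustration, not values taken from this page.

```python
import random


def random_patch_mask(num_patches, mask_ratio=0.75, seed=None):
    """Split patch indices into (visible, masked) lists.

    mask_ratio is an assumed hyperparameter, not a value from the abstract.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)
    num_masked = int(num_patches * mask_ratio)
    # The first num_masked shuffled indices are hidden from the encoder.
    masked = sorted(indices[:num_masked])
    visible = sorted(indices[num_masked:])
    return visible, masked


def masked_mse(pred, target, masked):
    """Mean squared error computed over the masked patches only."""
    if not masked:
        raise ValueError("no masked patches to score")
    total = 0.0
    count = 0
    for i in masked:
        for p, t in zip(pred[i], target[i]):
            total += (p - t) ** 2
            count += 1
    return total / count


# Assumed setup: a 224x224 image cut into 16x16 patches gives 14 * 14 = 196 patches.
visible, masked = random_patch_mask(196, mask_ratio=0.75, seed=0)
```

In a full pipeline, an encoder would see only the `visible` patches and a decoder would predict pixel values for the `masked` ones. Because no labels are involved at this stage, pre-training of this kind can use unannotated face images, which matches the motivation given in the abstract.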

updated: Fri Oct 28 2022 03:55:09 GMT+0000 (UTC)

published: Fri Oct 28 2022 03:55:09 GMT+0000 (UTC)

arXiv
