Complex Facial Expression Recognition Using Deep Knowledge Distillation of Basic Features

Angus Maiden; Bahareh Nakisa

Complex emotion recognition is a cognitive task on which machines have so far not matched the excellent performance they achieve on other tasks, where they perform at or above the level of human cognition. Emotion recognition through facial expressions is particularly difficult due to the complexity of emotions expressed by the human face. For a machine to approach the same level of performance in this domain as a human, it may need to synthesise knowledge and understand new concepts in real time, as humans do. Humans are able to learn new concepts using only a few examples, by distilling the important information from memories and discarding the rest. Similarly, continual learning methods learn new classes whilst retaining the knowledge of known classes, and few-shot learning methods are able to learn new classes using very few training examples. We propose a novel continual learning method, inspired by human cognition and learning, that can accurately recognise new compound expression classes using few training samples by building on and retaining its knowledge of basic expression classes. Using GradCAM visualisations, we demonstrate the relationship between basic and compound facial expressions, which our method leverages through knowledge distillation and a novel Predictive Sorting Memory Replay. Our method achieves the current state of the art in continual learning for complex facial expression recognition, with 74.28% Overall Accuracy on new classes. We also demonstrate that using continual learning for complex facial expression recognition achieves far better performance than non-continual learning methods, improving on state-of-the-art non-continual learning methods by 13.95%. To the best of our knowledge, our work is also the first to apply few-shot learning to complex facial expression recognition, achieving the state of the art with 100% accuracy using a single training sample for each expression class.
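The abstract names knowledge distillation but does not state the loss that is used. As a rough illustration only, the sketch below shows one common generic form of a distillation loss: the KL divergence between temperature-softened teacher and student outputs, scaled by the squared temperature. The function names, the temperature value, and the idea of treating a model trained on basic expressions as the teacher are assumptions made for this example, not details taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over a list of logits, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Generic distillation loss (an assumption, not the paper's exact loss).

    Computes KL(teacher || student) on temperature-softened distributions
    and multiplies by temperature**2, as is commonly done so that gradient
    magnitudes stay comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # softened student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


# Hypothetical logits over three basic expression classes.
teacher = [2.0, 0.5, -1.0]
student = [1.5, 0.7, -0.8]
loss = distillation_loss(student, teacher)
```

The loss is zero when the student reproduces the teacher's logits and grows as their softened predictions diverge, which is the sense in which a student can retain what the teacher knows about previously learned classes while it is trained on new ones.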

updated: Fri Aug 11 2023 15:42:48 GMT+0000 (UTC)

published: Fri Aug 11 2023 15:42:48 GMT+0000 (UTC)

arXiv
