MixAugment & Mixup: Augmentation Methods for Facial Expression Recognition

Andreas Psaroudakis; Dimitrios Kollias

Automatic Facial Expression Recognition (FER) has attracted increasing attention over the last 20 years, since facial expressions play a central role in human communication. Most FER methodologies rely on Deep Neural Networks (DNNs), which are powerful tools for data analysis. Despite their power, however, these networks are prone to overfitting, as they tend to memorize the training data. Moreover, there are currently few large in-the-wild (i.e., captured in unconstrained environments) databases for FER. To alleviate this issue, a number of data augmentation techniques have been proposed. Data augmentation increases the diversity of the available data by applying constrained transformations to the original data. One such technique, which has benefited various classification tasks, is Mixup, in which a DNN is trained on convex combinations of pairs of examples and their corresponding labels. In this paper, we examine the effectiveness of Mixup for in-the-wild FER, where data exhibit large variations in head pose, illumination conditions, background and context. We then propose a new data augmentation strategy based on Mixup, called MixAugment, in which the network is trained concurrently on a combination of virtual examples and real examples; all of these examples contribute to the overall loss function. We conduct an extensive experimental study that demonstrates the effectiveness of MixAugment over Mixup and various state-of-the-art methods. We further investigate the combination of dropout with Mixup and MixAugment, as well as the combination of other data augmentation techniques with MixAugment.
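
The two ideas described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names (mixup_batch, cross_entropy, mixaugment_style_loss), the sampling of the mixing coefficient from a Beta(alpha, alpha) distribution (as in the original Mixup formulation), and the equal weighting of the virtual and real loss terms are assumptions made for illustration; the abstract only states that both kinds of examples contribute to the overall loss.

```python
import numpy as np


def mixup_batch(x, y, alpha=1.0, rng=None):
    """Build virtual examples as convex combinations of example pairs.

    x: array of shape (batch, ...); y: one-hot labels of shape (batch, classes).
    Assumption: lam ~ Beta(alpha, alpha), and pairs are formed by shuffling
    the batch against itself.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(x))
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y + (1.0 - lam) * y[perm]
    return x_mix, y_mix, lam


def cross_entropy(probs, targets, eps=1e-12):
    """Mean cross-entropy between predicted probabilities and soft targets."""
    return float(-np.mean(np.sum(targets * np.log(probs + eps), axis=1)))


def mixaugment_style_loss(predict, x, y, alpha=1.0, rng=None):
    """Loss over virtual (mixed) and real examples of the same batch.

    predict: hypothetical callable mapping inputs to class probabilities.
    Assumption: the two terms are simply added with equal weight.
    """
    x_mix, y_mix, _ = mixup_batch(x, y, alpha, rng)
    loss_virtual = cross_entropy(predict(x_mix), y_mix)
    loss_real = cross_entropy(predict(x), y)
    return loss_virtual + loss_real
```

In a training loop, predict would be the forward pass of the network and the returned scalar would be the quantity minimized. Plain Mixup corresponds to keeping only the loss_virtual term.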

updated: Mon May 09 2022 17:43:08 GMT+0000 (UTC)

published: Mon May 09 2022 17:43:08 GMT+0000 (UTC)

arXiv
