Adaptive Feature Fusion Network for Gaze Tracking in Mobile Tablets

Yiwei Bao; Yihua Cheng; Yunfei Liu; Feng Lu

Many multi-stream gaze estimation methods have been proposed recently. They estimate gaze from eye and face appearance and achieve reasonable accuracy. However, most of them simply concatenate the features extracted from the eye and face images, ignoring the feature fusion process. In this paper, we propose a novel Adaptive Feature Fusion Network (AFF-Net) that performs gaze tracking on mobile tablets. We stack the feature maps of the two eyes and use Squeeze-and-Excitation layers to adaptively fuse the two-eye features according to their similarity in appearance. We also propose Adaptive Group Normalization to recalibrate the eye features under the guidance of facial features. Extensive experiments on both the GazeCapture and MPIIFaceGaze datasets demonstrate the consistently superior performance of the proposed method.
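
The two fusion ideas named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`se_fuse`, `adaptive_group_norm`), the tensor shapes, the random weights, and the choice to derive the scale and shift as plain linear projections of a face feature vector are all assumptions made for illustration. The abstract only states that two-eye feature maps are stacked and reweighted with Squeeze-and-Excitation, and that eye features are recalibrated under the guidance of facial features.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def se_fuse(left, right, w1, w2):
    """Stack two eye feature maps and reweight channels with an SE block.

    left, right: arrays of shape (C, H, W).
    w1: (hidden, 2C) and w2: (2C, hidden) are hypothetical FC weights.
    """
    stacked = np.concatenate([left, right], axis=0)  # (2C, H, W)
    squeezed = stacked.mean(axis=(1, 2))             # squeeze: global average pool
    hidden = np.maximum(w1 @ squeezed, 0.0)          # excitation: FC + ReLU
    scale = sigmoid(w2 @ hidden)                     # per-channel gate in (0, 1)
    return stacked * scale[:, None, None]


def adaptive_group_norm(x, face_feat, w_gamma, w_beta, groups, eps=1e-5):
    """Group-normalize x, then scale and shift it using the face feature.

    Assumption: gamma and beta are linear projections of face_feat.
    x: (C, H, W); face_feat: (F,); w_gamma, w_beta: (C, F).
    """
    c, h, w = x.shape
    grouped = x.reshape(groups, -1)                  # one row per channel group
    mean = grouped.mean(axis=1, keepdims=True)
    var = grouped.var(axis=1, keepdims=True)
    normed = ((grouped - mean) / np.sqrt(var + eps)).reshape(c, h, w)
    gamma = w_gamma @ face_feat                      # (C,) scale from the face
    beta = w_beta @ face_feat                        # (C,) shift from the face
    return normed * gamma[:, None, None] + beta[:, None, None]


# Toy inputs; all sizes are arbitrary.
rng = np.random.default_rng(0)
C, H, W, F = 4, 6, 6, 8
left = rng.standard_normal((C, H, W))
right = rng.standard_normal((C, H, W))
face = rng.standard_normal(F)

fused = se_fuse(left, right,
                rng.standard_normal((2, 2 * C)),
                rng.standard_normal((2 * C, 2)))
recalibrated = adaptive_group_norm(fused, face,
                                   rng.standard_normal((2 * C, F)),
                                   rng.standard_normal((2 * C, F)),
                                   groups=2)
print(fused.shape, recalibrated.shape)
```

In this sketch the SE gate depends on statistics of both eyes at once, so the weight given to each eye's channels can change with the input, unlike plain concatenation. The normalization step likewise lets the face feature set the scale and shift of the eye features instead of using fixed learned parameters.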

updated: Sat Mar 20 2021 07:16:10 GMT+0000 (UTC)

published: Sat Mar 20 2021 07:16:10 GMT+0000 (UTC)

arXiv
