We propose a condition-adaptive representation learning framework for driver drowsiness detection based on a 3D deep convolutional neural network. The proposed framework consists of four models: spatio-temporal representation learning, scene condition understanding, feature fusion, and drowsiness detection. The spatio-temporal representation learning model extracts features that describe motion and appearance in video simultaneously. The scene condition understanding model classifies the scene conditions relating to the driver and the driving situation, such as whether the driver is wearing glasses, the illumination condition, and the motion of facial elements such as the head, eyes, and mouth. The feature fusion model generates a condition-adaptive representation from the two features extracted by the preceding models. The detection model recognizes the driver's drowsiness status using the condition-adaptive representation. Because the condition-adaptive representation focuses on each scene condition, the framework can extract more discriminative features than a general representation, so that the drowsiness detection method gives more accurate results across various driving situations. The proposed framework is evaluated on the NTHU Drowsy Driver Detection video dataset. The experimental results show that our framework outperforms existing drowsiness detection methods based on visual analysis.
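The data flow from the two extracted features to a drowsiness score can be illustrated with a minimal sketch. The abstract does not specify the fusion operator or the detector, so everything below is an assumption made for illustration: `fuse` uses a flattened outer product of the feature vector and the scene-condition probabilities, `detect` is a plain logistic unit, and the toy vectors stand in for the outputs of the 3D convolutional and scene-understanding models. None of these names or choices are taken from the paper.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(features: Sequence[float], condition_probs: Sequence[float]) -> List[float]:
    """Hypothetical fusion: flattened outer product of the two vectors.

    Each block of the output is the feature vector scaled by the
    probability of one scene condition, so the representation is
    weighted toward the conditions judged most likely.
    """
    return [p * f for p in condition_probs for f in features]


def detect(representation: Sequence[float],
           weights: Sequence[float],
           bias: float = 0.0) -> float:
    """Hypothetical detector: logistic unit on the fused vector."""
    if len(representation) != len(weights):
        raise ValueError("weights must match the fused representation length")
    z = sum(r * w for r, w in zip(representation, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Toy values standing in for the outputs of the two upstream models.
features = [0.2, -1.0, 0.5]                # spatio-temporal feature vector
condition_scores = [2.0, 0.1, -1.0, 0.3]   # one score per assumed scene condition
condition_probs = softmax(condition_scores)

fused = fuse(features, condition_probs)    # condition-adaptive representation
weights = [0.1] * len(fused)               # untrained placeholder weights
drowsy_prob = detect(fused, weights)
print(len(fused), drowsy_prob)
```

With three features and four scene conditions, the fused vector has twelve entries. In a trained system the placeholder weights would be learned jointly with the upstream models rather than fixed by hand.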
updated: Tue Oct 22 2019 01:51:43 GMT+0000 (UTC)
published: Tue Oct 22 2019 01:51:43 GMT+0000 (UTC)