Computerized cognitive training (CCT) is a scalable, well-tolerated intervention that has promise for slowing cognitive decline. Outcomes from CCT are limited by a lack of effective engagement, which is decreased by factors such as mental fatigue, particularly in older adults at risk for dementia. There is a need for scalable, automated measures that can monitor mental fatigue during CCT. Here, we develop and validate a novel Recurrent Video Transformer (RVT) method for monitoring real-time mental fatigue in older adults with mild cognitive impairment from video-recorded facial gestures during CCT. The RVT model achieved the highest balanced accuracy(78%) and precision (0.82) compared to the prior state-of-the-art models for binary and multi-class classification of mental fatigue and was additionally validated via significant association (p=0.023) with CCT reaction time. By leveraging dynamic temporal information, the RVT model demonstrates the potential to accurately measure real-time mental fatigue, laying the foundation for future personalized CCT that increase effective engagement.