Instantaneous Physiological Estimation using Video Transformers

Ambareesh Revanur; Ananyananda Dasari; Conrad S. Tucker; Laszlo A. Jeni

Video-based physiological signal estimation has been limited primarily to predicting episodic scores over windowed intervals. While these intermittent values are useful, they provide an incomplete picture of a patient's physiological status and may delay the detection of critical conditions. We propose a video Transformer for estimating instantaneous heart rate and respiration rate from face videos. Physiological signals are typically confounded by alignment errors in space and time. To overcome this, we formulated the loss in the frequency domain. We evaluated the method on the large-scale Vision-for-Vitals (V4V) benchmark. It outperformed both shallow and deep-learning-based methods for instantaneous respiration rate estimation. For heart rate estimation, it achieved an instantaneous MAE of 13.0 beats per minute.
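
The abstract does not spell out the loss, so the following is only an illustrative sketch of why a frequency-domain formulation can tolerate temporal misalignment. It is not the authors' implementation. The function name `spectral_loss`, the 25 Hz frame rate, and the 1.2 Hz (72 beats per minute) test signal are all assumptions made for the example. The sketch compares normalized magnitude spectra, which are unchanged by a circular time shift, whereas a sample-wise error in the time domain penalizes the same shift heavily.

```python
import numpy as np


def spectral_loss(pred, target):
    """Mean squared difference between normalized magnitude spectra.

    Hypothetical helper for illustration; not the loss from the paper.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    # Remove the mean so the DC bin does not dominate the spectrum.
    p = np.abs(np.fft.rfft(pred - pred.mean()))
    t = np.abs(np.fft.rfft(target - target.mean()))
    # Normalize each spectrum to unit sum so the amplitude scale cancels out.
    eps = 1e-12
    p = p / (p.sum() + eps)
    t = t / (t.sum() + eps)
    return float(np.mean((p - t) ** 2))


fs = 25.0                      # assumed video frame rate in Hz
n = 250                        # 10 seconds of samples
time = np.arange(n) / fs
reference = np.sin(2 * np.pi * 1.2 * time)   # assumed 1.2 Hz pulse (72 bpm)
shifted = np.roll(reference, 5)              # same signal, delayed by 0.2 s

# Sample-wise error is large even though the underlying rate is identical.
time_mse = float(np.mean((reference - shifted) ** 2))
# The spectral comparison ignores the delay.
freq_loss = spectral_loss(reference, shifted)

print(f"time-domain MSE: {time_mse:.3f}")
print(f"frequency-domain loss: {freq_loss:.3e}")
```

In this toy setup the delayed signal carries exactly the same rate information as the reference, so a loss that compares spectra treats the two as equivalent, while the time-domain error does not.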

updated: Thu Feb 24 2022 21:25:09 GMT+0000 (UTC)

published: Thu Feb 24 2022 21:25:09 GMT+0000 (UTC)

arXiv
