Learning Motion-Robust Remote Photoplethysmography through Arbitrary Resolution Videos

Jianwei Li; Zitong Yu; Jingang Shi

Remote photoplethysmography (rPPG) enables non-contact heart rate (HR) estimation from facial videos, which is considerably more convenient than traditional contact-based measurement. In real-world long-term health monitoring, a participant's distance from the camera and head movements usually vary over time, and the resulting changes in face resolution and complex motion artifacts make rPPG measurement inaccurate. Unlike previous rPPG models, which were designed for a constant distance between camera and participant, in this paper we propose two plug-and-play blocks, a physiological signal feature extraction block (PFE) and a temporal face alignment block (TFA), to alleviate the degradation caused by changing distance and head motion. On one side, guided by representative-area information, PFE adaptively encodes facial frames of arbitrary resolution into fixed-resolution facial structure features. On the other side, by leveraging estimated optical flow, TFA counteracts the rPPG signal confusion caused by head movement and thus benefits motion-robust rPPG signal recovery. In addition, we train the model with a cross-resolution constraint in a two-stream dual-resolution framework, which further helps PFE learn resolution-robust facial rPPG features. Extensive experiments on three benchmark datasets (UBFC-rPPG, COHFACE, and PURE) demonstrate the superior performance of the proposed method. One highlight is that with PFE and TFA, off-the-shelf spatio-temporal rPPG models can predict more robust rPPG signals under both varying face resolution and severe head movement. The code is available at https://github.com/LJW-GIT/Arbitrary_Resolution_rPPG.
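The cross-resolution idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: `toy_rppg_model` is a hypothetical stand-in for a real rPPG network (it only averages pixel intensity per frame), average pooling stands in for a lower-resolution capture, the mean-squared-error penalty is an assumed form of the constraint, and all function names and parameters are illustrative. The sketch feeds the same clip at two resolutions through one model, penalizes disagreement between the two predicted signals, and reads HR off the dominant spectral peak.

```python
import numpy as np


def make_synthetic_clip(n_frames=300, size=64, fps=30.0, pulse_hz=1.2, seed=0):
    """Build a synthetic grayscale clip whose brightness pulses at pulse_hz (assumed toy data)."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_frames) / fps
    pulse = 0.05 * np.sin(2.0 * np.pi * pulse_hz * t)
    noise = 0.01 * rng.standard_normal((n_frames, size, size))
    return 0.5 + pulse[:, None, None] + noise


def downsample(frames, factor):
    """Average-pool every frame by an integer factor to mimic a lower face resolution."""
    t, h, w = frames.shape
    h2, w2 = h // factor, w // factor
    cropped = frames[:, : h2 * factor, : w2 * factor]
    return cropped.reshape(t, h2, factor, w2, factor).mean(axis=(2, 4))


def toy_rppg_model(frames):
    """Hypothetical stand-in for an rPPG network: spatial mean per frame, zero-mean in time."""
    signal = frames.mean(axis=(1, 2))
    return signal - signal.mean()


def cross_resolution_loss(frames, factor_a=1, factor_b=2):
    """Assumed constraint: MSE between signals predicted from two resolutions of the same clip."""
    sig_a = toy_rppg_model(downsample(frames, factor_a))
    sig_b = toy_rppg_model(downsample(frames, factor_b))
    return float(np.mean((sig_a - sig_b) ** 2))


def estimate_hr_bpm(signal, fps, low_hz=0.7, high_hz=4.0):
    """Estimate HR as the strongest spectral peak inside a plausible heart-rate band."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    power = np.abs(np.fft.rfft(signal)) ** 2
    band = (freqs >= low_hz) & (freqs <= high_hz)
    return 60.0 * freqs[band][np.argmax(power[band])]


if __name__ == "__main__":
    clip = make_synthetic_clip()
    print("cross-resolution loss:", cross_resolution_loss(clip))
    print("estimated HR (bpm):", estimate_hr_bpm(toy_rppg_model(clip), fps=30.0))
```

With this toy model the loss is essentially zero, because average pooling preserves the spatial mean; a learned network has no such guarantee, which is why an explicit cross-resolution term can be useful during training.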

updated: Fri Dec 02 2022 19:40:26 GMT+0000 (UTC)

published: Wed Nov 30 2022 11:50:08 GMT+0000 (UTC)

arXiv
