Real-Time Human Pose Estimation on a Smart Walker using Convolutional Neural Networks

Manuel Palermo; Sara Moccia; Lucia Migliorelli; Emanuele Frontoni; Cristina P. Santos

Rehabilitation is important for improving the quality of life of mobility-impaired patients. Smart walkers are a commonly used solution that should embed automatic and objective tools for data-driven human-in-the-loop control and monitoring. However, present solutions focus on extracting a few specific metrics from dedicated sensors, with no unified full-body approach. We investigate a general, real-time, full-body pose estimation framework based on two RGB+D camera streams with non-overlapping views, mounted on a smart walker used in rehabilitation. Human keypoint estimation is performed using a two-stage neural network framework. The 2D-Stage implements a detection module that locates body keypoints in the 2D image frames. The 3D-Stage implements a regression module that lifts the keypoints detected in both cameras and relates them to 3D space relative to the walker. Model predictions are low-pass filtered to improve temporal consistency. A custom acquisition method was used to obtain a dataset of 14 healthy subjects, which was used to train and evaluate the proposed framework offline; the framework was then deployed on the real walker equipment. Overall keypoint detection errors of 3.73 pixels for the 2D-Stage and 44.05 mm for the 3D-Stage were reported, with an inference time of 26.6 ms when deployed on the constrained hardware of the walker. We present a novel approach to patient monitoring and data-driven human-in-the-loop control in the context of smart walkers. It extracts a complete and compact body representation in real time from inexpensive sensors, serving as a common base for downstream metric extraction solutions and human-robot interaction applications. Despite promising results, more data should be collected on users with impairments to assess its performance as a rehabilitation tool in real-world scenarios.
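The abstract states that model predictions are low-pass filtered to improve temporal consistency, but it does not say which filter is used. As a rough illustration only, the sketch below assumes a first-order exponential smoothing filter applied independently to each keypoint coordinate. The class name `KeypointLowPass`, the `alpha` parameter, and the `(num_keypoints, 3)` array layout are hypothetical choices made for this example, not details taken from the paper.

```python
import numpy as np


class KeypointLowPass:
    """First-order (exponential) low-pass filter over per-frame keypoints.

    Assumption for illustration: the paper's actual filter is not specified
    in the abstract; this is one simple, commonly used choice.
    """

    def __init__(self, alpha: float):
        # alpha = 1.0 disables smoothing; smaller values smooth more strongly.
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.state = None  # filtered keypoints from the previous frame

    def update(self, keypoints) -> np.ndarray:
        """Blend the new prediction with the previous filtered estimate."""
        keypoints = np.asarray(keypoints, dtype=float)
        if self.state is None:
            # First frame: nothing to blend with, so pass it through.
            self.state = keypoints.copy()
        else:
            self.state = self.alpha * keypoints + (1.0 - self.alpha) * self.state
        return self.state.copy()


# Usage: smooth a stream of hypothetical (num_keypoints, 3) predictions in mm.
lowpass = KeypointLowPass(alpha=0.5)
first = lowpass.update(np.zeros((2, 3)))
second = lowpass.update(np.ones((2, 3)))
```

A smaller `alpha` suppresses more frame-to-frame jitter at the cost of added lag, so in a real-time setting the value would have to be tuned against the frame rate and the speed of the motion being tracked.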

updated: Mon Jun 28 2021 14:11:48 GMT+0000 (UTC)

published: Mon Jun 28 2021 14:11:48 GMT+0000 (UTC)

arXiv
