Deep learning-based approaches for human motion decoding in smart walkers for rehabilitation

Carolina Gonçalves; João M. Lopes; Sara Moccia; Daniele Berardini; Lucia Migliorelli; Cristina P. Santos

Gait disabilities are among the most frequent disabilities worldwide. Their treatment relies on rehabilitation therapies, in which smart walkers are being introduced to empower the user's recovery and autonomy while reducing the clinicians' effort. To that end, these devices should be able to decode human motion and needs as early as possible. Current walkers decode motion intention using information from wearable or embedded sensors, namely inertial units, force and Hall sensors, and lasers, whose main limitations are that they make the solution expensive or hinder the perception of human movement. Smart walkers commonly lack a seamless human-robot interaction that intuitively understands human motions. A contactless approach is proposed in this work, addressing human motion decoding as an early action recognition/detection problem using RGB-D cameras. We studied different deep learning-based algorithms, organised in three different approaches, to process lower body RGB-D video sequences recorded from an embedded camera of a smart walker and classify them into 4 classes (stop, walk, turn right/left). A custom dataset involving 15 healthy participants walking with the device was acquired and prepared, resulting in 28800 balanced RGB-D frames, to train and evaluate the deep networks. The best results were attained by a convolutional neural network with a channel attention mechanism, reaching accuracy values of 99.61% and above 93% for offline early detection/recognition and trial simulations, respectively. Following the hypothesis that human lower body features encode prominent information, fostering a more robust prediction towards real-time applications, the algorithm focus was also evaluated using the Dice metric, leading to values slightly higher than 30%. Promising results were attained for early action detection as a human motion decoding strategy, with enhancements in the focus of the proposed architectures.
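
As an illustration of the Dice metric mentioned in the abstract, the following is a minimal sketch of how the overlap between a network's focus and the lower-body region could be scored. The abstract does not describe the exact procedure, so everything here is an assumption: the idea that focus is a saliency map thresholded into a binary mask, the threshold of 0.5, the toy 3x3 arrays, and the helper names `binarize` and `dice_coefficient` are all hypothetical.

```python
def binarize(saliency, threshold=0.5):
    """Turn a saliency map (nested lists of floats) into a 0/1 mask.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return [[1 if value >= threshold else 0 for value in row] for row in saliency]


def dice_coefficient(pred_mask, true_mask):
    """Dice = 2 * |A intersect B| / (|A| + |B|) for two same-sized binary masks."""
    pred = [bool(v) for row in pred_mask for v in row]
    true = [bool(v) for row in true_mask for v in row]
    if len(pred) != len(true):
        raise ValueError("masks must contain the same number of pixels")
    intersection = sum(p and t for p, t in zip(pred, true))
    total = sum(pred) + sum(true)
    if total == 0:
        # Both masks empty: treated as perfect agreement by convention here.
        return 1.0
    return 2.0 * intersection / total


# Toy data (hypothetical): where the network "looks" vs. where the legs are.
saliency = [
    [0.1, 0.8, 0.9],
    [0.2, 0.7, 0.1],
    [0.0, 0.3, 0.6],
]
body_mask = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]

focus = binarize(saliency)
score = dice_coefficient(focus, body_mask)
print(f"Dice overlap: {score:.2f}")
```

A score of 1 would mean the thresholded focus coincides exactly with the lower-body mask, and 0 would mean no overlap, which gives a scale for reading the reported values of slightly above 30%.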

updated: Fri Jan 13 2023 14:29:44 GMT+0000 (UTC)

published: Fri Jan 13 2023 14:29:44 GMT+0000 (UTC)

arXiv
