From Hand-Perspective Visual Information to Grasp Type Probabilities: Deep Learning via Ranking Labels

Mo Han; Sezen Yağmur Günay; İlkay Yıldız; Paolo Bonato; Cagdas D. Onal; Taşkın Padır; Gunar Schirner; Deniz Erdoğmuş

Limb deficiency severely affects the daily lives of amputees and drives efforts to provide functional robotic prosthetic hands that compensate for this loss. Convolutional neural network-based computer vision control of the prosthetic hand has received increasing attention as a way to replace or complement physiological signals because of its reliability: a network is trained on visual information to predict the hand gesture. Mounting a camera in the palm of a prosthetic hand has proven to be a promising approach to collecting visual data. However, the grasp type labelled from the eye perspective and from the hand perspective may differ, because object shapes are not always symmetric. To represent this difference realistically, we employ a dataset containing synchronized images from the eye and hand views, where the hand-perspective images are used for training and the eye-view images are used only for manual labelling. Electromyogram (EMG) activity and movement kinematics data from the upper arm are also collected for multi-modal information fusion in future work. Moreover, to include human-in-the-loop control and to combine computer vision with physiological signal inputs, we build a novel probabilistic classifier based on the Plackett-Luce model instead of making absolute positive or negative predictions. To predict the probability distribution over grasps, we use manually ranked lists of grasps as a new form of label and exploit the statistical model over label rankings to solve the permutation-domain problem via maximum likelihood estimation. We show that the proposed model is applicable to the most popular and productive convolutional neural network frameworks.
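
The abstract does not give the exact loss or network used, so the following is only a minimal sketch of how a Plackett-Luce likelihood over a ranked list of grasps is commonly written. It assumes a network emits one real-valued score per grasp type, with exp(score) acting as the Plackett-Luce worth. The function names `plackett_luce_nll` and `grasp_probabilities`, and the example scores, are hypothetical and not taken from the paper.

```python
import numpy as np


def plackett_luce_nll(scores, ranking):
    """Negative log-likelihood of one ranking under a Plackett-Luce model.

    scores  : 1-D array, one real-valued score per grasp type
              (assumed network outputs; exp(score) is the PL worth).
    ranking : grasp indices ordered from most to least preferred
              (the manually ranked label).
    """
    # Reorder the scores so position 0 holds the top-ranked grasp.
    s = np.asarray(scores, dtype=float)[list(ranking)]
    # For each position i, compute log(sum_{j >= i} exp(s_j)) stably.
    m = s.max()
    suffix_lse = m + np.log(np.cumsum(np.exp(s[::-1] - m))[::-1])
    # PL likelihood: prod_i exp(s_i) / sum_{j >= i} exp(s_j).
    return float(np.sum(suffix_lse - s))


def grasp_probabilities(scores):
    """Probability that each grasp is ranked first (a softmax of the scores)."""
    s = np.asarray(scores, dtype=float)
    e = np.exp(s - s.max())
    return e / e.sum()


# Illustrative values only: three grasp types with made-up scores.
example_scores = [2.0, 1.0, 0.0]
loss_agree = plackett_luce_nll(example_scores, [0, 1, 2])
loss_reversed = plackett_luce_nll(example_scores, [2, 1, 0])
probs = grasp_probabilities(example_scores)
```

Minimizing this quantity over a training set of ranked labels is the maximum likelihood estimate; in a convolutional network the same expression would serve as the loss on the final-layer scores, and the first-choice probabilities give a distribution over grasps that could later be fused with EMG evidence.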

updated: Mon Mar 08 2021 16:12:38 GMT+0000 (UTC)

published: Mon Mar 08 2021 16:12:38 GMT+0000 (UTC)

arXiv
