CaRTS: Causality-driven Robot Tool Segmentation from Vision and Kinematics Data

Hao Ding; Jintan Zhang; Peter Kazanzides; Jieying Wu; Mathias Unberath

Vision-based segmentation of the robotic tool during robot-assisted surgery enables downstream applications, such as augmented reality feedback, while allowing for inaccuracies in robot kinematics. With the introduction of deep learning, many methods were presented to solve instrument segmentation directly and solely from images. While these approaches made remarkable progress on benchmark datasets, fundamental challenges pertaining to their robustness remain. We present CaRTS, a causality-driven robot tool segmentation algorithm designed around a complementary causal model of the robot tool segmentation task. Rather than directly inferring segmentation masks from observed images, CaRTS iteratively aligns tool models with image observations: it updates the initially incorrect robot kinematic parameters through forward kinematics and differentiable rendering to optimize image feature similarity end-to-end. We benchmark CaRTS against competing techniques on both synthetic and real data from the dVRK, generated in precisely controlled scenarios to allow for counterfactual synthesis. On training-domain test data, CaRTS achieves a Dice score of 93.4, which is well preserved (Dice score of 91.8) on counterfactually altered test data exhibiting low brightness, smoke, blood, and altered background patterns. This compares favorably to Dice scores of 95.0 and 62.8, respectively, for a purely image-based method trained and tested on the same data. Future work will involve accelerating CaRTS to achieve video frame rate and estimating the impact occlusion has in practice. Despite these limitations, our results are promising: in addition to achieving high segmentation accuracy, CaRTS provides estimates of the true robot kinematics, which may benefit applications such as force estimation.
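To make the reported metric concrete, the following is a minimal sketch of how a Dice score between a predicted and a ground-truth binary mask can be computed, using the standard definition 2|A∩B| / (|A| + |B|) scaled to 0-100 to match the numbers in the abstract. The function name `dice_score`, the toy masks, and the `reported` dictionary layout are illustrative assumptions, not code from the paper; only the four Dice values are taken from the abstract.

```python
import numpy as np


def dice_score(pred, target):
    """Dice overlap between two binary masks, on a 0-100 scale.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Both masks empty: treated here as perfect agreement (an assumption).
        return 100.0
    return 100.0 * 2.0 * intersection / total


# Toy masks: 3 foreground pixels each, 2 of them shared.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
toy_dice = dice_score(pred, target)

# Dice scores quoted in the abstract: (training domain, counterfactual test data).
reported = {
    "CaRTS": (93.4, 91.8),
    "image-only": (95.0, 62.8),
}
# Drop in Dice when moving from training-domain to counterfactual test data.
drops = {name: round(in_dom - cf, 1) for name, (in_dom, cf) in reported.items()}
```

With the abstract's numbers, the drop is 1.6 Dice points for CaRTS versus 32.2 for the purely image-based method, which is the robustness gap the abstract highlights.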

updated: Tue Mar 15 2022 22:26:19 GMT+0000 (UTC)

published: Tue Mar 15 2022 22:26:19 GMT+0000 (UTC)

arXiv
