RoboTAP: Tracking Arbitrary Points for Few-Shot Visual Imitation

Mel Vecerik; Carl Doersch; Yi Yang; Todor Davchev; Yusuf Aytar; Guangyao Zhou; Raia Hadsell; Lourdes Agapito; Jon Scholz

For robots to be useful outside labs and specialized factories, we need a way to teach them new useful behaviors quickly. Current approaches lack either the generality to onboard new tasks without task-specific engineering or the data efficiency to do so in an amount of time that enables practical use. In this work we explore dense tracking as a representational vehicle to allow faster and more general learning from demonstration. Our approach uses Track-Any-Point (TAP) models to isolate the relevant motion in a demonstration and to parameterize a low-level controller that reproduces this motion across changes in the scene configuration. We show that this results in robust robot policies that can solve complex object-arrangement tasks such as shape-matching and stacking, and even full path-following tasks such as applying glue and sticking objects together, all from demonstrations that can be collected in minutes.
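
As a rough illustration of the idea in the abstract (tracked points select the relevant motion, and a low-level controller is driven by them), here is a minimal sketch in plain Python. It is not the authors' implementation: the function names `select_moving_points` and `servo_step`, the displacement threshold `min_motion`, and the proportional `gain` are all hypothetical, and the point tracks are assumed to come from some TAP model as lists of pixel coordinates.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def select_moving_points(tracks: Sequence[Sequence[Point]], min_motion: float) -> List[int]:
    """Return indices of tracks whose start-to-end displacement exceeds min_motion (pixels).

    Assumption: "relevant" is approximated here as "moved during the demonstration".
    """
    selected = []
    for i, track in enumerate(tracks):
        (x0, y0), (x1, y1) = track[0], track[-1]
        if ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5 > min_motion:
            selected.append(i)
    return selected


def servo_step(
    current: Sequence[Point],
    goal: Sequence[Point],
    visible: Sequence[bool],
    gain: float = 0.5,
) -> Point:
    """Proportional image-space step: gain times the mean (goal - current) over visible points."""
    errors = [
        (gx - cx, gy - cy)
        for (cx, cy), (gx, gy), v in zip(current, goal, visible)
        if v  # skip occluded points, which carry no usable position
    ]
    if not errors:
        return (0.0, 0.0)  # nothing visible: command no motion
    n = len(errors)
    return (gain * sum(e[0] for e in errors) / n, gain * sum(e[1] for e in errors) / n)


if __name__ == "__main__":
    # Two hypothetical tracks: one static background point, one point on a moved object.
    demo_tracks = [[(10.0, 10.0), (10.0, 10.0)], [(20.0, 20.0), (50.0, 60.0)]]
    print(select_moving_points(demo_tracks, min_motion=5.0))
    print(servo_step([(0.0, 0.0), (10.0, 0.0)], [(4.0, 2.0), (14.0, 2.0)], [True, True]))
```

Under these assumptions, the step returned by `servo_step` would be applied repeatedly, re-tracking the points after each move, until the remaining image-space error is small.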

updated: Wed Aug 30 2023 11:57:04 GMT+0000 (UTC)

published: Wed Aug 30 2023 11:57:04 GMT+0000 (UTC)

arXiv
