Coarse-to-Fine Q-attention: Efficient Learning for Visual Robotic Manipulation via Discretisation

Stephen James; Kentaro Wada; Tristan Laidlow; Andrew J. Davison

We present a coarse-to-fine discretisation method that enables the use of discrete reinforcement learning approaches in place of unstable and data-inefficient actor-critic methods in continuous robotics domains. This approach builds on the recently released ARM algorithm, replacing its continuous next-best pose agent with a discrete one that uses coarse-to-fine Q-attention. Given a voxelised scene, coarse-to-fine Q-attention learns which part of the scene to 'zoom' into. Applied iteratively, this 'zooming' behaviour yields a near-lossless discretisation of the translation space and allows the use of a discrete-action deep Q-learning method. We show that our new coarse-to-fine algorithm achieves state-of-the-art performance on several difficult, sparsely rewarded RLBench vision-based robotics tasks, and can train real-world policies, tabula rasa, in a matter of minutes, with as few as 3 demonstrations.
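
The iterative 'zooming' described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the learned Q-function is replaced by a hypothetical stand-in (`toy_q`) that simply prefers voxels near a made-up target point, and the workspace size, grid size, and depth are illustrative values chosen for the example. What the sketch does show is why repeated zooming gives a fine translation resolution: each level picks the best voxel and re-voxelises only that voxel, so the final voxel side is the workspace side divided by `grid_size ** depth`.

```python
import itertools
import math


def voxel_centres(centre, extent, grid_size):
    """Yield (index, xyz) for each voxel of a cube with side `extent` centred at `centre`."""
    voxel = extent / grid_size
    for idx in itertools.product(range(grid_size), repeat=3):
        xyz = tuple(c - extent / 2 + (i + 0.5) * voxel for c, i in zip(centre, idx))
        yield idx, xyz


def coarse_to_fine(q_value, centre, extent, grid_size, depth):
    """Repeatedly pick the highest-scoring voxel and zoom into it.

    Returns the chosen translation and the side length of the final voxel.
    """
    for _ in range(depth):
        # Discrete action at this level: the voxel with the highest score.
        _, centre = max(voxel_centres(centre, extent, grid_size),
                        key=lambda item: q_value(item[1]))
        # The next, finer grid covers only the chosen voxel.
        extent = extent / grid_size
    return centre, extent


# Hypothetical stand-in for a learned Q-function (an assumption for this
# sketch): it scores voxels by closeness to a made-up target translation.
target = (0.31, -0.12, 0.27)


def toy_q(xyz):
    return -math.dist(xyz, target)


# Illustrative values: a 1 m cube, 16 voxels per axis, 2 zoom levels.
translation, resolution = coarse_to_fine(toy_q, (0.0, 0.0, 0.0), 1.0, 16, 2)
error = math.dist(translation, target)
```

With these illustrative values, each level evaluates only 16^3 voxels, yet the final voxel side is 1/256 of the workspace side, a resolution that a single flat grid would need 256^3 voxels to match.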

updated: Tue Mar 15 2022 00:33:43 GMT+0000 (UTC)

published: Wed Jun 23 2021 16:57:16 GMT+0000 (UTC)

arXiv
