Masked World Models for Visual Control

Younggyo Seo; Danijar Hafner; Hao Liu; Fangchen Liu; Stephen James; Kimin Lee; Pieter Abbeel

Visual model-based reinforcement learning (RL) has the potential to enable sample-efficient robot learning from visual observations. Yet current approaches typically train a single model end-to-end to learn both visual representations and dynamics, which makes it difficult to accurately model the interaction between robots and small objects. In this work, we introduce a visual model-based RL framework that decouples visual representation learning from dynamics learning. Specifically, we train an autoencoder with convolutional layers and vision transformers (ViT) to reconstruct pixels given masked convolutional features, and learn a latent dynamics model that operates on the representations from the autoencoder. Moreover, to encode task-relevant information, we introduce an auxiliary reward prediction objective for the autoencoder. We continually update both the autoencoder and the dynamics model using online samples collected from environment interaction. We demonstrate that our decoupling approach achieves state-of-the-art performance on a variety of visual robotic tasks from Meta-world and RLBench. For example, we achieve an 81.7% success rate on 50 visual robotic manipulation tasks from Meta-world, while the baseline achieves 67.9%. Code is available on the project website: https://sites.google.com/view/mwm-rl.
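
To make the phrase "masked convolutional features" concrete, here is a minimal sketch of random token masking over a flattened feature map. It is an illustration only, not the authors' implementation: the function name `mask_feature_tokens`, the 75% mask ratio, the 8x8 grid, and the 4-dimensional tokens are all assumptions chosen for the example, and the actual model (convolutional stem, ViT encoder and decoder, reward head) is not reproduced here.

```python
import random


def mask_feature_tokens(tokens, mask_ratio=0.75, seed=None):
    """Randomly hide a fraction of feature tokens (hypothetical helper).

    tokens: list of feature vectors, e.g. one per spatial location of a
            convolutional feature map.
    Returns (visible, keep_idx, mask), where mask[i] is True if token i
    was hidden from the encoder.
    """
    rng = random.Random(seed)
    num_tokens = len(tokens)
    # Keep at least one token so the encoder always has some input.
    num_keep = max(1, round(num_tokens * (1.0 - mask_ratio)))
    keep_idx = sorted(rng.sample(range(num_tokens), num_keep))
    kept = set(keep_idx)
    mask = [i not in kept for i in range(num_tokens)]
    visible = [tokens[i] for i in keep_idx]
    return visible, keep_idx, mask


# Assumed toy input: an 8x8 grid of 4-dimensional features, flattened to 64 tokens.
tokens = [[float(i)] * 4 for i in range(8 * 8)]
visible, keep_idx, mask = mask_feature_tokens(tokens, mask_ratio=0.75, seed=0)
print(len(visible), sum(mask))  # 16 visible tokens, 48 masked tokens
```

In a masked autoencoder of this kind, only the visible tokens would be passed to the encoder, and the decoder would be trained to reconstruct the full image (and, per the abstract, to predict reward) from them.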

updated: Tue Nov 15 2022 05:13:34 GMT+0000 (UTC)

published: Tue Jun 28 2022 18:42:27 GMT+0000 (UTC)

arXiv
