Reinforcement Learning with Action-Free Pre-Training from Videos

Younggyo Seo; Kimin Lee; Stephen James; Pieter Abbeel

Recent unsupervised pre-training methods have been shown to be effective in the language and vision domains by learning representations that are useful for multiple downstream tasks. In this paper, we investigate whether such unsupervised pre-training methods can also be effective for vision-based reinforcement learning (RL). To this end, we introduce a framework that learns representations useful for understanding dynamics via generative pre-training on videos. Our framework consists of two phases: we first pre-train an action-free latent video prediction model, and then use the pre-trained representations to efficiently learn action-conditional world models in unseen environments. To incorporate additional action inputs during fine-tuning, we introduce a new architecture that stacks an action-conditional latent prediction model on top of the pre-trained action-free prediction model. Moreover, for better exploration, we propose a video-based intrinsic bonus that leverages the pre-trained representations. We demonstrate that our framework significantly improves both the final performance and the sample efficiency of vision-based RL on a variety of manipulation and locomotion tasks. Code is available at https://github.com/younggyoseo/apv.
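The stacking idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (see the linked repository for that); it only assumes that a lower, action-free recurrent model consumes observations, and that an upper model consumes the lower model's latent state together with the action. All class names, layer shapes, and the use of simple tanh linear layers are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)


def init_linear(in_dim, out_dim):
    """Return (weight, bias) for a small randomly initialised linear layer."""
    return rng.normal(0.0, 0.1, size=(in_dim, out_dim)), np.zeros(out_dim)


class ActionFreeModel:
    """Hypothetical stand-in for a pre-trained action-free latent predictor."""

    def __init__(self, obs_dim, latent_dim):
        self.latent_dim = latent_dim
        self.enc = init_linear(obs_dim, latent_dim)
        self.trans = init_linear(2 * latent_dim, latent_dim)

    def step(self, prev_latent, obs):
        # Encode the observation, then update the latent state without actions.
        w, b = self.enc
        feat = np.tanh(obs @ w + b)
        w, b = self.trans
        return np.tanh(np.concatenate([prev_latent, feat], axis=-1) @ w + b)


class StackedActionConditionalModel:
    """Hypothetical action-conditional model stacked on the action-free one."""

    def __init__(self, base, action_dim, latent_dim):
        self.base = base
        self.latent_dim = latent_dim
        in_dim = latent_dim + base.latent_dim + action_dim
        self.trans = init_linear(in_dim, latent_dim)

    def step(self, prev_base, prev_top, obs, action):
        # The lower layer never sees the action; only the upper layer does.
        base_latent = self.base.step(prev_base, obs)
        w, b = self.trans
        top_in = np.concatenate([prev_top, base_latent, action], axis=-1)
        top_latent = np.tanh(top_in @ w + b)
        return base_latent, top_latent


def rollout(model, observations, actions):
    """Run the stacked model over a sequence and return the upper latents."""
    base = np.zeros(model.base.latent_dim)
    top = np.zeros(model.latent_dim)
    tops = []
    for obs, act in zip(observations, actions):
        base, top = model.step(base, top, obs, act)
        tops.append(top)
    return np.stack(tops)
```

In this sketch, the weights of `ActionFreeModel` would come from pre-training on videos, while the upper layer would be learned during fine-tuning. Keeping actions out of the lower layer is what lets it be pre-trained on action-free video in the first place.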

updated: Thu Jun 16 2022 22:01:25 GMT+0000 (UTC)

published: Fri Mar 25 2022 19:44:09 GMT+0000 (UTC)

arXiv
