Rethinking Closed-loop Training for Autonomous Driving

Chris Zhang; Runsheng Guo; Wenyuan Zeng; Yuwen Xiong; Binbin Dai; Rui Hu; Mengye Ren; Raquel Urtasun

Recent advances in high-fidelity simulators have enabled closed-loop training of autonomous driving agents, potentially resolving the distribution shift between training and deployment and allowing training to be scaled both safely and cheaply. However, there is little understanding of how to build effective benchmarks for closed-loop training. In this work, we present the first empirical study that analyzes how different training benchmark designs, such as how traffic scenarios are designed and how training environments are scaled, affect the success of learning agents. Furthermore, we show that many popular RL algorithms cannot achieve satisfactory performance in the context of autonomous driving, as they lack long-term planning and take an extremely long time to train. To address these issues, we propose trajectory value learning (TRAVL), an RL-based driving agent that plans with multistep look-ahead and exploits cheaply generated imagined data for efficient learning. Our experiments show that TRAVL learns much faster and produces safer maneuvers than all the baselines. For more information, visit the project website: https://waabi.ai/research/travl

updated: Tue Jun 27 2023 17:58:39 GMT+0000 (UTC)

published: Tue Jun 27 2023 17:58:39 GMT+0000 (UTC)

arXiv

