Learning Efficient Multi-Agent Cooperative Visual Exploration

Chao Yu; Xinyi Yang; Jiaxuan Gao; Huazhong Yang; Yu Wang; Yi Wu

We tackle the problem of cooperative visual exploration, where multiple agents need to jointly explore unseen regions as fast as possible based on visual signals. Classical planning-based methods often suffer from expensive computation overhead at each step and limited expressiveness for complex cooperation strategies. By contrast, reinforcement learning (RL) has recently become a popular paradigm for tackling this challenge because it can model arbitrarily complex strategies with minimal inference overhead. In this paper, we extend the state-of-the-art single-agent visual navigation method, Active Neural SLAM (ANS), to the multi-agent setting by introducing a novel RL-based planning module, Multi-agent Spatial Planner (MSP). MSP leverages a transformer-based architecture, Spatial-TeamFormer, which effectively captures spatial relations and intra-agent interactions via hierarchical spatial self-attentions. In addition, we implement a few multi-agent enhancements to process local information from each agent for an aligned spatial representation and more precise planning. Finally, we perform policy distillation to extract a meta policy that significantly improves the generalization capability of the final policy. We call this overall solution Multi-Agent Active Neural SLAM (MAANS). MAANS substantially outperforms classical planning-based baselines for the first time in a photo-realistic 3D simulator, Habitat. Code and videos can be found at https://sites.google.com/view/maans.
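The abstract does not spell out the layers of Spatial-TeamFormer, so the following is only an illustrative sketch of what "hierarchical spatial self-attention" could look like, not the authors' implementation. It assumes a two-stage decomposition: attention over map cells within each agent, followed by attention across agents at each cell. The function names (`self_attention`, `hierarchical_attention`), the `features[agent][cell]` layout, and the use of identity query/key/value projections are all hypothetical simplifications.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention over a list of vectors.

    Assumption: queries, keys and values are the tokens themselves
    (identity projections); a real model would learn these projections.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return out


def hierarchical_attention(features):
    """features[agent][cell] is a feature vector for one map cell of one agent.

    Stage 1 (assumed): each agent attends over its own map cells.
    Stage 2 (assumed): at each cell, attention mixes information across agents.
    """
    spatial = [self_attention(agent_cells) for agent_cells in features]
    n_agents, n_cells = len(spatial), len(spatial[0])
    team = [[None] * n_cells for _ in range(n_agents)]
    for c in range(n_cells):
        mixed = self_attention([spatial[a][c] for a in range(n_agents)])
        for a in range(n_agents):
            team[a][c] = mixed[a]
    return team


# Toy input: 2 agents, 3 map cells each, 2-dimensional features.
features = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.5, 0.5], [0.0, 0.0], [1.0, 0.0]],
]
fused = hierarchical_attention(features)
```

The output keeps the same agent-by-cell layout as the input, so in this sketch a downstream planner could read one fused feature vector per agent and map cell.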

updated: Tue Mar 22 2022 15:09:52 GMT+0000 (UTC)

published: Tue Oct 12 2021 04:48:10 GMT+0000 (UTC)

arXiv
