End-to-End Partially Observable Visual Navigation in a Diverse Environment

Bo Ai; Wei Gao; Vinay; David Hsu

How can a robot navigate successfully in rich and diverse environments: indoors or outdoors, along an office corridor or a park trail, on flat ground, on a staircase, or in an elevator? This work addresses three challenges: (i) complex visual observations, (ii) partial observability of local sensing, and (iii) multimodal navigation behaviors that depend on both the local environment and the high-level goal. We propose a novel neural network (NN) architecture to represent a local controller and leverage the flexibility of the end-to-end approach to learn a powerful policy. To tackle complex visual observations, we extract multiscale spatial information through convolution layers. To deal with partial observability, we encode rich history information in LSTM-like modules. Importantly, we integrate the two into a single unified architecture that uses convolutional memory cells to track the observation history at multiple spatial scales, which lets it capture the complex spatiotemporal dependencies between observations and controls. We additionally condition the network on the high-level goal in order to generate different navigation behavior modes. Specifically, we propose to use independent memory cells for different modes to prevent mode collapse in the learned policy. We implemented the NN controller on the SPOT robot and evaluated it on three challenging tasks with partial observations: adversarial pedestrian avoidance, blind-spot obstacle avoidance, and elevator riding. Our model significantly outperforms CNNs, conventional LSTMs, and the ablated versions of our model. A demo video will be publicly available, showing our SPOT robot traversing many different locations on our university campus.
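The abstract does not give layer-level details, so the following is only a minimal sketch of the two ideas it names: a convolutional memory cell (an LSTM whose gates are computed by convolutions, so the memory keeps its spatial layout) and independent memory per behavior mode. The class name `ModeConvLSTM`, the gate layout, the choice to share weights across modes, and all sizes are assumptions made for illustration, not the authors' implementation. A multiscale variant could, under the same assumptions, run one such cell per feature-map resolution.

```python
import numpy as np


def conv2d_same(x, w):
    """Naive zero-padded 'same' 2D cross-correlation.

    x: (C_in, H, W) input, w: (C_out, C_in, k, k) kernel with odd k.
    """
    c_out, _, k, _ = w.shape
    pad = k // 2
    _, height, width = x.shape
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((c_out, height, width))
    for i in range(k):
        for j in range(k):
            # Mix input channels for kernel offset (i, j) at every pixel.
            patch = xp[:, i:i + height, j:j + width]
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], patch)
    return out


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class ModeConvLSTM:
    """Hypothetical ConvLSTM cell with one independent (h, c) memory per mode."""

    def __init__(self, in_ch, hid_ch, n_modes, k=3, seed=0):
        rng = np.random.default_rng(seed)
        # Assumption: a single kernel produces all four gates from [x, h],
        # and the weights are shared across modes; only the memory is separate.
        self.w = 0.1 * rng.standard_normal((4 * hid_ch, in_ch + hid_ch, k, k))
        self.b = np.zeros((4 * hid_ch, 1, 1))
        self.hid_ch = hid_ch
        self.n_modes = n_modes
        self.state = None

    def reset(self, height, width):
        shape = (self.n_modes, self.hid_ch, height, width)
        self.state = (np.zeros(shape), np.zeros(shape))  # (h_all, c_all)

    def step(self, x, mode):
        """Update only the memory that belongs to `mode`; return its new h."""
        h_all, c_all = self.state
        z = conv2d_same(np.concatenate([x, h_all[mode]], axis=0), self.w) + self.b
        i, f, o, g = np.split(z, 4, axis=0)
        c = sigmoid(f) * c_all[mode] + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        h_all[mode], c_all[mode] = h, c
        return h


# Usage: two modes (e.g. hypothetical "go straight" and "turn left" goals).
cell = ModeConvLSTM(in_ch=3, hid_ch=4, n_modes=2)
cell.reset(height=8, width=8)
frame = np.ones((3, 8, 8))
h = cell.step(frame, mode=0)  # mode 1's memory is left untouched
```

Keeping a separate `(h, c)` pair per mode means that history accumulated while following one goal cannot overwrite the history kept for another, which is one plausible reading of how independent memory cells could help against mode collapse.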

updated: Thu Sep 16 2021 06:53:57 GMT+0000 (UTC)

published: Thu Sep 16 2021 06:53:57 GMT+0000 (UTC)

arXiv
