Zero-shot object goal visual navigation

Qianfan Zhao; Lu Zhang; Bin He; Hong Qiao; Zhiyong Liu

Object goal visual navigation is a challenging task in which a robot must find a target object based only on its visual observations, with the target limited to the classes specified in the training stage. In real households, however, a robot may need to handle numerous object classes, and it is hard to include all of them in the training stage. To address this challenge, we propose a zero-shot object navigation task that combines zero-shot learning with object goal visual navigation and aims to guide robots to find objects belonging to novel classes without any training samples. This task requires the learned policy to generalize to novel classes, an issue that has received little attention in object navigation using deep reinforcement learning. To address this issue, we use "class-unrelated" data as input to alleviate overfitting to the classes specified in the training stage. The class-unrelated input consists of detection results and the cosine similarity of word embeddings, and does not contain any class-related visual features or knowledge graphs. Extensive experiments on the AI2-THOR platform show that our model outperforms the baseline models on both seen and unseen classes, which indicates that our model is less class-sensitive and generalizes better. Our code is available at https://github.com/pioneer-innovation/Zero-Shot-Object-Navigation
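As a rough illustration of what a "class-unrelated" input might look like, the sketch below pairs each detection with the cosine similarity between its class word embedding and the target's word embedding. This is not the authors' implementation: the function names, the row layout (box coordinates, confidence, similarity), and the toy two-dimensional embeddings are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def class_unrelated_input(detections, embeddings, target):
    """Build one feature row per detection (hypothetical layout).

    detections: list of (class_name, (x1, y1, x2, y2), confidence)
    embeddings: dict mapping class name -> word embedding vector
    target:     name of the target class

    The class name is used only to look up the similarity score;
    it does not appear in the returned rows.
    """
    target_vec = embeddings[target]
    rows = []
    for name, box, confidence in detections:
        similarity = cosine_similarity(embeddings[name], target_vec)
        rows.append([*box, confidence, similarity])
    return rows


# Toy example with made-up embeddings (assumption, not real word vectors).
toy_embeddings = {"mug": [1.0, 0.0], "cup": [1.0, 0.0], "sofa": [0.0, 1.0]}
toy_detections = [
    ("cup", (0.1, 0.2, 0.3, 0.4), 0.9),
    ("sofa", (0.5, 0.5, 0.9, 0.9), 0.8),
]
features = class_unrelated_input(toy_detections, toy_embeddings, "mug")
```

Because each row carries only geometry, confidence, and a similarity score, a policy consuming it has no direct access to the identity of the detected class, which is the property the abstract attributes to the class-unrelated input.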

updated: Wed Jun 15 2022 09:53:43 GMT+0000 (UTC)

published: Wed Jun 15 2022 09:53:43 GMT+0000 (UTC)

arXiv
