Learning of feature points without additional supervision improves reinforcement learning from images

Rinu Boney; Alexander Ilin; Juho Kannala

In many vision-based control problems, optimal controls can be inferred from the locations of the objects in the scene. This information can be represented using feature points: a list of spatial locations in learned feature maps of an input image. Previous work shows that feature points learned using unsupervised pre-training or human supervision can provide good features for control tasks. In this paper, we show that it is possible to learn efficient feature point representations end-to-end, without the need for unsupervised pre-training, decoders, or additional losses. Our proposed architecture consists of a differentiable feature point extractor that feeds the coordinates of the estimated feature points directly to a soft actor-critic agent. The proposed algorithm yields performance competitive with the state of the art on DeepMind Control Suite tasks.
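To make the idea of a differentiable feature point extractor concrete, the sketch below shows one common way to turn feature maps into coordinates: a spatial soft-argmax, which takes a softmax over each map and returns the expected (x, y) position. This is an illustrative assumption, not necessarily the extractor used in the paper; the function name `spatial_soft_argmax`, the `temperature` parameter, and the [-1, 1] coordinate range are hypothetical choices made for this example. It is written in NumPy for readability, so it shows only the forward computation.

```python
import numpy as np


def spatial_soft_argmax(feature_maps, temperature=1.0):
    """Map K feature maps of shape (K, H, W) to K feature points.

    Returns an array of shape (K, 2) holding (x, y) coordinates in [-1, 1].
    Illustrative sketch only; names and conventions are assumptions.
    """
    k, h, w = feature_maps.shape

    # Softmax over all spatial positions of each map (numerically stabilized).
    logits = feature_maps.reshape(k, h * w) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    probs = probs.reshape(k, h, w)

    # Coordinate grids: x runs along the width, y along the height.
    xs = np.linspace(-1.0, 1.0, w)
    ys = np.linspace(-1.0, 1.0, h)

    # Expected coordinate under each map's spatial distribution.
    x = (probs.sum(axis=1) * xs).sum(axis=1)  # marginal over rows, shape (K, W)
    y = (probs.sum(axis=2) * ys).sum(axis=1)  # marginal over columns, shape (K, H)
    return np.stack([x, y], axis=1)


# Example: one map with a sharp peak in the top-right corner, one flat map.
maps = np.zeros((2, 5, 5))
maps[0, 0, 4] = 50.0
points = spatial_soft_argmax(maps)
```

Because every step is a smooth operation, the same computation written in an automatic differentiation framework would let gradients from the actor and critic losses flow back into the convolutional layers that produce the feature maps. The flattened coordinate array would then serve as the low-dimensional state input to the agent.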

updated: Wed Dec 01 2021 16:16:12 GMT+0000 (UTC)

published: Tue Jun 15 2021 09:17:06 GMT+0000 (UTC)

arXiv
