Refactoring Policy for Compositional Generalizability using Self-Supervised Object Proposals

Tongzhou Mu; Jiayuan Gu; Zhiwei Jia; Hao Tang; Hao Su

We study how to learn a policy with compositional generalizability. We propose a two-stage framework that refactors a high-reward teacher policy into a generalizable student policy with a strong inductive bias. In particular, we implement an object-centric, GNN-based student policy whose input objects are learned from images through self-supervised learning. Empirically, we evaluate our approach on four difficult tasks that require compositional generalizability and achieve superior performance compared to baselines.
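
The second stage described above, fitting an object-centric student to a teacher's actions, might look roughly like the following. This is only an illustrative sketch and not the authors' implementation: the function names (`init_params`, `student_logits`, `imitation_loss`), the single round of mean-aggregated message passing, the discrete action space, and the cross-entropy imitation loss are all assumptions made for the example. The object features are assumed to come from some upstream self-supervised proposal model, which is not shown.

```python
import numpy as np


def init_params(feat_dim, hidden_dim, n_actions, seed=0):
    """Randomly initialise the (hypothetical) student's weights."""
    rng = np.random.default_rng(seed)
    return {
        "W_self": rng.normal(0.0, 0.1, (feat_dim, hidden_dim)),
        "W_msg": rng.normal(0.0, 0.1, (feat_dim, hidden_dim)),
        "W_out": rng.normal(0.0, 0.1, (hidden_dim, n_actions)),
    }


def student_logits(params, objects):
    """Action logits from a set of object features, shape (n_objects, feat_dim).

    One round of message passing on a fully connected object graph,
    followed by a permutation-invariant mean readout.
    """
    n = objects.shape[0]
    total = objects.sum(axis=0, keepdims=True)
    # Mean feature of all *other* objects, computed for every object at once.
    others_mean = (total - objects) / max(n - 1, 1)
    hidden = np.tanh(objects @ params["W_self"] + others_mean @ params["W_msg"])
    pooled = hidden.mean(axis=0)
    return pooled @ params["W_out"]


def imitation_loss(params, objects, teacher_action):
    """Cross-entropy between the student's action distribution and the
    (assumed discrete) action chosen by the teacher in the same state."""
    logits = student_logits(params, objects)
    logits = logits - logits.max()  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[teacher_action]
```

Because the weights are shared across objects and the readout is a mean, the same parameters accept scenes with any number of objects, which is one plausible source of the inductive bias the abstract refers to. Minimising `imitation_loss` over states labelled by the teacher would be the refactoring step in this sketch.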

updated: Mon Oct 26 2020 17:46:08 GMT+0000 (UTC)

published: Mon Oct 26 2020 17:46:08 GMT+0000 (UTC)

arXiv
