Hybrid system identification using switching density networks

Michael Burke; Yordan Hristov; Subramanian Ramamoorthy

Behaviour cloning is a commonly used strategy for imitation learning and can be extremely effective in constrained domains. However, when the dynamics of an environment are state-dependent and varying, behaviour cloning places a burden on model capacity and on the number of demonstrations required. This paper introduces switching density networks, which rely on a categorical reparametrisation for hybrid system identification. The result is a network comprising a classification layer followed by a regression layer. We use switching density networks to predict the parameters of hybrid control laws, which a switching layer toggles to produce different controller outputs conditioned on the input state. This work shows how switching density networks can be used for hybrid system identification in a variety of tasks, successfully identifying the key joint angle goals that make up manipulation tasks, while simultaneously learning image-based goal classifiers and regression networks that predict joint angles from images. We also show that they can cluster the phase space of an inverted pendulum, identifying the balance, spin and pump controllers required to solve this task. Switching density networks can be difficult to train, but we introduce a cross-entropy regularisation loss that stabilises training.
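The classification-then-regression structure described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a scalar state, a linear switching layer, fixed proportional controllers of the form u = k (g - x), and a Gumbel-softmax relaxation as one common choice of categorical reparametrisation. All names (`switching_controller`, `mode_weights`, `gains`, `goals`) are hypothetical, and the cross-entropy regularisation loss is not shown because the abstract does not give its form.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature parameter."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def gumbel_softmax(logits, temperature, rng):
    """Relaxed categorical sample (assumed reparametrisation, not confirmed by the source)."""
    noisy = []
    for l in logits:
        u = max(rng.random(), 1e-12)  # guard against log(0)
        noisy.append(l - math.log(-math.log(u)))  # add Gumbel(0, 1) noise
    return softmax(noisy, temperature)


def switching_controller(state, mode_weights, mode_biases, gains, goals,
                         temperature=0.5, rng=None):
    """Blend per-mode proportional controllers using switching weights.

    Hypothetical setup: the switching layer is linear in a scalar state,
    and each mode j applies u_j = gains[j] * (goals[j] - state).
    """
    # Classification layer: one logit per controller mode.
    logits = [w * state + b for w, b in zip(mode_weights, mode_biases)]
    if rng is None:
        weights = softmax(logits, temperature)  # deterministic relaxation
    else:
        weights = gumbel_softmax(logits, temperature, rng)  # stochastic sample
    # Regression layer: per-mode control outputs.
    controls = [k * (g - state) for k, g in zip(gains, goals)]
    # Switching: mode weights select (softly) among the controllers.
    output = sum(w * u for w, u in zip(weights, controls))
    return output, weights
```

With a low temperature the weights approach a one-hot vector, so the output is dominated by a single controller. In a trained network the gains and goals would be predicted by the regression layer rather than fixed by hand.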

updated: Wed Sep 18 2019 10:04:09 GMT+0000 (UTC)

published: Tue Jul 09 2019 18:31:51 GMT+0000 (UTC)

arXiv
