Semantically Controllable Generation of Physical Scenes with Explicit Knowledge

Wenhao Ding; Bo Li; Kim Ji Eun; Ding Zhao

Deep Generative Models (DGMs) are known for their superior capability in generating realistic data. Extending purely data-driven approaches, recent specialized DGMs may satisfy additional controllability requirements, such as embedding a traffic sign in a driving scene, by implicitly manipulating patterns at the neuron or feature level. In this paper, we introduce a novel method that explicitly incorporates domain knowledge into the generation process to achieve semantically controllable scene generation. To be consistent with the composition of natural scenes, we categorize knowledge into two types: the first represents the properties of objects and the second represents the relationships among objects. We then propose a tree-structured generative model to learn complex scene representations, whose nodes and edges naturally correspond to the two types of knowledge, respectively. By imposing semantic rules on the properties of nodes and edges in the tree structure, knowledge can be explicitly integrated to enable semantically controllable scene generation. We construct a synthetic example to illustrate the controllability and explainability of our method in a clean setting. We further extend the synthetic example to realistic autonomous driving environments and conduct extensive experiments showing that our method efficiently identifies adversarial traffic scenes, which satisfy the traffic rules specified as explicit knowledge, against different state-of-the-art 3D point cloud segmentation models.
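To make the idea of semantic rules on nodes and edges concrete, the following is a minimal Python sketch of a scene tree in which node properties describe objects and edge properties describe parent-child relationships. It is only an illustration under stated assumptions and is not the authors' model: the names Node, Edge, and violations, the property keys speed and lateral_offset, and the numeric thresholds are all hypothetical, and the sketch only checks rules on a fixed tree rather than learning or generating one.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    kind: str                    # object type, e.g. "lane" or "car" (hypothetical labels)
    props: Dict[str, float]      # first knowledge type: properties of the object
    children: List["Edge"] = field(default_factory=list)


@dataclass
class Edge:
    props: Dict[str, float]      # second knowledge type: relationship parent -> child
    child: Node


NodeRule = Callable[[Node], bool]
EdgeRule = Callable[[Node, Edge], bool]


def violations(root: Node,
               node_rules: Dict[str, NodeRule],
               edge_rules: Dict[str, EdgeRule]) -> List[str]:
    """Walk the tree and list every rule that a node or an edge fails."""
    found: List[str] = []
    stack = [root]
    while stack:
        node = stack.pop()
        for name, rule in node_rules.items():
            if not rule(node):
                found.append(f"{name}: {node.kind}")
        for edge in node.children:
            for name, rule in edge_rules.items():
                if not rule(node, edge):
                    found.append(f"{name}: {node.kind}->{edge.child.kind}")
            stack.append(edge.child)
    return found


# Hypothetical rules: a car must not exceed 15 m/s, and anything attached
# to a lane must stay within 1.5 m of the lane center.
node_rules = {
    "speed_limit": lambda n: n.kind != "car" or n.props["speed"] <= 15.0,
}
edge_rules = {
    "on_lane": lambda p, e: p.kind != "lane" or abs(e.props["lateral_offset"]) <= 1.5,
}

car_ok = Node("car", {"speed": 10.0})
car_bad = Node("car", {"speed": 20.0})
lane = Node("lane", {}, [
    Edge({"lateral_offset": 0.5}, car_ok),
    Edge({"lateral_offset": 3.0}, car_bad),
])
scene = Node("scene", {}, [Edge({}, lane)])

result = violations(scene, node_rules, edge_rules)
```

In this toy scene, the second car breaks both the node rule (its speed) and the edge rule (its offset from the lane), while the first car passes both. A generator constrained by explicit knowledge would need to steer its samples toward trees for which such a check returns an empty list; how the paper achieves that is not specified in the abstract.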

updated: Thu Oct 28 2021 03:13:36 GMT+0000 (UTC)

published: Tue Jun 08 2021 02:51:33 GMT+0000 (UTC)

arXiv
