Semantically Controllable Scene Generation with Guidance of Explicit Knowledge

Wenhao Ding; Bo Li; Kim Ji Eun; Ding Zhao

Deep Generative Models (DGMs) are known for their superior capability in generating realistic data. Extending purely data-driven approaches, recent specialized DGMs can satisfy additional controllable requirements, such as embedding a traffic sign in a driving scene, by implicitly manipulating patterns at the neuron or feature level. In this paper, we introduce a novel method that explicitly incorporates domain knowledge into the generation process to achieve semantically controllable scene generation. To be consistent with the composition of natural scenes, we categorize knowledge into two types: the first represents the properties of objects, and the second represents the relationships among objects. We then propose a tree-structured generative model that learns complex scene representations, whose nodes and edges naturally correspond to the two types of knowledge, respectively. By imposing semantic rules on the properties of nodes and edges in the tree structure, knowledge can be explicitly integrated to enable semantically controllable scene generation. We construct a synthetic example to illustrate the controllability and explainability of our method in a clean setting. We further extend the synthetic example to realistic autonomous vehicle driving environments and conduct extensive experiments showing that our method efficiently identifies traffic scenes that are adversarial to different state-of-the-art 3D point cloud segmentation models while satisfying the traffic rules specified as explicit knowledge.
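
To make the node/edge distinction concrete, the following is a minimal sketch of a scene tree in which node properties carry the first type of knowledge and edge relations carry the second, with semantic rules checked over both. It is not the authors' generative model or code: the `SceneNode` class, the `violations` function, the relation names, and the two rules (including the 30.0 speed threshold) are all hypothetical names and values chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class SceneNode:
    """Hypothetical scene object; `props` holds object properties (first knowledge type)."""
    name: str
    props: Dict[str, object]
    # Each child is stored with the relation on its incoming edge (second knowledge type).
    children: List[Tuple[str, "SceneNode"]] = field(default_factory=list)

    def add(self, relation: str, child: "SceneNode") -> "SceneNode":
        self.children.append((relation, child))
        return child


NodeRule = Tuple[str, Callable[[SceneNode], bool]]
EdgeRule = Tuple[str, Callable[[SceneNode, str, SceneNode], bool]]


def violations(root: SceneNode, node_rules: List[NodeRule],
               edge_rules: List[EdgeRule]) -> List[Tuple[str, str]]:
    """Walk the tree and collect (rule label, location) for every rule that fails."""
    found = []
    stack = [root]
    while stack:
        node = stack.pop()
        for label, rule in node_rules:
            if not rule(node):
                found.append((label, node.name))
        for relation, child in node.children:
            for label, rule in edge_rules:
                if not rule(node, relation, child):
                    found.append((label, f"{node.name}->{child.name}"))
            stack.append(child)
    return found


# Assumed example rules: vehicles stay under a speed limit and sit "on" a lane.
node_rules = [
    ("speed_limit",
     lambda n: n.props.get("type") != "vehicle" or n.props["speed"] <= 30.0),
]
edge_rules = [
    ("vehicle_on_lane",
     lambda parent, rel, child: child.props.get("type") != "vehicle"
     or (rel == "on" and parent.props.get("type") == "lane")),
]

# A toy scene: one vehicle obeys both rules, the other breaks both.
scene = SceneNode("road", {"type": "road"})
lane = scene.add("contains", SceneNode("lane_0", {"type": "lane"}))
lane.add("on", SceneNode("car_0", {"type": "vehicle", "speed": 12.0}))
scene.add("on", SceneNode("car_1", {"type": "vehicle", "speed": 45.0}))

for label, where in violations(scene, node_rules, edge_rules):
    print(label, where)
```

In this sketch the rules only detect violations after the fact; a generator of the kind described in the abstract would presumably use such rules to steer generation itself, which is beyond what this example attempts.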

updated: Tue Jun 08 2021 02:51:33 GMT+0000 (UTC)

published: Tue Jun 08 2021 02:51:33 GMT+0000 (UTC)

arXiv
