Graphonomy: Universal Image Parsing via Graph Reasoning and Transfer

Liang Lin; Yiming Gao; Ke Gong; Meng Wang; Xiaodan Liang

Prior highly-tuned image parsing models are usually studied in a certain domain with a specific set of semantic labels and can hardly be adapted to other scenarios (e.g., those with discrepant label granularity) without extensive re-training. Learning a single universal parsing model by unifying label annotations from different domains or at various levels of granularity is a crucial but rarely addressed topic. This poses many fundamental learning challenges, e.g., discovering underlying semantic structures across different label granularities or mining label correlations across relevant tasks. To address these challenges, we propose a graph reasoning and transfer learning framework, named "Graphonomy", which incorporates human knowledge and label taxonomy into intermediate graph representation learning beyond local convolutions. In particular, Graphonomy learns the global and structured semantic coherency in multiple domains via semantic-aware graph reasoning and transfer, enforcing the mutual benefits of parsing across domains (e.g., different datasets or co-related tasks). Graphonomy includes two iterated modules: Intra-Graph Reasoning and Inter-Graph Transfer. The former extracts the semantic graph in each domain and improves feature representation learning by propagating information over the graph; the latter exploits the dependencies among the graphs from different domains for bidirectional knowledge transfer. We apply Graphonomy to two relevant but different image understanding research topics, human parsing and panoptic segmentation, and show that Graphonomy can handle both well with a standard pipeline when compared against current state-of-the-art approaches. Moreover, we demonstrate some extra benefits of our framework, e.g., generating human parsing at various levels of granularity by unifying annotations across different datasets.
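
The two modules named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no equations, so the code assumes a generic graph-convolution-style propagation step for Intra-Graph Reasoning and a fixed linear mapping between label sets for Inter-Graph Transfer. All function names, the row normalization, the three-node "fine" label graph, the two-node "coarse" label graph, and the transfer weights are hypothetical and chosen only for illustration.

```python
# Toy sketch of graph propagation and cross-graph transfer (pure Python).
# Assumption: node features are small lists of floats; nothing here comes
# from the paper beyond the two module names.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def row_normalize(adj):
    """Add self-loops, then scale each row so it sums to 1."""
    n = len(adj)
    with_loops = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                  for i in range(n)]
    return [[v / sum(row) for v in row] for row in with_loops]

def relu(m):
    return [[max(0.0, v) for v in row] for row in m]

def intra_graph_step(node_feats, adj, weight):
    """One propagation step inside a single label graph (assumed form)."""
    return relu(matmul(matmul(row_normalize(adj), node_feats), weight))

def inter_graph_transfer(target_feats, source_feats, transfer):
    """Add source-graph features, mapped by `transfer`, to the target graph."""
    moved = matmul(transfer, source_feats)
    return [[t + m for t, m in zip(t_row, m_row)]
            for t_row, m_row in zip(target_feats, moved)]

# Hypothetical fine-grained graph: head - torso - legs (a chain).
fine_adj = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
fine_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for a learned weight matrix

fine_out = intra_graph_step(fine_feats, fine_adj, identity)

# Hypothetical coarse graph: upper body (head + torso) and lower body (legs).
coarse_feats = [[0.0, 0.0], [0.0, 0.0]]
fine_to_coarse = [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]

coarse_out = inter_graph_transfer(coarse_feats, fine_out, fine_to_coarse)
```

In this sketch the transfer runs in one direction only (fine to coarse); a bidirectional variant, as the abstract describes, would apply a second mapping from the coarse graph back to the fine one.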

updated: Tue Jan 26 2021 08:19:03 GMT+0000 (UTC)

published: Tue Jan 26 2021 08:19:03 GMT+0000 (UTC)

arXiv
