Layout-Graph Reasoning for Fashion Landmark Detection

Weijiang Yu; Xiaodan Liang; Ke Gong; Chenhan Jiang; Nong Xiao; Liang Lin

Detecting dense landmarks for diverse clothes, as a fundamental technique for clothes analysis, has attracted increasing research attention due to its huge application potential. However, because they do not model the underlying semantic layout constraints among landmarks, prior works often detect ambiguous and structurally inconsistent landmarks when multiple garments overlap on one person. In this paper, we propose to seamlessly enforce structural layout relationships among landmarks on the intermediate representations via multiple stacked layout-graph reasoning layers. We define the layout-graph as a hierarchical structure including a root node, body-part nodes (e.g., upper body, lower body), coarse clothes-part nodes (e.g., collar, sleeve), and leaf landmark nodes (e.g., left-collar, right-collar). Each Layout-Graph Reasoning (LGR) layer maps feature representations into structural graph nodes via a Map-to-Node module, performs reasoning over the structural graph nodes to achieve global layout coherency via a layout-graph reasoning module, and then maps the graph nodes back to enhance the feature representations via a Node-to-Map module. The layout-graph reasoning module first applies a graph clustering operation to generate representations of the intermediate nodes (bottom-up inference) and then a graph deconvolution operation over the whole graph (top-down inference). Extensive experiments on two public fashion landmark datasets demonstrate the superiority of our model. Furthermore, to advance fine-grained fashion landmark research in support of more comprehensive clothes generation and attribute recognition, we contribute the first Fine-grained Fashion Landmark Dataset (FFLD), containing 200k images annotated with at most 32 key-points for 13 clothes types.
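The data flow of one LGR layer can be illustrated with a minimal NumPy sketch. It is not the paper's implementation. It assumes a toy hierarchy of 8 leaf landmarks, 4 clothes parts, 2 body parts and 1 root. Mean pooling over fixed one-hot parent assignments stands in for the graph clustering operation, and a simple parent-to-child broadcast stands in for graph deconvolution. All names (`PARENTS`, `one_hot_assignment`, `lgr_layer`, the `w_*` weights) and all sizes are hypothetical, and the weights are random rather than learned.

```python
import numpy as np

# Hypothetical toy hierarchy: 8 leaf landmarks -> 4 clothes parts -> 2 body parts -> 1 root.
# Each list gives the parent index of every node at the level below.
PARENTS = [
    [0, 0, 1, 1, 2, 2, 3, 3],  # leaf landmark -> clothes part
    [0, 0, 1, 1],              # clothes part -> body part
    [0, 0],                    # body part -> root
]

def one_hot_assignment(parents):
    """Build a (n_child, n_parent) 0/1 matrix from a parent-index list."""
    a = np.zeros((len(parents), max(parents) + 1))
    a[np.arange(len(parents)), parents] = 1.0
    return a

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def lgr_layer(feat, p, assign):
    """One simplified reasoning layer. feat: (H*W, C) flattened feature map."""
    # Map-to-Node: each leaf node is a weighted sum of projected pixel features.
    vote = softmax(feat @ p["w_vote"], axis=0)   # (HW, n_leaf), sums to 1 over pixels
    levels = [vote.T @ (feat @ p["w_in"])]       # (n_leaf, D)
    # Bottom-up: mean-pool children into their parent (assumed stand-in for graph clustering).
    for a in assign:
        levels.append((a.T @ levels[-1]) / a.sum(axis=0)[:, None])
    # Top-down: push each parent's state back to its children
    # (assumed stand-in for graph deconvolution).
    node = levels[-1]
    for a, child in zip(reversed(assign), reversed(levels[:-1])):
        node = np.tanh(child + a @ node)
    # Node-to-Map: each pixel attends over leaf nodes; the residual add keeps the input features.
    attn = softmax(feat @ p["w_back"], axis=1)   # (HW, n_leaf), sums to 1 over nodes
    return feat + attn @ (node @ p["w_out"])

rng = np.random.default_rng(0)
H, W, C, D = 4, 4, 16, 8  # illustrative sizes only
assign = [one_hot_assignment(ps) for ps in PARENTS]
n_leaf = assign[0].shape[0]
params = {
    "w_vote": rng.normal(size=(C, n_leaf)) * 0.1,
    "w_in": rng.normal(size=(C, D)) * 0.1,
    "w_back": rng.normal(size=(C, n_leaf)) * 0.1,
    "w_out": rng.normal(size=(D, C)) * 0.1,
}
feat = rng.normal(size=(H * W, C))
out = feat
for _ in range(2):  # stacked layers; weights are shared here only for brevity
    out = lgr_layer(out, params, assign)
print(out.shape)
```

Because the layer returns a feature map of the same shape as its input, several such layers can be stacked between backbone stages, which is how the sketch reads the phrase "multiple stacked layout-graph reasoning layers".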

updated: Fri Oct 04 2019 12:59:16 GMT+0000 (UTC)

published: Fri Oct 04 2019 12:59:16 GMT+0000 (UTC)

arXiv
