Graph Jigsaw Learning for Cartoon Face Recognition

Yong Li; Lingjie Lao; Zhen Cui; Shiguang Shan; Jian Yang

Cartoon face recognition is challenging because cartoon faces typically have smooth color regions and emphasized edges, so the key to recognizing them is to precisely perceive their sparse and critical shape patterns. However, it is quite difficult to learn a shape-oriented representation for cartoon face recognition with convolutional neural networks (CNNs). To mitigate this issue, we propose GraphJigsaw, which constructs jigsaw puzzles at various stages in the classification network and solves the puzzles with a graph convolutional network (GCN) in a progressive manner. Because texture information is quite limited, solving the puzzles requires the model to spot the shape patterns of the cartoon faces. The key idea of GraphJigsaw is to construct a jigsaw puzzle by randomly shuffling the intermediate convolutional feature maps in the spatial dimension and to exploit the GCN to reason about and recover the correct layout of the jigsaw fragments in a self-supervised manner. GraphJigsaw thereby avoids training the classification model with deconstructed images, which would introduce noisy patterns and harm the final classification. In particular, GraphJigsaw can be incorporated at various stages in a top-down manner within the classification model, which facilitates propagating the learned shape patterns gradually. GraphJigsaw does not rely on any extra manual annotation during training and adds no extra computational burden at inference time. Both quantitative and qualitative experimental results verify the feasibility of the proposed GraphJigsaw, which consistently outperforms other face recognition and jigsaw-based methods on two popular cartoon face datasets with considerable improvements.
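The feature-level jigsaw idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the grid size, the 4-neighbour patch graph, the mean-pooled node features, the single graph convolution, and all function names (`shuffle_feature_map`, `grid_adjacency`, `gcn_layer`, `jigsaw_loss`) are assumptions chosen for illustration. The sketch shuffles the patches of one intermediate feature map, treats each patch as a graph node, and scores how well a GCN-style layer predicts each patch's original position.

```python
import numpy as np

def shuffle_feature_map(feat, grid, rng):
    """Split a (C, H, W) feature map into grid x grid patches and shuffle them.

    Returns the shuffled map and perm, where perm[i] is the original
    (row-major) index of the patch now placed at position i.
    """
    c, h, w = feat.shape
    assert h % grid == 0 and w % grid == 0, "H and W must be divisible by grid"
    ph, pw = h // grid, w // grid
    # (C, grid, ph, grid, pw) -> (grid*grid, C, ph, pw), row-major patch order
    patches = feat.reshape(c, grid, ph, grid, pw).transpose(1, 3, 0, 2, 4)
    patches = patches.reshape(grid * grid, c, ph, pw)
    perm = rng.permutation(grid * grid)
    shuffled = patches[perm]
    # Reassemble the shuffled patches into a (C, H, W) map
    out = shuffled.reshape(grid, grid, c, ph, pw).transpose(2, 0, 3, 1, 4)
    return out.reshape(c, h, w), perm

def grid_adjacency(grid):
    """4-neighbour adjacency between patch positions (an assumed graph)."""
    n = grid * grid
    adj = np.zeros((n, n))
    for i in range(n):
        r, c = divmod(i, grid)
        if c + 1 < grid:
            adj[i, i + 1] = adj[i + 1, i] = 1.0
        if r + 1 < grid:
            adj[i, i + grid] = adj[i + grid, i] = 1.0
    return adj

def gcn_layer(x, adj, weight):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ x @ weight, 0.0)

def jigsaw_loss(logits, perm):
    """Mean cross-entropy between per-patch logits and original positions."""
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(perm)), perm].mean()

# Toy run: random values stand in for a CNN's intermediate feature map.
rng = np.random.default_rng(0)
channels, size, grid = 8, 6, 3
feat = rng.standard_normal((channels, size, size))
shuffled, perm = shuffle_feature_map(feat, grid, rng)

# One node per patch: mean-pool each patch to a C-dimensional vector.
ph = size // grid
nodes = shuffled.reshape(channels, grid, ph, grid, ph).mean(axis=(2, 4))
nodes = nodes.reshape(channels, grid * grid).T  # (N, C)

# Untrained random weights; in training these would be learned.
w1 = rng.standard_normal((channels, 16)) * 0.1
w2 = rng.standard_normal((16, grid * grid)) * 0.1
logits = gcn_layer(nodes, grid_adjacency(grid), w1) @ w2  # (N, N)
loss = jigsaw_loss(logits, perm)
```

The permutation itself supplies the training target, which is why no manual annotation is needed. In a setup like this the jigsaw branch would be used only during training, so the classifier can run on unshuffled features at inference time without the extra branch.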

updated: Wed Jul 14 2021 08:01:06 GMT+0000 (UTC)

published: Wed Jul 14 2021 08:01:06 GMT+0000 (UTC)

arXiv
