End-to-end One-shot Human Parsing

Haoyu He; Jing Zhang; Bohan Zhuang; Jianfei Cai; Dacheng Tao

Previous human parsing models are limited to parsing humans into pre-defined classes, which is inflexible for applications that need to handle new classes. In this paper, we define a new one-shot human parsing (OSHP) task that requires parsing humans into an open set of classes defined by any test example. During training, only base classes are exposed, and these overlap only partially with the test-time classes. To address three main challenges in OSHP, i.e., small sizes, testing bias, and similar parts, we devise a novel End-to-end One-shot human Parsing Network (EOP-Net). First, we propose an end-to-end human parsing framework that shares semantic information across different granularities, which helps recognize small-size human classes. Second, we devise two collaborative metric learning modules that learn representative prototypes for base classes; these prototypes adapt quickly to unseen classes and mitigate the testing bias. Moreover, we find empirically that robust prototypes give feature representations higher transferability to novel concepts. We therefore adopt momentum-updated dynamic prototypes, generated by gradually smoothing the training-time prototypes, and employ a contrastive loss at the prototype level. Experiments on three popular benchmarks tailored for OSHP demonstrate that EOP-Net outperforms representative one-shot segmentation models by large margins, so it can serve as a strong benchmark for further research on this new task. The source code will be made publicly available.
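The abstract does not spell out how prototypes are computed or smoothed, so the following is only a minimal sketch of the general idea of prototype-based metric learning with a momentum update. Everything in it is an assumption made for illustration rather than EOP-Net's actual design: the function names, masked averaging as the way to build a prototype, cosine similarity as the metric, and the momentum value of 0.9.

```python
import math

def masked_average(features, mask):
    """Average the feature vectors whose mask entry is 1 (assumed prototype rule)."""
    selected = [f for f, m in zip(features, mask) if m]
    if not selected:
        raise ValueError("mask selects no feature vectors")
    dim = len(selected[0])
    return [sum(f[d] for f in selected) / len(selected) for d in range(dim)]

def momentum_update(dynamic, current, momentum=0.9):
    """Exponential moving average: keep `momentum` of the old prototype,
    blend in (1 - momentum) of the prototype from the current step."""
    return [momentum * d + (1.0 - momentum) * c for d, c in zip(dynamic, current)]

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0

def classify(feature, prototypes):
    """Return the class name whose prototype is most similar to `feature`."""
    return max(prototypes, key=lambda name: cosine(feature, prototypes[name]))

# Toy example: four 2-D "pixel" features and two hypothetical part classes.
features = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9]]
prototypes = {
    "arm": masked_average(features, [1, 1, 0, 0]),
    "leg": masked_average(features, [0, 0, 1, 1]),
}
predicted = classify([0.7, 0.3], prototypes)

# Smooth a stored prototype toward the one computed at the current step.
smoothed = momentum_update([1.0, 0.0], [0.0, 1.0], momentum=0.9)
```

In this toy setting a query feature is labeled by its nearest prototype, and the moving average changes a stored prototype only slightly per step, which is one common way to keep prototypes stable across training iterations.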

updated: Tue May 04 2021 01:35:50 GMT+0000 (UTC)

published: Tue May 04 2021 01:35:50 GMT+0000 (UTC)

arXiv
