Hierarchical Prototype Learning for Zero-Shot Recognition

Xingxing Zhang; Shupeng Gui; Zhenfeng Zhu; Yao Zhao; Ji Liu

Zero-Shot Learning (ZSL) has attracted extensive attention and achieved notable success in recent years, especially in fine-grained object recognition, retrieval, and image captioning. The key to ZSL is transferring knowledge from seen to unseen classes via auxiliary semantic prototypes (e.g., word or attribute vectors). However, the projection functions commonly learned in previous works do not generalize well because semantic prototypes include non-visual components. Moreover, the incompleteness of the provided prototypes and captured images has received little consideration in state-of-the-art ZSL approaches. In this paper, we propose a hierarchical prototype learning formulation, named HPL, that provides a systematic solution for zero-shot recognition. Specifically, HPL obtains discriminability on both the seen and unseen class domains by learning visual prototypes for each domain separately under the transductive setting. To narrow the gap between the two domains, we further learn interpretable super-prototypes in both the visual and semantic spaces. Meanwhile, the two spaces are further bridged by maximizing their structural consistency. This not only improves the representativeness of the visual prototypes, but also alleviates the information loss of the semantic prototypes. An extensive set of experiments is then carefully designed and presented, demonstrating that HPL achieves remarkably better efficiency and effectiveness than currently available alternatives under various settings.
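To make the idea of transferring knowledge through prototypes concrete, here is a minimal Python sketch of a generic prototype-based zero-shot baseline. It is not HPL: the abstract does not give HPL's objective, so this sketch uses no super-prototypes, no structural-consistency term, and no transductive learning. It assumes that a seen class's visual prototype is the mean of its features, and that an unseen class's visual prototype can be approximated by combining seen visual prototypes weighted by attribute similarity. All function names, class names, and toy numbers are hypothetical.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def seen_visual_prototypes(features_by_class):
    """Assumption: a seen class's visual prototype is its mean feature."""
    return {c: mean_vector(feats) for c, feats in features_by_class.items()}


def unseen_visual_prototype(unseen_sem, seen_sem, seen_vis):
    """Approximate an unseen visual prototype as a weighted average of seen
    visual prototypes, weighted by non-negative semantic cosine similarity.
    This is an illustrative heuristic, not the HPL formulation."""
    weights = {c: max(cosine(unseen_sem, s), 0.0) for c, s in seen_sem.items()}
    total = sum(weights.values())
    if total == 0.0:
        raise ValueError("unseen class is not similar to any seen class")
    dim = len(next(iter(seen_vis.values())))
    return [
        sum(weights[c] * seen_vis[c][i] for c in seen_vis) / total
        for i in range(dim)
    ]


def classify(feature, prototypes):
    """Assign the class whose visual prototype is most cosine-similar."""
    return max(prototypes, key=lambda c: cosine(feature, prototypes[c]))


# Toy data (hypothetical): attributes are [striped, hooved, feline].
seen_sem = {"horse": [0, 1, 0], "tiger": [1, 0, 1]}
unseen_sem = {"zebra": [1, 1, 0]}
seen_features = {
    "horse": [[1.0, 0.0], [0.9, 0.1]],
    "tiger": [[0.0, 1.0], [0.1, 0.9]],
}

prototypes = seen_visual_prototypes(seen_features)
for name, sem in unseen_sem.items():
    prototypes[name] = unseen_visual_prototype(sem, seen_sem, prototypes_seen := {
        c: prototypes[c] for c in seen_sem
    })

print(classify([0.6, 0.4], prototypes))
print(classify([1.0, 0.05], prototypes))
```

Because the unseen prototype is built only from seen-class statistics, it inherits any non-visual components present in the attribute vectors, which is the kind of weakness the abstract attributes to earlier projection-based methods.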

updated: Tue Dec 10 2019 07:27:03 GMT+0000 (UTC)

published: Thu Oct 24 2019 07:26:30 GMT+0000 (UTC)

arXiv
