An Integrated Attribute Guided Dense Attention Model for Fine-Grained Generalized Zero-Shot Learning

Tasfia Shermin; Shyh Wei Teng; Ferdous Sohel; Manzur Murshed; Guojun Lu

Fine-grained generalized zero-shot learning (GZSL) requires exploring the relevance between local visual features and attributes to discover the fine, distinctive information needed for satisfactory performance. Embedding learning and feature synthesizing are two popular categories of GZSL methods. However, these methods do not explore fine discriminative information because they ignore either the local features or direct guidance from the attributes, and consequently they do not perform well. We propose a novel embedding learning network with a two-step dense attention mechanism, which uses direct attribute supervision to explore fine, distinctive local visual features for fine-grained GZSL tasks. We further incorporate a feature synthesizing network, which uses the attribute-weighted visual features from the embedding learning network. The two networks are trained jointly in an end-to-end fashion to exploit mutually beneficial information. Consequently, the proposed method can be tested in both scenarios: when only the images of unseen classes are available (using the feature synthesizing network) or when both images and semantic descriptors of the unseen classes are available (via the embedding learning network). Moreover, to reduce bias towards the source domain during testing, we compute source-target class similarity based on mutual information and transfer-learn the target classes. We demonstrate that the proposed method outperforms contemporary methods on benchmark datasets.
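
To make the idea of attribute-guided attention over local visual features more concrete, the following is a minimal sketch of a generic single-step attribute attention, not the authors' two-step dense attention network. All names (`attribute_attention`, `class_scores`), the dot-product relevance score, and the tensor shapes are assumptions chosen for illustration; the abstract does not specify them.

```python
import numpy as np


def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def attribute_attention(regions, attr_emb):
    """Hypothetical attribute-guided attention.

    regions:  (R, d) local visual features, one row per image region.
    attr_emb: (K, d) attribute embeddings in the same space (assumed).
    Returns the attention map (K, R) and attribute-weighted features (K, d).
    """
    relevance = attr_emb @ regions.T      # (K, R) region-attribute relevance
    attn = softmax(relevance, axis=1)     # each attribute spreads weight over regions
    attended = attn @ regions             # (K, d) one weighted feature per attribute
    return attn, attended


def class_scores(attended, attr_emb, class_attrs):
    """Score classes from per-attribute evidence.

    class_attrs: (C, K) attribute strengths per class (semantic descriptors).
    """
    attr_scores = np.sum(attended * attr_emb, axis=1)  # (K,) evidence per attribute
    return class_attrs @ attr_scores                   # (C,) compatibility per class


# Toy example with random data: 49 regions (e.g. a 7x7 grid), 5 attributes, 3 classes.
rng = np.random.default_rng(0)
regions = rng.normal(size=(49, 16))
attr_emb = rng.normal(size=(5, 16))
class_attrs = rng.random(size=(3, 5))

attn, attended = attribute_attention(regions, attr_emb)
scores = class_scores(attended, attr_emb, class_attrs)
predicted_class = int(np.argmax(scores))
```

In this sketch each attribute attends to the regions most relevant to it, and a class is scored by how well the resulting per-attribute evidence matches its semantic descriptor. Because unseen classes need only a row in `class_attrs`, the same scoring applies to them, which is the property embedding-based GZSL methods rely on.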

updated: Thu Dec 31 2020 21:38:46 GMT+0000 (UTC)

published: Thu Dec 31 2020 21:38:46 GMT+0000 (UTC)

arXiv
