Graph Convolution Based Efficient Re-Ranking for Visual Retrieval

Yuqi Zhang; Qi Qian; Hongsong Wang; Chong Liu; Weihua Chen; Fan Wang

Visual retrieval tasks such as image retrieval and person re-identification (Re-ID) aim to effectively and thoroughly search for images with similar content or the same identity. Once initial results have been retrieved, re-ranking is a widely adopted post-processing step that reorders and improves them by using contextual information from semantically neighboring samples. Prevailing re-ranking approaches update distance metrics and mostly rely on inefficient cross-check set comparison operations when computing expanded-neighbor-based distances. In this work, we present an efficient re-ranking method that refines initial retrieval results by updating features. Specifically, we reformulate re-ranking based on Graph Convolution Networks (GCN) and propose a novel Graph Convolution based Re-ranking (GCR) for visual retrieval tasks via feature propagation. To accelerate computation for large-scale retrieval, we introduce a decentralized and synchronous feature propagation algorithm that supports parallel or distributed computing. In particular, the plain GCR is extended to cross-camera retrieval, with an improved feature propagation formulation that leverages affinity relationships across different cameras. It is also extended to video-based retrieval: Graph Convolution based Re-ranking for Video (GCRV) is proposed by mathematically deriving a novel profile vector generation method for the tracklet. Without bells and whistles, the proposed approaches achieve state-of-the-art performance on seven benchmark datasets from three different tasks, i.e., image retrieval, person Re-ID and video-based person Re-ID.
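To make the idea of re-ranking by updating features more concrete, the following is a minimal sketch of generic feature propagation over a k-nearest-neighbor graph. It is not the GCR formulation from the paper, which the abstract does not spell out. The function names, the top-k cosine affinity, the mixing weight `alpha`, and the number of `steps` are all assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def knn_affinity(features, k):
    """Keep each sample's k largest cosine similarities (self included).

    Hypothetical affinity choice; the paper's graph construction may differ.
    """
    sim = features @ features.T
    n = sim.shape[0]
    # Column indices of the k most similar samples for every row.
    topk = np.argpartition(-sim, k - 1, axis=1)[:, :k]
    affinity = np.zeros_like(sim)
    rows = np.arange(n)[:, None]
    affinity[rows, topk] = np.clip(sim[rows, topk], 0.0, None)
    # Symmetrize so an edge exists if either endpoint selected the other.
    return np.maximum(affinity, affinity.T)


def propagate_features(features, k=3, alpha=0.5, steps=2):
    """Mix each feature with the weighted average of its graph neighbors."""
    x = l2_normalize(np.asarray(features, dtype=np.float64))
    for _ in range(steps):
        a = knn_affinity(x, k)
        a = a / (a.sum(axis=1, keepdims=True) + 1e-12)  # row-normalize weights
        x = l2_normalize((1.0 - alpha) * x + alpha * (a @ x))
    return x


def rerank(query_idx, features, **kwargs):
    """Return the other samples sorted by similarity to the query after propagation."""
    x = propagate_features(features, **kwargs)
    scores = x @ x[query_idx]
    order = np.argsort(-scores)
    return order[order != query_idx]
```

For a feature matrix `feats` holding one row per image, calling `rerank(0, feats, k=3)` would return the indices of all other images ordered by their similarity to image 0 after propagation. Because the ranking comes from plain dot products of the updated features, no pairwise neighbor-set comparison is needed at query time, which is the efficiency argument the abstract makes for feature-based re-ranking.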

updated: Thu Jun 15 2023 00:28:08 GMT+0000 (UTC)

published: Thu Jun 15 2023 00:28:08 GMT+0000 (UTC)

arXiv
