Deep Reinforced Attention Regression for Partial Sketch Based Image Retrieval

Dingrong Wang; Hitesh Sapkota; Xumin Liu; Qi Yu

Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) aims to find a specific image in a large gallery given a query sketch. Despite the wide applicability of FG-SBIR in many critical domains (e.g., crime activity tracking), existing approaches still suffer from low accuracy and are sensitive to external noise such as unnecessary strokes in the sketch. Retrieval performance deteriorates further under the more practical on-the-fly setting, where only a partially complete sketch with a few (noisy) strokes is available to retrieve the corresponding image. We propose a novel framework built on a specially designed deep reinforcement learning model that performs a dual-level exploration to handle partial sketch training and attention region selection. By forcing the model's attention onto the important regions of the original sketches, the framework remains robust to unnecessary stroke noise and improves retrieval accuracy by a large margin. To sufficiently explore partial sketches and locate the important regions to attend to, the model performs bootstrapped policy gradient for global exploration while adjusting a standard deviation term that governs a locator network for local exploration. The training process is guided by a hybrid loss that integrates a reinforcement loss and a supervised loss. A dynamic ranking reward is developed to fit the on-the-fly image retrieval process using partial sketches. Extensive experiments on three public datasets show that the proposed approach achieves state-of-the-art performance on partial sketch-based image retrieval.
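
The abstract names three ingredients without giving formulas: a locator governed by a standard deviation term, a ranking-based reward, and a hybrid of a reinforcement loss and a supervised loss. The following is a minimal sketch of how such pieces could fit together, not the authors' implementation. Every name and formula in it is an assumption made for illustration: the reciprocal-rank reward, the isotropic Gaussian locator, the triplet loss standing in for the supervised term, the REINFORCE-style term with a baseline, and the `weight` parameter.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_of_target(query, gallery, target_idx):
    """1-based rank of the true photo when the gallery is sorted by distance."""
    target_dist = euclidean(query, gallery[target_idx])
    closer = sum(1 for g in gallery if euclidean(query, g) < target_dist)
    return closer + 1


def ranking_reward(rank):
    """Assumed reward: reciprocal rank, so rank 1 gives the maximum reward 1.0."""
    return 1.0 / rank


def sample_location(mean, std, rng):
    """Sample an attention location around the locator's predicted mean.

    A larger std widens local exploration (assumed isotropic Gaussian).
    rng is any object with a gauss(mu, sigma) method, e.g. random.Random.
    """
    return [rng.gauss(m, std) for m in mean]


def gaussian_log_prob(loc, mean, std):
    """Log-density of a sampled location under the isotropic Gaussian."""
    var = std ** 2
    return sum(
        -((x - m) ** 2) / (2 * var) - math.log(std) - 0.5 * math.log(2 * math.pi)
        for x, m in zip(loc, mean)
    )


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Assumed supervised term: a standard margin-based triplet loss."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def hybrid_loss(log_prob, reward, baseline, supervised, weight=1.0):
    """REINFORCE-style term plus a weighted supervised term (assumed form)."""
    reinforce = -(reward - baseline) * log_prob
    return reinforce + weight * supervised
```

In an on-the-fly setting, `rank_of_target` would be re-evaluated each time new strokes are added to the partial sketch, so the reward changes as the sketch grows. That is one plausible reading of "dynamic" here, not something the abstract specifies.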

updated: Sun Nov 21 2021 23:12:51 GMT+0000 (UTC)

published: Sun Nov 21 2021 23:12:51 GMT+0000 (UTC)

arXiv
