Learning Test-time Data Augmentation for Image Retrieval with Reinforcement Learning

Osman Tursun; Simon Denman; Sridha Sridharan; Clinton Fookes


Off-the-shelf convolutional neural network features achieve outstanding results in many image retrieval tasks. However, their invariance is predefined by the network architecture and training data. Existing image retrieval approaches require fine-tuning or modification of the pre-trained networks to adapt to the variations in the target data. In contrast, our method enhances the invariance of off-the-shelf features by aggregating features extracted from images augmented with learned test-time augmentations. The optimal ensemble of test-time augmentations is learned automatically through reinforcement learning. Our training is efficient in both time and resources, and learns a diverse set of test-time augmentations. Experimental results on trademark retrieval (METU trademark dataset) and landmark retrieval (Oxford5k and Paris6k scene datasets) tasks show that the learned ensemble of transformations is effective and transferable. We also achieve state-of-the-art MAP@100 results on the METU trademark dataset.
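
The aggregation step described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `toy_features` is a hypothetical stand-in for an off-the-shelf CNN descriptor, the augmentations are simple flips on a toy grayscale image, and mean pooling followed by L2 normalization is one plausible aggregation rule that the abstract does not specify. The ensemble is fixed by hand; the reinforcement-learning search that selects it is not sketched.

```python
import math
from typing import Callable, List, Sequence

Image = List[List[float]]  # toy grayscale image: rows of pixel values
Augmentation = Callable[[Image], Image]


def identity(img: Image) -> Image:
    """Return an unchanged copy of the image."""
    return [row[:] for row in img]


def hflip(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def toy_features(img: Image) -> List[float]:
    """Hypothetical stand-in for a CNN descriptor (NOT the paper's network).

    Returns the mean intensity of the left and right halves, so the
    descriptor deliberately changes under a horizontal flip.
    Assumes the image is at least two pixels wide.
    """
    half = len(img[0]) // 2
    left = [p for row in img for p in row[:half]]
    right = [p for row in img for p in row[half:]]
    return [sum(left) / len(left), sum(right) / len(right)]


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def tta_descriptor(
    img: Image,
    augmentations: Sequence[Augmentation],
    extract: Callable[[Image], List[float]] = toy_features,
) -> List[float]:
    """Aggregate features over an ensemble of test-time augmentations.

    Assumed aggregation rule: element-wise mean, then L2 normalization.
    """
    feats = [extract(aug(img)) for aug in augmentations]
    dim = len(feats[0])
    mean = [sum(f[i] for f in feats) / len(feats) for i in range(dim)]
    return l2_normalize(mean)


if __name__ == "__main__":
    image = [[1.0, 2.0], [3.0, 4.0]]
    ensemble = [identity, hflip]  # hand-picked here, learned in the paper
    print(toy_features(image), toy_features(hflip(image)))
    print(tta_descriptor(image, ensemble), tta_descriptor(hflip(image), ensemble))
```

In this toy setting the raw descriptor differs between an image and its mirror, while the aggregated descriptor is the same for both, because the ensemble contains the identity and the flip. This is the sense in which aggregating over augmentations can add invariance without modifying the feature extractor.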

updated: Mon Feb 08 2021 06:32:56 GMT+0000 (UTC)

published: Wed Feb 05 2020 05:08:41 GMT+0000 (UTC)

arXiv

