A Benchmark and Asymmetrical-Similarity Learning for Practical Image Copy Detection

Wenhao Wang; Yifan Sun; Yi Yang

Image copy detection (ICD) aims to determine whether a query image is an edited copy of any image in a reference set. There are currently very few public benchmarks for ICD, and all of them overlook a critical challenge in real-world applications: the distraction caused by hard negative queries. Specifically, some queries are not edited copies but are inherently similar to some reference images. These hard negative queries are easily misrecognized as edited copies, which significantly compromises ICD accuracy. This observation motivates us to build the first ICD benchmark featuring this characteristic. Building on existing ICD datasets, this paper constructs a new dataset by adding 100,000 and 24,252 hard negative pairs to the training and test sets, respectively. Moreover, this paper reveals a unique difficulty in solving the hard negative problem in ICD: there is a fundamental conflict between current metric learning and ICD. Metric learning adopts a symmetric distance, whereas producing an edited copy is an asymmetric (unidirectional) process. For example, a partial crop is close to its holistic reference image and is an edited copy of it, but the reference image cannot be an edited copy of the crop, even though the distance between them is equally small in both directions. This insight leads to an Asymmetrical-Similarity Learning (ASL) method, which allows the similarity in the two directions (query <-> reference image) to differ. Experimental results show that ASL outperforms state-of-the-art methods by a clear margin, confirming that resolving the symmetric-asymmetric conflict is critical for ICD.
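
The symmetric-asymmetric conflict described in the abstract can be illustrated with a toy sketch. This is not the ASL method from the paper, whose formulation is not given here. It assumes, purely for illustration, that an image is represented as a set of hypothetical patch IDs, and it compares a symmetric measure (Jaccard similarity) with a simple direction-dependent one (the fraction of the query contained in the reference). The function names and the patch-set representation are assumptions made for this example.

```python
def jaccard(a: set, b: set) -> float:
    """Symmetric similarity: jaccard(a, b) always equals jaccard(b, a)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def containment(query: set, reference: set) -> float:
    """Direction-dependent score: fraction of the query's patches found in the reference.

    Illustrative stand-in only; not the similarity used by ASL.
    """
    return len(query & reference) / len(query) if query else 0.0


# Hypothetical patch IDs standing in for image content.
reference = set(range(16))   # a full 4x4 grid of patches
crop = {5, 6, 9, 10}         # a central 2x2 crop of the reference

# The symmetric measure cannot tell which image is the copy.
sym_crop_to_ref = jaccard(crop, reference)
sym_ref_to_crop = jaccard(reference, crop)

# The direction-dependent score differs depending on which image is the query.
asym_crop_to_ref = containment(crop, reference)
asym_ref_to_crop = containment(reference, crop)

print(sym_crop_to_ref, sym_ref_to_crop)
print(asym_crop_to_ref, asym_ref_to_crop)
```

In this toy setting the symmetric score is identical in both directions, while the direction-dependent score is high only when the crop is the query, matching the intuition that the crop can be an edited copy of the reference but not the other way around.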

updated: Wed Aug 17 2022 07:52:11 GMT+0000 (UTC)

published: Tue May 24 2022 20:39:11 GMT+0000 (UTC)

arXiv
