Dynamic Metric Learning: Towards a Scalable Metric Space to Accommodate Multiple Semantic Scales

Yifan Sun; Yuke Zhu; Yuhan Zhang; Pengkun Zheng; Xi Qiu; Chi Zhang; Yichen Wei

This paper introduces a new fundamental characteristic, i.e., the dynamic range, from real-world metric tools to deep visual recognition. In metrology, the dynamic range is a basic quality of a metric tool, indicating its flexibility to accommodate various scales. A larger dynamic range offers higher flexibility. In visual recognition, the multiple-scale problem also exists. Different visual concepts may have different semantic scales. For example, "Animal" and "Plants" have a large semantic scale while "Elk" has a much smaller one. Under a small semantic scale, two different elks may look quite different from each other. However, under a large semantic scale (e.g., animals and plants), these two elks should be measured as being similar. We argue that such flexibility is also important for deep metric learning, because different visual concepts indeed correspond to different semantic scales. Introducing the dynamic range to deep metric learning, we get a novel computer vision task, i.e., Dynamic Metric Learning. It aims to learn a scalable metric space to accommodate visual concepts across multiple semantic scales. Based on three types of images, i.e., vehicles, animals and online products, we construct three datasets for Dynamic Metric Learning. We benchmark these datasets with popular deep metric learning methods and find Dynamic Metric Learning to be very challenging. The major difficulty lies in a conflict between different scales: the discriminative ability under a small scale usually compromises the discriminative ability under a large one, and vice versa. As a minor contribution, we propose Cross-Scale Learning (CSL) to alleviate such conflict. We show that CSL consistently improves the baseline on all three datasets. The datasets and the code will be publicly available at https://github.com/SupetZYK/DynamicMetricLearning.
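To make the multi-scale idea concrete, the following is a minimal sketch of how a single embedding space might be scored under several semantic scales at once. It is not taken from the paper or its repository: the choice of Recall@1 as the score, the function names `recall_at_1` and `multi_scale_recall`, and the toy embeddings and labels are all assumptions made for illustration.

```python
import numpy as np


def recall_at_1(embeddings, labels):
    """Fraction of samples whose nearest neighbour (excluding itself) shares its label."""
    emb = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    # Pairwise squared Euclidean distances between all samples.
    diff = emb[:, None, :] - emb[None, :, :]
    dist = (diff ** 2).sum(axis=-1)
    # A sample must not retrieve itself.
    np.fill_diagonal(dist, np.inf)
    nearest = dist.argmin(axis=1)
    return float((labels[nearest] == labels).mean())


def multi_scale_recall(embeddings, labels_by_scale):
    """Score the SAME embeddings once per semantic scale (hypothetical helper)."""
    return {
        scale: recall_at_1(embeddings, labels)
        for scale, labels in labels_by_scale.items()
    }


# Toy data (assumed, not from the paper's datasets):
# two photos each of elk A, elk B and one fern.
embeddings = np.array([
    [0.0, 0.0],  # elk A, photo 1
    [0.2, 0.0],  # elk A, photo 2
    [1.0, 0.0],  # elk B, photo 1
    [0.3, 0.0],  # elk B, photo 2 (lies close to elk A)
    [5.0, 0.0],  # fern, photo 1
    [5.2, 0.0],  # fern, photo 2
])

labels_by_scale = {
    "fine": [0, 0, 1, 1, 2, 2],    # small scale: individual identity
    "coarse": [0, 0, 0, 0, 1, 1],  # large scale: animal vs. plant
}

scores = multi_scale_recall(embeddings, labels_by_scale)
print(scores)
```

In this toy setup the two elks sit close together, so every retrieval is correct at the coarse scale while some fine-scale retrievals confuse elk A with elk B. The fine-scale score therefore falls below the coarse-scale one, which is a small instance of the tension between scales that the abstract describes.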

updated: Mon Mar 22 2021 12:46:12 GMT+0000 (UTC)

published: Mon Mar 22 2021 12:46:12 GMT+0000 (UTC)

arXiv

