It Takes Two to Tango: Mixup for Deep Metric Learning

Shashanka Venkataramanan; Bill Psomas; Ewa Kijak; Laurent Amsaleg; Konstantinos Karantzalos; Yannis Avrithis

Metric learning involves learning a discriminative representation such that embeddings of similar classes are encouraged to be close, while embeddings of dissimilar classes are pushed far apart. State-of-the-art methods focus mostly on sophisticated loss functions or mining strategies. On the one hand, metric learning losses consider two or more examples at a time. On the other hand, modern data augmentation methods for classification also consider two or more examples at a time. The combination of the two ideas is under-studied. In this work, we aim to bridge this gap and improve representations using mixup, a powerful data augmentation approach that interpolates two or more examples and their corresponding target labels at a time. This task is challenging because, unlike in classification, the loss functions used in metric learning are not additive over examples, so the idea of interpolating target labels is not straightforward. To the best of our knowledge, we are the first to investigate mixing both examples and target labels for deep metric learning. We develop a generalized formulation that encompasses existing metric learning loss functions and modify it to accommodate mixup, introducing Metric Mix, or Metrix. We also introduce a new metric, utilization, to demonstrate that by mixing examples during training, we are exploring areas of the embedding space beyond the training classes, thereby improving representations. To validate the effect of improved representations, we show that mixing inputs, intermediate representations or embeddings along with target labels significantly outperforms state-of-the-art metric learning methods on four benchmark deep metric learning datasets.
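For readers unfamiliar with mixup, the interpolation the abstract refers to can be sketched as follows. This is a minimal illustration of standard classification-style mixup on a single pair, not the Metrix formulation itself, which the abstract does not spell out. The function names `mixup_pair` and `sample_lambda` are hypothetical, and drawing the interpolation factor from a Beta(alpha, alpha) distribution is an assumption borrowed from the usual mixup setup.

```python
import random


def mixup_pair(x_a, x_b, y_a, y_b, lam):
    """Linearly interpolate two examples and their one-hot labels.

    Hypothetical helper: illustrates plain mixup, not the paper's method.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    # The same factor lam is applied to the examples and to the labels.
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix


def sample_lambda(alpha=1.0, rng=random):
    """Draw an interpolation factor from Beta(alpha, alpha) (assumed choice)."""
    return rng.betavariate(alpha, alpha)


# Two toy 2-dimensional examples from different classes, mixed at lam = 0.75.
x_mix, y_mix = mixup_pair([1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0], lam=0.75)
```

In classification, the mixed label plugs directly into a loss that is a sum over examples. The abstract points out that metric learning losses are not additive in this way, which is why the label interpolation shown here does not carry over unchanged.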

updated: Mon Feb 28 2022 08:23:57 GMT+0000 (UTC)

published: Wed Jun 09 2021 11:20:03 GMT+0000 (UTC)

arXiv
