Steerable Equivariant Representation Learning

Sangnie Bhardwaj; Willie McClinton; Tongzhou Wang; Guillaume Lajoie; Chen Sun; Phillip Isola; Dilip Krishnan

Pre-trained deep image representations are useful for post-training tasks such as classification through transfer learning, image retrieval, and object detection. Data augmentations are a crucial aspect of pre-training robust representations in both supervised and self-supervised settings. Data augmentations explicitly or implicitly promote invariance in the embedding space to the input image transformations. This invariance reduces generalization to downstream tasks that rely on sensitivity to these particular data augmentations. In this paper, we propose a method of learning representations that are instead equivariant to data augmentations. We achieve this equivariance through the use of steerable representations. Our representations can be manipulated directly in embedding space via learned linear maps. We demonstrate that our resulting steerable and equivariant representations lead to better performance on transfer learning and robustness: e.g., we improve linear probe top-1 accuracy by 1% to 3% for transfer, and ImageNet-C accuracy by up to 3.4%. We further show that the steerability of our representations provides a significant speedup (nearly 50x) for test-time augmentations. By applying a large number of augmentations for out-of-distribution (OOD) detection, we significantly improve OOD AUC on the ImageNet-C dataset over an invariant representation.
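
The idea of steering an embedding with a linear map can be illustrated with a toy example. The sketch below is not the authors' method: it assumes a fixed linear "encoder" and a coordinate permutation as a stand-in augmentation (both hypothetical, as are all names in the code), and it fits the steering matrix M by least squares rather than learning it during training. In this linear toy setting the relation encode(augment(x)) = M encode(x) holds exactly; for a deep encoder it would only be approximate.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 8, 200  # toy input/embedding dimension and number of training samples

# Hypothetical stand-in for a pre-trained encoder: a fixed random linear map.
W = rng.normal(size=(d, d))

def encode(x):
    """Embed a batch of inputs, shape (n, d) -> (n, d)."""
    return x @ W.T

# Hypothetical stand-in for a data augmentation: permute input coordinates.
perm = rng.permutation(d)

def augment(x):
    """Apply the toy augmentation to a batch of inputs."""
    return x[:, perm]

# Embeddings of clean and augmented training inputs.
x_train = rng.normal(size=(n, d))
z = encode(x_train)
z_aug = encode(augment(x_train))

# Fit M so that z @ M.T approximates z_aug (ordinary least squares).
M_T, *_ = np.linalg.lstsq(z, z_aug, rcond=None)
M = M_T.T

# On held-out inputs, steering the clean embedding with M should match
# the embedding of the augmented input.
x_test = rng.normal(size=(5, d))
steered = encode(x_test) @ M.T
direct = encode(augment(x_test))
max_err = np.max(np.abs(steered - direct))
print("max abs difference:", max_err)
```

Once M is available, the embedding of an augmented input is obtained with one matrix multiply on an already computed embedding instead of a second encoder pass, which is the intuition behind the test-time augmentation speedup mentioned in the abstract.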

updated: Wed Feb 22 2023 12:42:45 GMT+0000 (UTC)

published: Wed Feb 22 2023 12:42:45 GMT+0000 (UTC)

arXiv
