SSD: A Unified Framework for Self-Supervised Outlier Detection

Vikash Sehwag; Mung Chiang; Prateek Mittal

We ask the following question: what training information is required to design an effective outlier/out-of-distribution (OOD) detector, i.e., one that detects samples lying far from the training distribution? Since unlabeled data is easily accessible in many applications, the most compelling approach is to develop detectors based only on unlabeled in-distribution data. However, we observe that most existing detectors based on unlabeled data perform poorly, often equivalent to random prediction. In contrast, existing state-of-the-art OOD detectors achieve impressive performance but require access to fine-grained data labels for supervised training. We propose SSD, an outlier detector based only on unlabeled in-distribution data. We use self-supervised representation learning followed by Mahalanobis-distance-based detection in the feature space. We demonstrate that SSD outperforms most existing detectors based on unlabeled data by a large margin. Additionally, SSD achieves performance on par with, and sometimes better than, detectors based on supervised training. Finally, we expand our detection framework with two key extensions. First, we formulate few-shot OOD detection, in which the detector has access to only one to five samples from each class of the targeted OOD dataset. Second, we extend our framework to incorporate training data labels, if available. We find that our novel detection framework based on SSD displays enhanced performance with these extensions and achieves state-of-the-art performance. Our code is publicly available at https://github.com/inspire-group/SSD.
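
The detection step named in the abstract, Mahalanobis distance in feature space, can be illustrated with a minimal sketch. This is not the code from the authors' repository. It assumes a single Gaussian fitted to in-distribution features, and it uses random vectors as stand-ins for the output of a self-supervised encoder. The function names fit_gaussian and mahalanobis_score are hypothetical. The score for a feature vector z is (z - mu)^T Sigma^{-1} (z - mu), where mu and Sigma are the mean and covariance of the training features.

```python
import numpy as np


def fit_gaussian(train_features):
    """Estimate the mean and inverse covariance of in-distribution features."""
    mean = train_features.mean(axis=0)
    cov = np.cov(train_features, rowvar=False)
    # Assumption of this sketch: a pseudo-inverse guards against a singular covariance.
    return mean, np.linalg.pinv(cov)


def mahalanobis_score(features, mean, cov_inv):
    """Squared Mahalanobis distance of each row to the training mean.

    Larger scores indicate samples that lie farther from the training data.
    """
    diff = features - mean
    return np.einsum("ij,jk,ik->i", diff, cov_inv, diff)


# Synthetic stand-ins for encoder features (assumed, not real data).
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=(500, 8))    # unlabeled in-distribution features
in_dist = rng.normal(0.0, 1.0, size=(100, 8))  # held-out in-distribution features
ood = rng.normal(4.0, 1.0, size=(100, 8))      # shifted features playing the OOD role

mean, cov_inv = fit_gaussian(train)
in_scores = mahalanobis_score(in_dist, mean, cov_inv)
ood_scores = mahalanobis_score(ood, mean, cov_inv)
```

In this toy setup, thresholding the score separates the shifted samples from the held-out in-distribution ones. No labels are used at any point, which matches the unlabeled setting the abstract describes. How features are learned and normalized in SSD itself is not specified here.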

updated: Mon Mar 22 2021 17:51:35 GMT+0000 (UTC)

published: Mon Mar 22 2021 17:51:35 GMT+0000 (UTC)

arXiv
