Data Invariants to Understand Unsupervised Out-of-Distribution Detection

Lars Doorenbos; Raphael Sznitman; Pablo Márquez-Neila

Unsupervised out-of-distribution (U-OOD) detection has recently attracted much attention due to its importance in mission-critical systems and its broader applicability compared with its supervised counterpart. Despite this increase in attention, U-OOD methods suffer from important shortcomings. By performing a large-scale evaluation on different benchmarks and image modalities, we show in this work that most popular state-of-the-art methods are unable to consistently outperform a simple and relatively unknown anomaly detector based on the Mahalanobis distance (MahaAD). A key reason for the inconsistencies of these methods is the lack of a formal description of U-OOD. Motivated by a simple thought experiment, we propose a characterization of U-OOD based on the invariants of the training dataset. We show how this characterization is unknowingly embodied in the top-scoring MahaAD method, thereby explaining its quality. Furthermore, our approach can be used to interpret predictions of U-OOD detectors and provides insights into good practices for evaluating future U-OOD methods.
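As a rough illustration of the kind of detector the abstract refers to, the following is a minimal sketch of anomaly scoring with the Mahalanobis distance. It is not the paper's MahaAD pipeline: the function names `fit_gaussian` and `mahalanobis_score`, the regularization constant `eps`, and the synthetic two-dimensional data are all assumptions made for this example, and any feature extraction step is omitted. The toy data is built so that one coordinate barely varies during training, which stands in for an "invariant" of the training set.

```python
import numpy as np


def fit_gaussian(features, eps=1e-6):
    """Estimate the mean and inverse covariance of training features.

    `eps` is a small ridge term (an assumption of this sketch) that keeps
    the covariance matrix invertible.
    """
    mean = features.mean(axis=0)
    centered = features - mean
    cov = centered.T @ centered / (len(features) - 1)
    cov += eps * np.eye(cov.shape[0])
    return mean, np.linalg.inv(cov)


def mahalanobis_score(x, mean, precision):
    """Return the Mahalanobis distance of each row of `x` to the training mean."""
    diff = np.atleast_2d(x) - mean
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, precision, diff))


# Synthetic training set: the first coordinate varies widely, the second
# is almost constant (a stand-in for a training-set invariant).
rng = np.random.default_rng(0)
train = np.column_stack([
    rng.normal(0.0, 5.0, size=500),
    rng.normal(0.0, 0.01, size=500),
])
mean, precision = fit_gaussian(train)

# First test point moves along the high-variance direction; the second
# breaks the near-constant coordinate.
test_points = np.array([[8.0, 0.0], [0.0, 1.0]])
scores = mahalanobis_score(test_points, mean, precision)
print(scores)
```

In this toy setting the second test point receives a much larger score than the first, even though it lies closer to the training mean in Euclidean terms, because it deviates along a direction in which the training data showed almost no variation.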

updated: Fri Nov 26 2021 08:42:56 GMT+0000 (UTC)

published: Fri Nov 26 2021 08:42:56 GMT+0000 (UTC)

arXiv
