Deep Variational Semi-Supervised Novelty Detection

Tal Daniel; Thanard Kurutach; Aviv Tamar

In anomaly detection (AD), one seeks to identify whether a test sample is abnormal, given a data set of normal samples. A recent and promising approach to AD relies on deep generative models, such as variational autoencoders (VAEs), for unsupervised learning of the normal data distribution. In semi-supervised AD (SSAD), the data also includes a small sample of labeled anomalies. In this work, we propose two variational methods for training VAEs for SSAD. The intuitive idea in both methods is to train the encoder to "separate" the latent vectors of normal data from those of outlier data. We show that this idea can be derived from principled probabilistic formulations of the problem, and propose simple and effective algorithms. Our methods can be applied to various data types, as we demonstrate on SSAD datasets ranging from natural images to astronomy and medicine; they can be combined with any VAE model architecture and are naturally compatible with ensembling. Compared with state-of-the-art SSAD methods that are not specific to particular data types, our methods obtain a marked improvement in outlier detection.
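The abstract does not spell out the two training objectives. As a minimal sketch of the "separate the latent vectors" idea only, the snippet below assumes a diagonal-Gaussian encoder output and two unit-variance Gaussian priors, one for normal data and one for labeled outliers. The function names, the choice of priors, and the use of the KL term as an anomaly score are illustrative assumptions, not necessarily the paper's formulation.

```python
import math


def kl_to_unit_variance_gaussian(mu, log_var, prior_mean):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(prior_mean, I) )."""
    return 0.5 * sum(
        math.exp(lv) + (m - p) ** 2 - 1.0 - lv
        for m, lv, p in zip(mu, log_var, prior_mean)
    )


def separation_penalty(mu, log_var, is_outlier, normal_mean, outlier_mean):
    """Hypothetical latent regularizer: pull each encoding toward the prior
    that matches its label, so normal and outlier codes end up apart."""
    prior_mean = outlier_mean if is_outlier else normal_mean
    return kl_to_unit_variance_gaussian(mu, log_var, prior_mean)


def anomaly_score(mu, log_var, normal_mean):
    """Hypothetical test-time score: distance (in KL) from the normal prior.
    Larger values suggest the sample looks less like normal data."""
    return kl_to_unit_variance_gaussian(mu, log_var, normal_mean)


# Assumed toy setup: 2-D latent space, priors centered at (0, 0) and (3, 0).
NORMAL_MEAN = [0.0, 0.0]
OUTLIER_MEAN = [3.0, 0.0]

# An outlier encoded exactly at the outlier prior pays no penalty in training,
# yet scores high against the normal prior at test time.
outlier_mu, outlier_log_var = [3.0, 0.0], [0.0, 0.0]
penalty = separation_penalty(
    outlier_mu, outlier_log_var, True, NORMAL_MEAN, OUTLIER_MEAN
)
score = anomaly_score(outlier_mu, outlier_log_var, NORMAL_MEAN)
```

In a full VAE this penalty would be added to a reconstruction term; that part is omitted here because the abstract gives no detail about it.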

updated: Sat May 22 2021 08:52:08 GMT+0000 (UTC)

published: Tue Nov 12 2019 16:03:50 GMT+0000 (UTC)

arXiv
