Adversarial robustness of VAEs through the lens of local geometry

Asif Khan; Amos Storkey

In an unsupervised attack on variational autoencoders (VAEs), an adversary finds a small perturbation of an input sample that significantly changes its latent space encoding, thereby compromising the reconstruction for a fixed decoder. A known cause of this vulnerability is distortion in the latent space resulting from a mismatch between the approximate latent posterior and the prior distribution. Consequently, a slight change in an input sample can move its encoding to a low- or zero-density region of the latent space, resulting in unconstrained generation. This paper demonstrates that an optimal way for an adversary to attack VAEs is to exploit a directional bias of a stochastic pullback metric tensor induced by the encoder and decoder networks. The pullback metric tensor of an encoder measures the change in infinitesimal latent volume from the input space to the latent space. Thus, it can be viewed as a lens for analysing the effect of input perturbations that lead to latent space distortions. We propose robustness evaluation scores based on the eigenspectrum of the pullback metric tensor. Moreover, we empirically show that the scores correlate with the robustness parameter β of the β-VAE. Since increasing β also degrades reconstruction quality, we demonstrate a simple alternative that uses mixup training to fill the empty regions in the latent space, improving robustness together with reconstruction quality.
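As a rough illustration of the pullback metric idea in the abstract, the sketch below builds G = JᵀJ from the Jacobian J of a toy encoder mean and finds its dominant eigenpair by power iteration. Everything here is an assumption made for illustration: the encoder `mu(x) = tanh(W x)`, the weights `W`, the input `x`, and all function names are hypothetical, and only the deterministic mean map is used, so this is not the paper's stochastic metric or its robustness scores.

```python
import math

def encoder_mean(x, W):
    """Toy encoder mean mu(x) = tanh(W x); a hypothetical stand-in for a trained VAE encoder."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]

def jacobian(f, x, eps=1e-6):
    """Central finite-difference Jacobian of f at x, shape (latent_dim, input_dim)."""
    cols = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        cols.append([(a - b) / (2 * eps) for a, b in zip(f(xp), f(xm))])
    # cols[i][k] holds d f_k / d x_i; transpose so that rows index latent dimensions.
    return [[cols[i][k] for i in range(len(x))] for k in range(len(cols[0]))]

def pullback_metric(J):
    """Pullback of the Euclidean latent metric: G = J^T J, shape (input_dim, input_dim)."""
    n = len(J[0])
    return [[sum(row[i] * row[j] for row in J) for j in range(n)] for i in range(n)]

def top_eigenpair(G, iters=500):
    """Largest eigenvalue and unit eigenvector of a symmetric PSD matrix via power iteration."""
    n = len(G)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [sum(G[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    lam = sum(v[i] * sum(G[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam, v

# Hypothetical weights and input point (3 input dims, 2 latent dims).
W = [[2.0, 0.5, 0.0],
     [0.0, 0.3, 0.1]]
x = [0.1, -0.2, 0.3]

def f(z):
    return encoder_mean(z, W)

G = pullback_metric(jacobian(f, x))
lam, v = top_eigenpair(G)

def latent_shift(direction, step=1e-2):
    """Euclidean distance the encoding moves when x is perturbed by step * direction."""
    base = f(x)
    moved = f([xi + step * d for xi, d in zip(x, direction)])
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(moved, base)))

print("top eigenvalue:", lam)
print("shift along top eigenvector:", latent_shift(v))
print("shift along input axis 3:   ", latent_shift([0.0, 0.0, 1.0]))
```

For a fixed perturbation norm, the direction `v` moves the toy encoding further than any of the coordinate axes, which is the directional bias the abstract refers to. For a real network the Jacobian would normally come from automatic differentiation rather than finite differences.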

updated: Mon Oct 28 2024 16:01:39 GMT+0000 (UTC)

published: Mon Aug 08 2022 05:53:57 GMT+0000 (UTC)

arXiv
