On visual self-supervision and its effect on model robustness

Michal Kucer; Diane Oyen; Garrett Kenyon

Recent self-supervision methods have succeeded in learning feature representations that can rival those learned with full supervision, and have been shown to benefit models in several ways, for example by improving model robustness and out-of-distribution detection. In this paper, we conduct an empirical study to understand more precisely how self-supervised learning, used either as a pre-training technique or as part of adversarial training, affects model robustness to l_2 and l_∞ adversarial perturbations and to natural image corruptions. Self-supervision can indeed improve model robustness, but the devil is in the details. If one simply adds a self-supervision loss in tandem with adversarial training, accuracy improves when the model is evaluated with adversarial perturbations smaller than or comparable to the value of ϵ_train that the robust model is trained with. However, for ϵ_test ≥ ϵ_train, model accuracy drops. In fact, the larger the weight of the self-supervision loss, the larger the drop in performance, i.e., the more the robustness of the model is harmed. We identify the primary ways in which self-supervision can be added to adversarial training, and observe that using a self-supervised loss both to optimize network parameters and to find adversarial examples leads to the strongest improvement in model robustness, as this can be viewed as a form of ensemble adversarial training. Although self-supervised pre-training improves adversarial training compared with random weight initialization, we observe no benefit in model robustness or accuracy if self-supervision is incorporated into adversarial training.
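
As a rough illustration of the quantities the abstract refers to, the sketch below shows how a perturbation can be constrained to an l_∞ or l_2 ball of radius ϵ, and how a self-supervision loss might be combined with a supervised loss through a scalar weight. The function names, the `weight` parameter, and the simple additive form of the combined loss are assumptions made for illustration; the abstract does not specify the exact formulation used in the paper.

```python
import math


def project_linf(delta, eps):
    """Clip each coordinate of the perturbation to the interval [-eps, eps]."""
    return [max(-eps, min(eps, d)) for d in delta]


def project_l2(delta, eps):
    """Rescale the perturbation so that its Euclidean norm is at most eps."""
    norm = math.sqrt(sum(d * d for d in delta))
    if norm <= eps or norm == 0.0:
        return list(delta)  # already inside the ball, return a copy
    scale = eps / norm
    return [d * scale for d in delta]


def combined_loss(supervised_loss, self_supervised_loss, weight):
    """Weighted sum of the two losses.

    Assumption: a plain additive combination, where `weight` (a hypothetical
    name) scales the self-supervision term.
    """
    return supervised_loss + weight * self_supervised_loss
```

In this sketch, ϵ_train and ϵ_test would simply be the `eps` values passed to the projection during training and during evaluation, respectively.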

updated: Wed Dec 08 2021 16:22:02 GMT+0000 (UTC)

published: Wed Dec 08 2021 16:22:02 GMT+0000 (UTC)

arXiv
