MEMO: Test Time Robustness via Adaptation and Augmentation

Marvin Zhang; Sergey Levine; Chelsea Finn

While deep neural networks can attain good accuracy on in-distribution test points, many applications require robustness even in the face of unexpected perturbations in the input, changes in the domain, or other sources of distribution shift. We study the problem of test time robustification, i.e., using the test input to improve model robustness. Recent prior works have proposed methods for test time adaptation; however, each introduces additional assumptions, such as access to multiple test points, that prevent widespread adoption. In this work, we aim to study and devise methods that make no assumptions about the model training process and are broadly applicable at test time. We propose a simple approach that can be used in any test setting where the model is probabilistic and adaptable: when presented with a test example, perform different data augmentations on the data point, and then adapt (all of) the model parameters by minimizing the entropy of the model's average, or marginal, output distribution across the augmentations. Intuitively, this objective encourages the model to make the same prediction across different augmentations, thus enforcing the invariances encoded in these augmentations, while also maintaining confidence in its predictions. In our experiments, we evaluate two baseline ResNet models, two robust ResNet-50 models, and a robust vision transformer model, and we demonstrate that this approach achieves accuracy gains of 1-8% over standard model evaluation and also generally outperforms prior augmentation and adaptation strategies. For the setting in which only one test point is available, we achieve state-of-the-art results on the ImageNet-C, ImageNet-R, and, among ResNet-50 models, ImageNet-A distribution shift benchmarks.
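The objective described in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation. It assumes a toy linear softmax classifier in place of a deep network, hand-picked perturbed copies of one input in place of real image augmentations, and a single plain gradient-descent step with a hypothetical learning rate `lr`. The function names (`predict`, `marginal_entropy`, `adapt_step`) are illustrative only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, x):
    """Class probabilities of a toy linear softmax model (weights: classes x features)."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])


def marginal_entropy(prob_rows):
    """Entropy of the average (marginal) of several predictive distributions."""
    n = len(prob_rows)
    marginal = [sum(col) / n for col in zip(*prob_rows)]
    return -sum(p * math.log(p) for p in marginal if p > 0.0)


def adapt_step(weights, augmented_inputs, lr=0.1):
    """One gradient-descent step on the marginal entropy; returns new weights.

    Assumption: a single step with a fixed learning rate, chosen for illustration.
    """
    n = len(augmented_inputs)
    probs = [predict(weights, x) for x in augmented_inputs]
    marginal = [sum(col) / n for col in zip(*probs)]
    # Derivative of the entropy with respect to each marginal probability.
    g = [-(math.log(p) + 1.0) for p in marginal]
    grad = [[0.0] * len(row) for row in weights]
    for x, p in zip(augmented_inputs, probs):
        expected_g = sum(gc * pc for gc, pc in zip(g, p))
        for j, pj in enumerate(p):
            # Chain rule through the softmax: dH/d(logit_j) for this augmented copy.
            dlogit = pj * (g[j] - expected_g) / n
            for d, xd in enumerate(x):
                grad[j][d] += dlogit * xd
    return [[w - lr * gw for w, gw in zip(row, grow)]
            for row, grow in zip(weights, grad)]


if __name__ == "__main__":
    # Hypothetical 3-class, 2-feature model and four perturbed views of one test point.
    weights = [[0.5, -0.2], [-0.1, 0.3], [0.0, 0.1]]
    views = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [1.1, 2.1]]
    before = marginal_entropy([predict(weights, v) for v in views])
    adapted = adapt_step(weights, views)
    after = marginal_entropy([predict(adapted, v) for v in views])
    print("marginal entropy before:", before, "after:", after)
```

Note that the marginal entropy is low only when the per-augmentation predictions are both confident and in agreement: two fully confident predictions that disagree average to a uniform marginal, which has maximal entropy. This is the sense in which the objective encourages consistency across augmentations while maintaining confidence.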

updated: Mon Jan 24 2022 07:04:47 GMT+0000 (UTC)

published: Mon Oct 18 2021 17:55:11 GMT+0000 (UTC)

arXiv

