Accelerated Zeroth-Order and First-Order Momentum Methods from Mini to Minimax Optimization

Feihu Huang; Shangqian Gao; Jian Pei; Heng Huang

In this paper, we propose a class of accelerated zeroth-order and first-order momentum methods for both nonconvex mini-optimization and minimax-optimization. Specifically, we propose a new accelerated zeroth-order momentum (Acc-ZOM) method for solving stochastic mini-optimization problems. We prove that the Acc-ZOM method achieves a lower query complexity of O(d^{3/4} ϵ^{-3}) for finding an ϵ-stationary point, which improves the best known result by a factor of O(d^{1/4}), where d denotes the parameter dimension. In particular, Acc-ZOM does not need the large batches required by existing zeroth-order stochastic algorithms. We also propose an accelerated zeroth-order momentum descent ascent (Acc-ZOMDA) method for black-box minimax-optimization. We prove that the Acc-ZOMDA method reaches a lower query complexity of O((d_1+d_2)^{9/10} κ_y^3 ϵ^{-3}) for finding an ϵ-stationary point, which improves the best known result by a factor of O((d_1+d_2)^{1/10}), where d_1 and d_2 denote the dimensions of the optimization parameters and κ_y is the condition number. Moreover, we propose an accelerated first-order momentum descent ascent (Acc-MDA) method for solving white-box minimax problems, and prove that it achieves a lower gradient complexity of O(κ_y^{3-ν/2} ϵ^{-3}) with ν>0 for finding an ϵ-stationary point, which improves the best known result by a factor of O(κ_y^{ν/2}). Extensive experimental results on black-box adversarial attacks on deep neural networks (DNNs) and on poisoning attacks demonstrate the efficiency of our algorithms.
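For readers unfamiliar with the zeroth-order setting, the following is a minimal illustrative sketch of what "query complexity" counts: the optimizer sees only function values, builds a gradient estimate from them, and smooths the estimates with momentum. It is not the Acc-ZOM algorithm. The abstract does not specify the estimator or the momentum rule, so this sketch assumes a two-point random-direction estimator and a plain exponential moving average. The names `zo_gradient`, `zo_momentum_descent`, `mu`, `beta`, and `lr` are hypothetical.

```python
import numpy as np


def zo_gradient(f, x, mu=1e-4, rng=None):
    """Two-point random-direction gradient estimate (an assumed estimator).

    Uses two function queries, f(x + mu*u) and f(x), and no analytic gradient.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = x.size
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)  # random direction on the unit sphere
    # Scaling by d makes the estimate approximately unbiased for small mu.
    return d * (f(x + mu * u) - f(x)) / mu * u


def zo_momentum_descent(f, x0, steps=2000, lr=0.05, beta=0.9, mu=1e-4, seed=0):
    """Descent driven by a moving average of zeroth-order gradient estimates.

    Assumption: plain exponential-moving-average momentum, chosen for brevity;
    this is not the update rule analyzed in the paper.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    m = np.zeros_like(x)
    for _ in range(steps):
        g = zo_gradient(f, x, mu, rng)      # costs two queries of f
        m = beta * m + (1.0 - beta) * g     # momentum: average the noisy estimates
        x -= lr * m
    return x


if __name__ == "__main__":
    # Toy black-box objective with minimizer at the all-ones vector.
    def f(z):
        return float(np.sum((z - 1.0) ** 2))

    x_final = zo_momentum_descent(f, np.zeros(5))
    print(np.linalg.norm(x_final - 1.0))
```

Each iteration uses a single random direction rather than a large batch of directions, and the momentum average is what tames the resulting noise. The rates stated in the abstract apply to the authors' methods, not to this toy loop.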

updated: Mon Mar 01 2021 02:33:46 GMT+0000 (UTC)

published: Tue Aug 18 2020 22:19:29 GMT+0000 (UTC)

arXiv
