On the Adversarial Robustness of Mixture of Experts

Joan Puigcerver; Rodolphe Jenatton; Carlos Riquelme; Pranjal Awasthi; Srinadh Bhojanapalli

Adversarial robustness is a key desirable property of neural networks. It has been empirically shown to be affected by network size, with larger networks typically being more robust. Recently, Bubeck and Sellke proved a lower bound on the Lipschitz constant of functions that fit the training data, expressed in terms of their number of parameters. This raises an interesting open question: do (and can) functions with more parameters, but not necessarily more computational cost, have better robustness? We study this question for sparse Mixture of Experts models (MoEs), which make it possible to scale up the model size at a roughly constant computational cost. We show theoretically that, under certain conditions on the routing and the structure of the data, MoEs can have significantly smaller Lipschitz constants than their dense counterparts. However, the robustness of MoEs can suffer when the highest weighted experts for an input implement sufficiently different functions. We then empirically evaluate the robustness of MoEs on ImageNet using adversarial attacks and show that they are indeed more robust than dense models with the same computational cost. We also make key observations showing the robustness of MoEs to the choice of experts, highlighting the redundancy of experts in models trained in practice.
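
To make the "more parameters at roughly constant compute" premise concrete, the following is a minimal NumPy sketch of a sparse MoE layer with top-k routing. It is an illustration only, not the architecture or code used in the paper: the function names (make_moe, moe_forward, num_params, expert_macs_per_input), the two-layer ReLU experts, the linear router, and all dimensions are assumptions chosen for brevity.

```python
import numpy as np

def make_moe(num_experts, d_in, d_hidden, rng):
    """Random parameters for a toy MoE layer (hypothetical layout):
    one linear router and num_experts two-layer MLP experts."""
    return {
        "router": rng.normal(size=(d_in, num_experts)),
        "w1": rng.normal(size=(num_experts, d_in, d_hidden)) / np.sqrt(d_in),
        "w2": rng.normal(size=(num_experts, d_hidden, d_in)) / np.sqrt(d_hidden),
    }

def moe_forward(params, x, k=1):
    """Route a single input vector x of shape (d_in,) to its top-k experts."""
    logits = x @ params["router"]
    top = np.argsort(logits)[-k:]                 # indices of the k highest-scoring experts
    gates = np.exp(logits[top] - logits[top].max())
    gates /= gates.sum()                          # softmax over the selected experts only
    out = np.zeros_like(x)
    for g, e in zip(gates, top):
        h = np.maximum(x @ params["w1"][e], 0.0)  # ReLU hidden layer of expert e
        out += g * (h @ params["w2"][e])
    return out, top

def num_params(params):
    """Total number of scalar parameters in the layer."""
    return sum(p.size for p in params.values())

def expert_macs_per_input(params, k=1):
    """Multiply-accumulates spent inside experts for one input.
    Depends on k and the expert shape, not on the number of experts."""
    _, d_in, d_hidden = params["w1"].shape
    return k * 2 * d_in * d_hidden

rng = np.random.default_rng(0)
small = make_moe(num_experts=2, d_in=4, d_hidden=8, rng=rng)
large = make_moe(num_experts=8, d_in=4, d_hidden=8, rng=rng)

x = rng.normal(size=4)
y, chosen = moe_forward(large, x, k=1)

print(num_params(small), num_params(large))                        # parameters grow with experts
print(expert_macs_per_input(small), expert_macs_per_input(large))  # expert compute does not
```

In this sketch only the router cost grows with the number of experts, which is why the compute is "roughly" rather than exactly constant. The question the abstract poses is whether the extra parameters in the larger layer can buy robustness even though each input still passes through only k experts.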

updated: Wed Oct 19 2022 02:24:57 GMT+0000 (UTC)

published: Wed Oct 19 2022 02:24:57 GMT+0000 (UTC)

arXiv
