Scaling Up Influence Functions

Andrea Schioppa; Polina Zablotskaia; David Vilar; Artem Sokolov

We address the efficient calculation of influence functions for tracing predictions back to the training data. We propose and analyze a new approach to speeding up the inverse Hessian calculation based on Arnoldi iteration. With this improvement we achieve, to the best of our knowledge, the first successful implementation of influence functions that scales to full-size (language and vision) Transformer models with several hundred million parameters. We evaluate our approach on image classification and sequence-to-sequence tasks with tens of millions to a hundred million training examples. Our code will be available at https://github.com/google-research/jax-influence.
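To make the idea of an Arnoldi-based inverse Hessian calculation concrete, the following is a minimal NumPy sketch, not the authors' implementation and not the API of the jax-influence repository. It assumes a small explicit symmetric matrix `A` as a stand-in for the Hessian (a real model would supply a Hessian-vector product from automatic differentiation instead), and the function names `arnoldi` and `approx_inverse_hvp`, as well as the choice to keep the eigenvalues of largest magnitude, are illustrative assumptions.

```python
import numpy as np


def arnoldi(matvec, start, n_iters, tol=1e-10):
    """Arnoldi iteration: orthonormal Krylov basis Q (rows) and T = Q A Q^T."""
    dim = start.shape[0]
    Q = np.zeros((n_iters + 1, dim))
    T = np.zeros((n_iters + 1, n_iters))
    Q[0] = start / np.linalg.norm(start)
    m = n_iters
    for k in range(n_iters):
        w = matvec(Q[k])
        # Modified Gram-Schmidt against all previous basis vectors.
        for j in range(k + 1):
            T[j, k] = Q[j] @ w
            w = w - T[j, k] * Q[j]
        norm = np.linalg.norm(w)
        T[k + 1, k] = norm
        if norm < tol:
            # The Krylov subspace is exhausted; stop early.
            m = k + 1
            break
        Q[k + 1] = w / norm
    return Q[:m], T[:m, :m]


def approx_inverse_hvp(Q, T, v, top_k):
    """Approximate A^{-1} v using the top_k eigenpairs of the small matrix T."""
    # The Hessian is symmetric, so symmetrize T to remove rounding noise.
    evals, evecs = np.linalg.eigh((T + T.T) / 2.0)
    keep = np.argsort(-np.abs(evals))[:top_k]
    # Rows of G are approximate eigenvectors of A in the full space.
    G = evecs[:, keep].T @ Q
    return G.T @ ((G @ v) / evals[keep])


# Toy stand-in for a Hessian: a well-conditioned symmetric 6x6 matrix.
rng = np.random.default_rng(0)
B = rng.normal(size=(6, 6))
A = B @ B.T + 6.0 * np.eye(6)
v = rng.normal(size=6)

Q, T = arnoldi(lambda x: A @ x, rng.normal(size=6), n_iters=6)
approx = approx_inverse_hvp(Q, T, v, top_k=6)
exact = np.linalg.solve(A, v)
```

In this toy setting the Krylov subspace spans the whole space, so `approx` should agree with `exact` up to rounding; with fewer iterations or a smaller `top_k` the result is only a low-rank approximation, which is the trade-off that makes the approach cheap for large models.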

updated: Mon Dec 06 2021 13:54:08 GMT+0000 (UTC)

published: Mon Dec 06 2021 13:54:08 GMT+0000 (UTC)

arXiv
