Privacy-Preserving CNN Training with Transfer Learning

John Chiang

Privacy-preserving neural network inference has been well studied, while homomorphic CNN training remains an open and challenging task. In this paper, we present a practical solution for privacy-preserving CNN training based solely on Homomorphic Encryption (HE). To the best of our knowledge, this is the first successful attempt at this problem, and no prior work has achieved this goal. Several techniques are combined to accomplish it: (1) with transfer learning, privacy-preserving CNN training can be reduced to homomorphic neural network training, or even to multiclass logistic regression (MLR) training; (2) via a faster gradient variant called Quadratic Gradient, an enhanced gradient method for MLR with state-of-the-art convergence speed is applied to achieve high performance; (3) we employ the idea of transformation in mathematics to turn the problem of approximating the Softmax function in the encrypted domain into the well-studied problem of approximating the Sigmoid function, and we develop a new type of loss function to complement this change; and (4) we use a simple but flexible matrix-encoding method named Volley Revolver to manage the data flow in the ciphertexts, which is the key factor in completing the whole homomorphic CNN training. The complete, runnable C++ code implementing our work can be found at: https://github.com/petitioner/HE.CNNtraining. We select REGNET\_X\_400MF as the pre-trained model for transfer learning. We use the first 128 MNIST training images as training data and the whole MNIST testing dataset as testing data. The client only needs to upload 6 ciphertexts to the cloud, and it takes about 21 minutes to perform 2 iterations on a cloud with 64 vCPUs, resulting in a precision of 21.49%.
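
To make points (1) and (3) concrete, the following plaintext Python sketch shows what reducing the task to MLR with a polynomial Sigmoid can look like. It is an illustration under stated assumptions, not the authors' implementation: the toy 2-dimensional features stand in for the outputs of a frozen pre-trained model, the degree-3 Taylor polynomial stands in for whatever Sigmoid approximation the paper uses, and the update is the ordinary per-class logistic regression gradient step rather than the Quadratic Gradient method or the paper's new loss function. All function names are hypothetical, and no encryption is performed.

```python
import math

def sigmoid(x):
    """Exact sigmoid, used here only as a plaintext reference."""
    return 1.0 / (1.0 + math.exp(-x))

def poly_sigmoid(x):
    """Degree-3 Taylor expansion of the sigmoid around 0.

    Assumption: a stand-in approximation. It uses only additions and
    multiplications, which is what makes it HE-friendly, but it is
    accurate only for inputs close to 0.
    """
    return 0.5 + x / 4.0 - x ** 3 / 48.0

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train(features, labels, num_classes, lr=0.5, steps=10):
    """Train one weight vector per class with the polynomial sigmoid.

    Each class is treated as an independent logistic regression, so no
    Softmax (and no division by a data-dependent value) is needed.
    """
    dim = len(features[0])
    weights = [[0.0] * dim for _ in range(num_classes)]
    for _ in range(steps):
        for k in range(num_classes):
            grad = [0.0] * dim
            for x, label in zip(features, labels):
                target = 1.0 if label == k else 0.0
                residual = target - poly_sigmoid(dot(weights[k], x))
                for j in range(dim):
                    grad[j] += residual * x[j]
            for j in range(dim):
                # Averaging is a multiplication by a public constant.
                weights[k][j] += lr * grad[j] / len(features)
    return weights

def predict(weights, x):
    """Pick the class with the largest linear score (done in the clear)."""
    scores = [dot(w, x) for w in weights]
    return max(range(len(scores)), key=scores.__getitem__)

# Hypothetical toy features standing in for frozen-model outputs.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [0, 0, 1, 1]
weights = train(features, labels, num_classes=2)
print([predict(weights, x) for x in features])
```

The step count and learning rate are kept small on purpose: the cubic polynomial stops being a usable Sigmoid substitute once the linear scores grow beyond roughly 2 in absolute value, which is one reason the choice of approximation and its input range matter in the encrypted setting.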

updated: Fri Apr 07 2023 18:21:30 GMT+0000 (UTC)

published: Fri Apr 07 2023 18:21:30 GMT+0000 (UTC)

arXiv
