Privacy-Preserving Image Classification Using Vision Transformer

Zheng Qi; AprilPyone MaungMaung; Yuma Kinoshita; Hitoshi Kiya

In this paper, we propose a privacy-preserving image classification method based on the combined use of encrypted images and the vision transformer (ViT). The proposed method not only allows images without visual information to be applied to ViT models for both training and testing but also maintains high classification accuracy. Because ViT applies patch embedding and position embedding to image patches, this architecture is shown to reduce the influence of block-wise image transformation. In an experiment, the proposed method is demonstrated to outperform state-of-the-art privacy-preserving image classification methods in terms of classification accuracy and robustness against various attacks.
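To make the idea of a block-wise image transformation concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say which transformation is used, so the sketch assumes a key-seeded pixel shuffle in which every block is scrambled with the same secret permutation. The function name `blockwise_shuffle`, its parameters, and the choice of a block size equal to the ViT patch size are all hypothetical. Python's `random` module is used only for illustration and is not a cryptographically secure generator.

```python
import random


def blockwise_shuffle(image, block_size, key, inverse=False):
    """Shuffle pixel positions inside each block with one key-seeded permutation.

    `image` is a list of rows; each pixel may be a number or a tuple of channels.
    This is an illustrative assumption, not the transformation from the paper.
    """
    height, width = len(image), len(image[0])
    if height % block_size or width % block_size:
        raise ValueError("image sides must be multiples of block_size")

    # One secret permutation, derived from the key, shared by every block.
    n = block_size * block_size
    perm = list(range(n))
    random.Random(key).shuffle(perm)
    if inverse:
        # Invert the permutation so the same key restores the original image.
        inv = [0] * n
        for dst, src in enumerate(perm):
            inv[src] = dst
        perm = inv

    out = [list(row) for row in image]
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            # Flatten the block in row-major order, then write it back permuted.
            flat = [image[top + i][left + j]
                    for i in range(block_size) for j in range(block_size)]
            for k, src in enumerate(perm):
                out[top + k // block_size][left + k % block_size] = flat[src]
    return out


if __name__ == "__main__":
    img = [[r * 8 + c for c in range(8)] for r in range(8)]  # toy 8x8 image
    enc = blockwise_shuffle(img, block_size=4, key=123)
    dec = blockwise_shuffle(enc, block_size=4, key=123, inverse=True)
    print(dec == img)
```

Under this assumption, pixels never leave their block. If the block size matches the patch size, each ViT patch would hold the same set of pixel values before and after the transformation, only in a different order. That gives one intuition for why a learned patch embedding could absorb such a transformation.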

updated: Tue May 24 2022 12:51:48 GMT+0000 (UTC)

published: Tue May 24 2022 12:51:48 GMT+0000 (UTC)

arXiv
