Fine-tuning Vision Transformers for the Prediction of State Variables in Ising Models

Onur Kara; Arijit Sehanobish; Hector H Corzo


Transformers are state-of-the-art deep learning models composed of stacked attention layers and point-wise, fully connected layers, designed for handling sequential data. Transformers are not only ubiquitous throughout Natural Language Processing (NLP); they have also recently inspired a new wave of Computer Vision (CV) applications research. In this work, a Vision Transformer (ViT) is applied to predict the state variables of 2-dimensional Ising model simulations. Our experiments show that ViTs outperform state-of-the-art Convolutional Neural Networks (CNNs) when trained on a small number of microstate images from the Ising model corresponding to various boundary conditions and temperatures. This work opens the possibility of applying ViTs to other simulations, and raises interesting research directions on how attention maps can learn about the underlying physics governing different phenomena.
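For readers unfamiliar with the kind of input described above, the following is a minimal sketch of how a single 2D Ising microstate "image" might be generated with single-spin-flip Metropolis sampling. It is not the authors' simulation code: the lattice size, the periodic boundary condition, the coupling J = 1, the units with k_B = 1, and all function names are assumptions made for illustration only.

```python
import numpy as np


def metropolis_sweep(spins, temperature, rng):
    """One Metropolis sweep over an n x n lattice (assumed: periodic boundaries, J = 1, k_B = 1)."""
    n = spins.shape[0]
    for _ in range(n * n):
        i, j = rng.integers(0, n, size=2)
        # Sum of the four nearest neighbours, wrapping around the edges.
        neighbours = (
            spins[(i + 1) % n, j] + spins[(i - 1) % n, j]
            + spins[i, (j + 1) % n] + spins[i, (j - 1) % n]
        )
        # Energy change if spin (i, j) were flipped.
        delta_e = 2 * spins[i, j] * neighbours
        # Accept flips that lower the energy, otherwise accept with Boltzmann probability.
        if delta_e <= 0 or rng.random() < np.exp(-delta_e / temperature):
            spins[i, j] = -spins[i, j]
    return spins


def sample_microstate(size, temperature, sweeps, seed=0):
    """Return a size x size array of +1/-1 spins after the given number of sweeps."""
    rng = np.random.default_rng(seed)
    spins = rng.choice(np.array([-1, 1]), size=(size, size))
    for _ in range(sweeps):
        metropolis_sweep(spins, temperature, rng)
    return spins


def magnetization(spins):
    """Mean spin of the lattice."""
    return float(spins.mean())


def energy_per_spin(spins):
    """Energy per spin, counting each nearest-neighbour bond once (periodic boundaries)."""
    bonds = spins * (np.roll(spins, 1, axis=0) + np.roll(spins, 1, axis=1))
    return float(-bonds.mean())


if __name__ == "__main__":
    state = sample_microstate(size=16, temperature=2.0, sweeps=50)
    print(state.shape, magnetization(state), energy_per_spin(state))
```

Each array produced this way can be rendered as a two-colour image and paired with the temperature (and boundary condition) it was generated under, which is the sort of label an image model could then be trained to predict.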

updated: Tue Nov 30 2021 04:27:14 GMT+0000 (UTC)

published: Tue Sep 28 2021 00:23:31 GMT+0000 (UTC)

arXiv

