V1T: large-scale mouse V1 response prediction using a Vision Transformer

Bryan M. Li; Isabel M. Cornacchia; Nathalie L. Rochefort; Arno Onken

Accurate predictive models of neural responses in the visual cortex to natural visual stimuli remain a challenge in computational neuroscience. In this work, we introduce V1T, a novel Vision Transformer-based architecture that learns a shared visual and behavioral representation across animals. We evaluate our model on two large datasets recorded from mouse primary visual cortex and outperform previous convolution-based models by more than 12.7% in prediction performance. Moreover, we show that the self-attention weights learned by the Transformer correlate with the population receptive fields. Our model thus sets a new benchmark for neural response prediction and can be used jointly with behavioral and neural recordings to reveal meaningful characteristic features of the visual cortex.

updated: Tue Sep 05 2023 17:56:42 GMT+0000 (UTC)

published: Mon Feb 06 2023 18:58:38 GMT+0000 (UTC)

arXiv
