More than meets the eye: Self-supervised depth reconstruction from brain activity

Guy Gaziv; Michal Irani

In the past few years, deep-learning tools have enabled significant advances in reconstructing observed natural images from fMRI brain recordings. Here, for the first time, we show that dense 3D depth maps of observed 2D natural images can also be recovered directly from fMRI brain recordings. We use an off-the-shelf method to estimate the unknown depth maps of natural images. We apply it to both: (i) the small number of images presented to subjects in an fMRI scanner (images for which we have fMRI recordings, referred to as "paired" data), and (ii) a very large number of natural images with no fMRI recordings ("unpaired" data). The estimated depth maps are then used as an auxiliary reconstruction criterion to train for depth reconstruction directly from fMRI. We propose two main approaches: depth-only recovery and joint image-depth RGBD recovery. Because only a small amount of "paired" training data (images with fMRI) is available, we enrich the training data via self-supervised cycle-consistent training on a large amount of "unpaired" data (natural images and depth maps without fMRI). This is achieved using our newly defined and trained Depth-based Perceptual Similarity metric as a reconstruction criterion. We show that predicting the depth map directly from fMRI outperforms its indirect sequential recovery from the reconstructed images. We further show that activations from early cortical visual areas dominate our depth reconstruction results, and propose a means to characterize fMRI voxels by their degree of depth-information tuning. This work adds an important layer of decoded information, extending the current envelope of visual brain decoding capabilities.
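As a rough illustration of how a supervised term on "paired" data can be combined with a cycle-consistency term on "unpaired" data, the toy sketch below uses linear maps in NumPy. Everything in it is an assumption made for illustration: the names `E` (RGBD to fMRI), `D` (fMRI to RGBD), `combined_loss`, and `cycle_weight` are hypothetical, the data are random, and plain mean squared error stands in for the Depth-based Perceptual Similarity metric mentioned in the abstract. It is not the authors' architecture or training procedure.

```python
import numpy as np


def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((a - b) ** 2))


def combined_loss(E, D, rgbd_paired, fmri_paired, rgbd_unpaired, cycle_weight=1.0):
    """Toy objective: supervised decoding loss plus a cycle-consistency loss.

    E: (n_pixels, n_voxels) hypothetical linear encoder, RGBD -> fMRI.
    D: (n_voxels, n_pixels) hypothetical linear decoder, fMRI -> RGBD.
    """
    # Supervised term: decode recorded fMRI and compare with the shown RGBD image.
    supervised = mse(fmri_paired @ D, rgbd_paired)
    # Cycle term: RGBD -> simulated fMRI -> RGBD, which needs no fMRI recordings.
    cycle = mse((rgbd_unpaired @ E) @ D, rgbd_unpaired)
    return supervised + cycle_weight * cycle, supervised, cycle


rng = np.random.default_rng(0)
n_pixels = 4 * 8 * 8  # flattened 8x8 image with 4 channels (R, G, B, depth)
n_voxels = 32

E = rng.normal(scale=0.1, size=(n_pixels, n_voxels))
D = rng.normal(scale=0.1, size=(n_voxels, n_pixels))

rgbd_paired = rng.normal(size=(5, n_pixels))     # few images with fMRI
fmri_paired = rgbd_paired @ E                    # synthetic stand-in for recordings
rgbd_unpaired = rng.normal(size=(50, n_pixels))  # many images without fMRI

total, supervised, cycle = combined_loss(
    E, D, rgbd_paired, fmri_paired, rgbd_unpaired, cycle_weight=1.0
)
print(f"supervised={supervised:.4f} cycle={cycle:.4f} total={total:.4f}")
```

In this sketch the cycle term is the only part that touches the unpaired images, which is what lets a large unpaired set contribute to training the decoder when paired examples are scarce.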

updated: Wed Jun 09 2021 14:46:09 GMT+0000 (UTC)

published: Wed Jun 09 2021 14:46:09 GMT+0000 (UTC)

arXiv
