Visual Search Asymmetry: Deep Nets and Humans Share Similar Inherent Biases

Shashi Kant Gupta; Mengmi Zhang; Chia-Chien Wu; Jeremy M. Wolfe; Gabriel Kreiman

Visual search is a ubiquitous and often challenging daily task, exemplified by looking for the car keys at home or a friend in a crowd. An intriguing property of some classical search tasks is an asymmetry such that finding a target A among distractors B can be easier than finding B among A. To elucidate the mechanisms responsible for asymmetry in visual search, we propose a computational model that takes a target and a search image as inputs and produces a sequence of eye movements until the target is found. The model integrates eccentricity-dependent visual recognition with target-dependent top-down cues. We compared the model against human behavior in six paradigmatic search tasks that show asymmetry in humans. Without prior exposure to the stimuli or task-specific training, the model provides a plausible mechanism for search asymmetry. We hypothesized that the polarity of search asymmetry arises from experience with the natural environment. We tested this hypothesis by training the model on augmented versions of ImageNet where the biases of natural images were either removed or reversed. The polarity of search asymmetry disappeared or was altered depending on the training protocol. This study highlights how classical perceptual properties can emerge in neural network models, without the need for task-specific training, but rather as a consequence of the statistical properties of the developmental diet fed to the model. All source code and data are publicly available at https://github.com/kreimanlab/VisualSearchAsymmetry.
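The search loop described in the abstract (take a target and a search image, then produce fixations until the target is found) can be illustrated with a toy sketch. This is not the authors' implementation: the function names `top_down_map` and `search_scanpath`, the dot-product similarity, the square inhibition-of-return window, and the tiny hand-made feature grid are all assumptions made for illustration, and eccentricity-dependent recognition is not modeled here.

```python
import numpy as np

def top_down_map(features, target_vec):
    """Hypothetical top-down cue: similarity of each location's features to the target.

    features: array of shape (H, W, C); target_vec: array of shape (C,).
    """
    return np.tensordot(features, target_vec, axes=([2], [0]))

def search_scanpath(attention_map, target_mask, ior_radius=1, max_fixations=20):
    """Greedy fixation loop (an assumed simplification).

    Repeatedly fixate the attention peak, stop if the target is there,
    otherwise suppress the visited neighborhood (inhibition of return).
    """
    attn = np.asarray(attention_map, dtype=float).copy()
    fixations = []
    for _ in range(max_fixations):
        r, c = np.unravel_index(np.argmax(attn), attn.shape)
        fixations.append((int(r), int(c)))
        if target_mask[r, c]:
            return fixations, True
        # Suppress a square window around the rejected location.
        r0, r1 = max(r - ior_radius, 0), r + ior_radius + 1
        c0, c1 = max(c - ior_radius, 0), c + ior_radius + 1
        attn[r0:r1, c0:c1] = -np.inf
    return fixations, False

# Toy scene: a distractor at (0, 0) resembles the target slightly more
# than the true target at (4, 4) does, so it attracts the first fixation.
features = np.zeros((5, 5, 2))
features[0, 0] = [0.9, 0.0]
features[4, 4] = [0.8, 0.0]
target_vec = np.array([1.0, 0.0])
target_mask = np.zeros((5, 5), dtype=bool)
target_mask[4, 4] = True

attention = top_down_map(features, target_vec)
fixations, found = search_scanpath(attention, target_mask)
print(fixations, found)
```

In this toy setting, the number of fixations needed to reach the target plays the role of search difficulty; comparing that count when the roles of target and distractor are swapped is one way to think about an asymmetry.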

updated: Sun Nov 07 2021 03:20:34 GMT+0000 (UTC)

published: Sat Jun 05 2021 19:46:42 GMT+0000 (UTC)

arXiv
