Crowding in humans is unlike that in convolutional neural networks

Ben Lonnqvist; Alasdair D. F. Clarke; Ramakrishna Chakravarthi

Object recognition is a primary function of the human visual system. It has recently been claimed that the success of an emerging class of computer vision systems, Deep Convolutional Neural Networks (DCNNs), at recognising objects makes them a useful guide to recognition in humans. To test this assertion, we systematically evaluated visual crowding, a dramatic breakdown of recognition in clutter, in DCNNs and compared their performance to extant research in humans. We examined crowding in three DCNN architectures using the same methodology as that used with human observers. We manipulated multiple stimulus factors, including inter-letter spacing, letter colour, size, and flanker location, to assess the extent and shape of crowding in DCNNs. We found that crowding followed a predictable pattern across architectures, and that this pattern differed from the one seen in humans. Some characteristic hallmarks of human crowding, such as invariance to size, the effect of target-flanker similarity, and confusions between target and flanker identities, were completely missing, minimised, or even reversed. These data show that DCNNs, while proficient in object recognition, likely achieve this competence through a set of mechanisms that are distinct from those in humans. They are not necessarily equivalent models of human or primate object recognition, and caution must be exercised when inferring mechanisms from their operation.

updated: Mon Nov 25 2019 12:43:09 GMT+0000 (UTC)

published: Fri Mar 01 2019 12:03:19 GMT+0000 (UTC)

arXiv

