Convolutional Neural Networks Trained to Identify Words Provide a Good Account of Visual Form Priming Effects

Dong Yin; Valerio Biscione; Jeffrey Bowers

A wide variety of orthographic coding schemes and models of visual word identification have been developed to account for masked priming data that provide a measure of orthographic similarity between letter strings. These models tend to include hand-coded orthographic representations with single-unit coding for specific forms of knowledge (e.g., units coding for a letter in a given position). Here we assess how well a range of these coding schemes and models account for the pattern of form priming effects taken from the Form Priming Project, and we compare these findings to results observed with 11 standard deep neural network models (DNNs) developed in computer science. We find that deep convolutional networks (CNNs) perform as well as or better than the coding schemes and word recognition models, whereas transformer networks do less well. The success of CNNs is remarkable because their architectures were not developed to support word recognition (they were designed to perform well on object recognition), they classify pixel images of words (rather than artificial encodings of letter strings), and their training was highly simplified (not respecting many key aspects of human experience). In addition to these form priming effects, we find that the DNNs can account for visual similarity effects on priming that lie beyond the scope of all current psychological models of priming. The findings add to the recent work of Hannagan et al. (2021) and suggest that CNNs should be given more attention in psychology as models of human visual word recognition.

updated: Thu Mar 02 2023 18:29:24 GMT+0000 (UTC)

published: Wed Feb 08 2023 11:01:19 GMT+0000 (UTC)

arXiv
