Gaze Perception in Humans and CNN-Based Model

Nicole X. Han; William Yang Wang; Miguel P. Eckstein

Making accurate inferences about other individuals' locus of attention is essential for human social interactions and will be important for AI to effectively interact with humans. In this study, we compare how a CNN (convolutional neural network) based model of gaze and humans infer the locus of attention in images of real-world scenes with a number of individuals looking at a common location. We show that compared to the model, humans' estimates of the locus of attention are more influenced by the context of the scene, such as the presence of the attended target and the number of individuals in the image.

updated: Sat Apr 17 2021 04:52:46 GMT+0000 (UTC)

published: Sat Apr 17 2021 04:52:46 GMT+0000 (UTC)

arXiv
