AGENT: A Benchmark for Core Psychological Reasoning

Tianmin Shu; Abhishek Bhandwaldar; Chuang Gan; Kevin A. Smith; Shari Liu; Dan Gutfreund; Elizabeth Spelke; Joshua B. Tenenbaum; Tomer D. Ullman

For machine agents to successfully interact with humans in real-world settings, they will need to develop an understanding of human mental life. Intuitive psychology, the ability to reason about hidden mental variables that drive observable actions, comes naturally to people: even pre-verbal infants can tell agents from objects, expecting agents to act efficiently to achieve goals given constraints. Despite recent interest in machine agents that reason about other agents, it is unclear whether such agents learn or hold the core psychology principles that drive human reasoning. Inspired by cognitive development studies on intuitive psychology, we present a benchmark consisting of a large dataset of procedurally generated 3D animations, AGENT (Action, Goal, Efficiency, coNstraint, uTility), structured around four scenarios (goal preferences, action efficiency, unobserved constraints, and cost-reward trade-offs) that probe key concepts of core intuitive psychology. We validate AGENT with human ratings, propose an evaluation protocol emphasizing generalization, and compare two strong baselines built on Bayesian inverse planning and a Theory of Mind neural network. Our results suggest that to pass the designed tests of core intuitive psychology at human levels, a model must acquire or have built-in representations of how agents plan, combining utility computations and core knowledge of objects and physics.
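As an illustration of the general idea behind Bayesian inverse planning mentioned in the abstract, the following is a minimal toy sketch and not the paper's baseline. It assumes a one-dimensional world, a softmax (noisily rational) action model, and a uniform prior over goals; the function names, the `beta` rationality parameter, and the goal positions are all hypothetical choices made for this example.

```python
import math

def action_likelihood(state, action, goal, beta=2.0, actions=(-1, 0, 1)):
    """P(action | state, goal): softmax over negative distance to the goal.

    Assumption: the agent prefers actions that bring it closer to its goal,
    with `beta` controlling how strictly it does so.
    """
    scores = [math.exp(-beta * abs(state + a - goal)) for a in actions]
    chosen = math.exp(-beta * abs(state + action - goal))
    return chosen / sum(scores)

def goal_posterior(trajectory, goals, prior=None, beta=2.0):
    """Posterior over candidate goals given observed (state, action) pairs."""
    if prior is None:
        prior = {g: 1.0 / len(goals) for g in goals}  # uniform prior
    unnormalized = {}
    for g in goals:
        likelihood = 1.0
        for state, action in trajectory:
            likelihood *= action_likelihood(state, action, g, beta)
        unnormalized[g] = prior[g] * likelihood
    total = sum(unnormalized.values())
    return {g: p / total for g, p in unnormalized.items()}

# Toy observation: the agent starts at 0 and steps right twice,
# with hypothetical candidate goals at positions -3 and 3.
trajectory = [(0, 1), (1, 1)]
posterior = goal_posterior(trajectory, goals=[-3, 3])
print(posterior)
```

In this toy setting the posterior concentrates on the goal at position 3, since moving right is far more likely under that goal than under the alternative. A model of this kind rates a later test trajectory as unexpected when it has low likelihood under the inferred goal.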

updated: Mon Jul 26 2021 03:13:11 GMT+0000 (UTC)

published: Wed Feb 24 2021 14:58:23 GMT+0000 (UTC)

arXiv
