Machine versus Human Attention in Deep Reinforcement Learning Tasks

Ruohan Zhang; Sihang Guo; Bo Liu; Yifeng Zhu; Mary Hayhoe; Dana Ballard; Peter Stone

Deep reinforcement learning (RL) algorithms are powerful tools for solving visuomotor decision tasks. However, the trained models are often difficult to interpret because they are represented as end-to-end deep neural networks. In this paper, we shed light on the inner workings of such trained models by analyzing the pixels that they attend to during task execution and comparing them with the pixels attended to by humans executing the same tasks. To this end, we investigate two questions that, to the best of our knowledge, have not been previously studied: (1) How similar are the visual features learned by RL agents and humans when performing the same task? (2) How do similarities and differences in these learned features explain RL agents' performance on these tasks? Specifically, we compare the saliency maps of RL agents against visual attention models of human experts when learning to play Atari games. Further, we analyze how hyperparameters of the deep RL algorithm affect the learned features and saliency maps of the trained agents. The insights provided by our results have the potential to inform novel algorithms for closing the performance gap between human experts and deep RL agents.
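To make the idea of comparing an agent's saliency map with a human attention map concrete, the following is a minimal sketch in Python with NumPy. The abstract does not say which similarity measures the authors use; Pearson correlation and KL divergence are assumed here purely for illustration, as are the function names, the 84x84 frame size, and the synthetic Gaussian maps standing in for real data.

```python
import numpy as np


def normalize_map(saliency, eps=1e-12):
    """Shift a map to be non-negative and scale it to sum to 1."""
    s = np.asarray(saliency, dtype=np.float64)
    s = s - s.min()
    total = s.sum()
    if total < eps:
        # A flat map carries no preference: fall back to uniform.
        return np.full(s.shape, 1.0 / s.size)
    return s / total


def pearson_cc(a, b):
    """Pearson correlation between two maps of the same shape."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) after normalizing both maps to distributions."""
    p = normalize_map(p)
    q = normalize_map(q)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))


def gaussian_blob(shape, center, sigma):
    """Synthetic attention map: an isotropic Gaussian bump (hypothetical data)."""
    h, w = shape
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = center
    return np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2.0 * sigma ** 2))


if __name__ == "__main__":
    shape = (84, 84)  # assumed frame size, not taken from the source
    human = gaussian_blob(shape, center=(40, 40), sigma=6.0)
    agent_near = gaussian_blob(shape, center=(42, 41), sigma=6.0)
    agent_far = gaussian_blob(shape, center=(10, 70), sigma=6.0)

    print("CC near:", pearson_cc(human, agent_near))
    print("CC far: ", pearson_cc(human, agent_far))
    print("KL near:", kl_divergence(human, agent_near))
    print("KL far: ", kl_divergence(human, agent_far))
```

Under these assumptions, a higher correlation and a lower KL divergence would indicate that the agent attends to roughly the same screen regions as the human. The two measures can disagree, since KL divergence is asymmetric and penalizes regions where the human looks but the agent does not.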

updated: Wed Feb 10 2021 21:25:41 GMT+0000 (UTC)

published: Thu Oct 29 2020 20:58:45 GMT+0000 (UTC)

arXiv
