Interpretable Visual Reasoning via Induced Symbolic Space

Zhonghao Wang; Kai Wang; Mo Yu; Jinjun Xiong; Wen-mei Hwu; Mark Hasegawa-Johnson; Humphrey Shi

We study the problem of concept induction in visual reasoning, i.e., identifying concepts and their hierarchical relationships from question-answer pairs associated with images, and we achieve an interpretable model by working in the induced symbolic concept space. To this end, we first design a new framework, the object-centric compositional attention model (OCCAM), to perform the visual reasoning task with object-level visual features. We then devise a method to induce concepts of objects and relations using clues from the attention patterns between objects' visual features and question words. Finally, we achieve a higher level of interpretability by applying OCCAM to objects represented in the induced symbolic concept space. Our model design makes this adaptation easy: it first predicts the concepts of objects and relations and then projects the predicted concepts back to the visual feature space, so that the compositional reasoning module can operate as usual. Experiments on the CLEVR and GQA datasets demonstrate that 1) OCCAM achieves a new state of the art without human-annotated functional programs; and 2) our induced concepts are both accurate and sufficient, as OCCAM achieves on-par performance on objects represented either in visual features or in the induced symbolic concept space.
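The predict-then-project step described in the abstract can be pictured with a toy sketch. This is only an illustration under stated assumptions, not OCCAM's actual implementation: the function names, the linear concept classifier, the per-concept embedding table, and all numbers are hypothetical and hand-set rather than learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_concepts(feature, classifier):
    """Score one object's visual feature against each concept (dot product),
    then normalize the scores into a distribution over concepts.
    `classifier` is a hypothetical weight matrix, one row per concept."""
    scores = [sum(w * x for w, x in zip(row, feature)) for row in classifier]
    return softmax(scores)


def project_to_feature_space(concept_probs, concept_embeddings):
    """Map a concept distribution back to a feature-sized vector as the
    probability-weighted sum of (hypothetical) per-concept embeddings."""
    dim = len(concept_embeddings[0])
    return [
        sum(p * emb[d] for p, emb in zip(concept_probs, concept_embeddings))
        for d in range(dim)
    ]


# Toy setup: three concepts, four-dimensional object features (all assumed).
CONCEPTS = ["red", "blue", "green"]
classifier = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
embeddings = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

feature = [4.0, 0.5, 0.2, 0.9]                     # one object's visual feature
probs = predict_concepts(feature, classifier)      # symbolic representation
symbolic = CONCEPTS[probs.index(max(probs))]       # most likely concept label
projected = project_to_feature_space(probs, embeddings)  # back to feature size

print(symbolic, len(projected))
```

The point of the sketch is the interface: the projected vector has the same dimensionality as the original visual feature, so a downstream reasoning module that expects feature-sized inputs would not need to change when objects are represented symbolically.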

updated: Tue Aug 24 2021 13:55:14 GMT+0000 (UTC)

published: Mon Nov 23 2020 18:21:49 GMT+0000 (UTC)

arXiv
