A Neuro-Symbolic ASP Pipeline for Visual Question Answering

Thomas Eiter; Nelson Higuera; Johannes Oetsch; Michael Pritz



We present a neuro-symbolic visual question answering (VQA) pipeline for CLEVR, a well-known dataset that consists of pictures showing scenes with objects and questions related to them. Our pipeline covers (i) training neural networks for object classification and bounding-box prediction of the CLEVR scenes, (ii) statistical analysis of the distribution of prediction values of the neural networks to determine a threshold for high-confidence predictions, and (iii) a translation of CLEVR questions and of network predictions that pass the confidence thresholds into logic programs, so that the answers can be computed using an ASP solver. By exploiting choice rules, we consider deterministic and non-deterministic scene encodings. Our experiments show that, in comparison with the deterministic approach, the non-deterministic scene encoding achieves good results even if the neural networks are trained rather poorly. This is important for building robust VQA systems when network predictions are less than perfect. Furthermore, we show that restricting non-determinism to reasonable choices allows for more efficient implementations in comparison with related neuro-symbolic approaches without losing much accuracy. This work is under consideration for acceptance in TPLP.
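
As an illustration of steps (ii) and (iii), the following Python sketch shows one way network predictions could be filtered by a confidence threshold and turned into ASP rules. It is not the authors' implementation: the threshold formula (mean minus k standard deviations), the predicate name `obj`, and the function names `confidence_threshold` and `encode_object` are all assumptions made for this example. The deterministic encoding keeps only the top-scoring class as a fact, while the non-deterministic encoding emits a choice rule over every class that passes the threshold.

```python
from statistics import mean, stdev


def confidence_threshold(scores, k=1.0):
    """Hypothetical threshold: mean minus k standard deviations, clipped to [0, 1].

    The abstract only says a threshold is derived from a statistical analysis;
    this particular formula is an assumption.
    """
    return min(1.0, max(0.0, mean(scores) - k * stdev(scores)))


def encode_object(obj_id, predictions, threshold, deterministic=False):
    """Return one ASP rule (as a string) for a detected object.

    predictions: dict mapping a class label to its confidence.
    The predicate name obj/2 is hypothetical.
    """
    best = max(predictions, key=predictions.get)
    if deterministic:
        # Deterministic encoding: commit to the single best prediction.
        candidates = [best]
    else:
        # Non-deterministic encoding: keep every class passing the threshold,
        # falling back to the best one if none passes.
        candidates = sorted(c for c, p in predictions.items() if p >= threshold)
        if not candidates:
            candidates = [best]
    if len(candidates) == 1:
        return f"obj({obj_id},{candidates[0]})."
    atoms = "; ".join(f"obj({obj_id},{c})" for c in candidates)
    # Choice rule: exactly one of the candidate classes holds in each answer set.
    return f"1 {{ {atoms} }} 1."


if __name__ == "__main__":
    preds = {"cube": 0.55, "cylinder": 0.40, "sphere": 0.05}
    print(encode_object(0, preds, threshold=0.3, deterministic=True))
    print(encode_object(0, preds, threshold=0.3))
```

The resulting rules would be combined with an ASP encoding of the question and handed to an ASP solver; restricting the choice rule to classes above the threshold is what keeps the number of candidate answer sets small.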

updated: Mon May 16 2022 09:50:37 GMT+0000 (UTC)

published: Mon May 16 2022 09:50:37 GMT+0000 (UTC)

arXiv
