Weakly Supervised Visual Question Answer Generation

Charani Alampalle; Shamanthak Hegde; Soumya Jahagirdar; Shankar Gangisetty

Growing interest in conversational agents has promoted two-way human-computer communication, and asking and answering visual questions has become an active area of research in AI. Generating visual question-answer pairs is therefore an important and challenging task. To address it, we propose a weakly supervised visual question answer generation method that generates relevant question-answer pairs for a given input image and its associated caption. Most prior works are supervised and depend on annotated question-answer datasets. In our work, we present a weakly supervised method that procedurally synthesizes question-answer pairs from visual information and captions. The proposed method first extracts a list of answer words, then performs nearest question generation, which uses the caption and an answer word to generate a synthetic question. Next, the relevant question generator converts the nearest question into a relevant language question by dependency parsing and in-order tree traversal. Finally, we fine-tune a ViLBERT model with the generated question-answer pairs. We perform an exhaustive experimental analysis on the VQA dataset and find that our model significantly outperforms state-of-the-art (SOTA) methods on BLEU scores. We also report results with respect to baseline models and an ablation study.
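The abstract mentions in-order traversal of a dependency tree but does not spell out the procedure. The following is a minimal sketch of that one step only, under stated assumptions: the `Node` class, the hand-built toy tree, and the example question are all hypothetical and are not taken from the paper. It relies on the general fact that visiting left dependents, then the head, then right dependents of a projective dependency tree yields the words in surface order.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One token in a hand-built (hypothetical) dependency tree."""
    word: str
    index: int  # position of the token in the source sentence
    children: List["Node"] = field(default_factory=list)


def in_order(node: Node) -> List[str]:
    """Visit left dependents, then the head word, then right dependents."""
    ordered = sorted(node.children, key=lambda c: c.index)
    left = [c for c in ordered if c.index < node.index]
    right = [c for c in ordered if c.index > node.index]

    words: List[str] = []
    for child in left:
        words.extend(in_order(child))
    words.append(node.word)
    for child in right:
        words.extend(in_order(child))
    return words


# Toy tree (an assumption for illustration): "riding" is the root,
# with dependents "what", "is", and "man"; "the" depends on "man".
the = Node("the", 2)
man = Node("man", 3, [the])
what = Node("what", 0)
is_ = Node("is", 1)
riding = Node("riding", 4, [man, what, is_])

# Children are deliberately listed out of order above; the traversal
# sorts them by token position before visiting.
question = " ".join(in_order(riding)) + "?"
print(question)
```

How the authors build or rearrange the tree before traversal is not stated in the abstract, so this sketch should be read as an illustration of the traversal itself rather than of their question-conversion step.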

updated: Mon Sep 11 2023 07:11:14 GMT+0000 (UTC)

published: Sun Jun 11 2023 08:46:42 GMT+0000 (UTC)

arXiv
