TabIQA: Table Questions Answering on Business Document Images

Phuc Nguyen; Nam Tuan Ly; Hideaki Takeda; Atsuhiro Takasu

Answering table questions from business documents poses many challenges: it requires understanding tabular structures, cross-document referencing, and numeric computations beyond simple search queries. This paper introduces a novel pipeline, named TabIQA, to answer questions about business document images. TabIQA combines state-of-the-art deep learning techniques 1) to extract table content and structural information from images and 2) to answer various questions related to numerical data, text-based information, and complex queries from structured tables. The evaluation results on the VQAonBD 2023 dataset demonstrate the effectiveness of TabIQA in achieving promising performance in answering table-related questions. The TabIQA repository is available at https://github.com/phucty/itabqa.
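To make the distinction between a simple search query and a numeric computation concrete, here is a minimal sketch in plain Python. It is not the TabIQA implementation: the table representation, the helper names (`to_number`, `lookup`, `column_sum`), and the sample values are all assumptions made for illustration, standing in for whatever the extraction step would produce from an image.

```python
from typing import Dict, List

# Hypothetical output of a table-extraction step: one dict per data row,
# keyed by column header. All values below are invented for illustration.
Table = List[Dict[str, str]]


def to_number(cell: str) -> float:
    """Parse a cell such as '1,200' or '(300)' (accounting negative) into a float."""
    text = cell.strip().replace(",", "").replace("$", "")
    if text.startswith("(") and text.endswith(")"):
        return -float(text[1:-1])
    return float(text)


def lookup(table: Table, key_column: str, row_label: str, column: str) -> str:
    """Simple search query: return the raw cell at a given row label and column."""
    for row in table:
        if row[key_column] == row_label:
            return row[column]
    raise KeyError(row_label)


def column_sum(table: Table, column: str) -> float:
    """Numeric question: aggregate a whole column after parsing each cell."""
    return sum(to_number(row[column]) for row in table)


table: Table = [
    {"Item": "Revenue", "2021": "1,200", "2022": "1,500"},
    {"Item": "Costs", "2021": "(300)", "2022": "(450)"},
]

revenue_2022 = lookup(table, "Item", "Revenue", "2022")
net_2022 = column_sum(table, "2022")
```

The lookup returns the cell text unchanged, while the sum has to normalize thousands separators and parenthesized negatives first, which is the kind of extra step a pure retrieval approach would miss.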

updated: Mon Mar 27 2023 06:31:21 GMT+0000 (UTC)

published: Mon Mar 27 2023 06:31:21 GMT+0000 (UTC)

arXiv
