Workshop on Document Intelligence Understanding

Soyeon Caren Han; Yihao Ding; Siwen Luo; Josiah Poon; HeeGuen Yoon; Zhe Huang; Paul Duuring; Eun Jung Holden

Document understanding and information extraction comprise a range of tasks for understanding a document and automatically extracting valuable information from it. Recently, demand for document understanding has risen across domains including business, law, and medicine, to improve the efficiency of work that involves large numbers of documents. This workshop aims to bring together researchers and industry developers in the field of document intelligence to understand diverse document types and to advance automatic document processing and understanding techniques. We also released a data challenge on the recently introduced document-level VQA dataset, PDFVQA. The PDFVQA challenge examines the structural and contextual understanding of proposed models at the natural full-document level, across multiple consecutive pages, by including questions with a sequence of answers extracted from multiple pages of the full document. This task helps advance document understanding from the single-page level to the full-document level.

updated: Mon Jul 31 2023 02:14:25 GMT+0000 (UTC)

published: Mon Jul 31 2023 02:14:25 GMT+0000 (UTC)

arXiv
