DavarOCR: A Toolbox for OCR and Multi-Modal Document Understanding

Liang Qiao; Hui Jiang; Ying Chen; Can Li; Pengfei Li; Zaisheng Li; Baorui Zou; Dashan Guo; Yingda Xu; Yunlu Xu; Zhanzhan Cheng; Yi Niu

This paper presents DavarOCR, an open-source toolbox for OCR and document understanding tasks. DavarOCR currently implements 19 advanced algorithms covering 9 different task forms, and provides detailed usage instructions and trained models for each algorithm. Compared with previous open-source OCR toolboxes, DavarOCR offers relatively more complete support for the sub-tasks of cutting-edge document understanding. To promote the development and application of OCR technology in academia and industry, we place particular emphasis on modules that can be shared across different sub-domains of the technology. DavarOCR is publicly released at https://github.com/hikopensource/Davar-Lab-OCR.

updated: Thu Jul 14 2022 06:54:47 GMT+0000 (UTC)

published: Thu Jul 14 2022 06:54:47 GMT+0000 (UTC)

arXiv
