Transformer based Urdu Handwritten Text Optical Character Reader

Mohammad Daniyal Shaiq; Musa Dildar Ahmed Cheema; Ali Kamal

Extracting handwritten text is one of the most important components of digitizing information and making it available at large scale. Handwriting Optical Character Reader (OCR) is a research problem in computer vision and natural language processing, and a lot of work has been done for English, but unfortunately very little has been done for low-resource languages such as Urdu. Urdu script is very difficult because of its cursive nature and because the shape of each character changes with its relative position; a model is therefore needed that can understand complex features and generalize to every kind of handwriting style. In this work, we propose a transformer-based Urdu handwritten text extraction model. As transformers have been very successful in natural language understanding tasks, we explore them further to understand complex Urdu handwriting.

updated: Thu Jun 09 2022 15:43:35 GMT+0000 (UTC)

published: Thu Jun 09 2022 15:43:35 GMT+0000 (UTC)

arXiv
