Continuous Offline Handwriting Recognition using Deep Learning Models

Jorge Sueiras

Handwritten text recognition is an open problem of great interest in the area of automatic document image analysis. Transcribing the handwritten content of digitized documents is important for analyzing historical archives and for digitizing information from handwritten documents, forms, and communications. In recent years, great advances have been made in this area through the application of deep learning techniques. This thesis addresses the offline continuous handwritten text recognition (HTR) problem, which consists of developing algorithms and models capable of transcribing the text present in an image without the need for the text to be segmented into characters. For this purpose, we propose a new recognition model that integrates two types of deep learning architectures: convolutional neural networks (CNN) and sequence-to-sequence (seq2seq) models. The convolutional component of the model identifies relevant features present in characters, and the seq2seq component builds the transcription by modeling the sequential nature of the text. To design this new model, we carried out an extensive analysis of the capabilities of different convolutional architectures on the simplified problem of isolated character recognition, in order to identify the ones most suitable for integration into the continuous model. Additionally, we experimented extensively with the proposed model on the continuous problem to determine its robustness to changes in parameterization. The generalization capacity of the model has also been validated by evaluating it on three handwritten text databases in different languages: IAM in English, RIMES in French, and Osborne in Spanish. The proposed model achieves results competitive with those of other well-established methodologies.
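The division of labor between the two components can be illustrated with a small, untrained sketch. It is not the architecture proposed in the thesis: it assumes a single convolutional layer, plain recurrent cells in the encoder and decoder, greedy decoding, and random weights, and every name and size in it (`ALPHABET`, `HEIGHT`, `KERNEL`, `N_FILTERS`, `HIDDEN`, `make_params`, `conv_features`, `transcribe`) is hypothetical. It only shows how an image can be turned into a left-to-right sequence of feature vectors and then into characters without segmenting the image into characters first.

```python
import math
import random

ALPHABET = "abcdefgh"      # hypothetical toy alphabet
EOS = len(ALPHABET)        # extra index, used as both start and end-of-sequence symbol
HEIGHT, KERNEL = 4, 3      # assumed image height and filter width, in pixels
N_FILTERS, HIDDEN = 6, 8   # assumed number of filters and recurrent state size

def rand_matrix(rng, rows, cols):
    """Untrained weights: small random values, for illustration only."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def make_params(seed=0):
    rng = random.Random(seed)
    n_sym = len(ALPHABET) + 1
    return {
        "filters": rand_matrix(rng, N_FILTERS, HEIGHT * KERNEL),
        "enc_in": rand_matrix(rng, HIDDEN, N_FILTERS),
        "enc_h": rand_matrix(rng, HIDDEN, HIDDEN),
        "dec_in": rand_matrix(rng, HIDDEN, n_sym),
        "dec_h": rand_matrix(rng, HIDDEN, HIDDEN),
        "out": rand_matrix(rng, n_sym, HIDDEN),
    }

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def conv_features(image, filters):
    """Convolutional part: slide each filter along the image width.

    Returns one ReLU-activated feature vector per horizontal position,
    so the image becomes a sequence without any character segmentation.
    """
    width = len(image[0])
    sequence = []
    for x in range(width - KERNEL + 1):
        patch = [image[y][x + dx] for y in range(HEIGHT) for dx in range(KERNEL)]
        sequence.append([max(0.0, a) for a in matvec(filters, patch)])
    return sequence

def rnn_step(w_in, w_h, x, h):
    """One step of a plain tanh recurrent cell."""
    return [math.tanh(a + b) for a, b in zip(matvec(w_in, x), matvec(w_h, h))]

def transcribe(image, params, max_len=10):
    """Seq2seq part: encode the feature sequence, then decode greedily."""
    h = [0.0] * HIDDEN
    for feature in conv_features(image, params["filters"]):
        h = rnn_step(params["enc_in"], params["enc_h"], feature, h)
    chars, prev = [], EOS
    for _ in range(max_len):
        one_hot = [1.0 if i == prev else 0.0 for i in range(len(ALPHABET) + 1)]
        h = rnn_step(params["dec_in"], params["dec_h"], one_hot, h)
        scores = matvec(params["out"], h)
        prev = max(range(len(scores)), key=scores.__getitem__)
        if prev == EOS:
            break
        chars.append(ALPHABET[prev])
    return "".join(chars)

# Usage: a random 4 x 12 "image"; the output is meaningless until the weights are trained.
pixel_rng = random.Random(1)
image = [[pixel_rng.random() for _ in range(12)] for _ in range(HEIGHT)]
print(repr(transcribe(image, make_params())))
```

In a trained system the weights would be learned jointly from pairs of images and transcriptions, and the convolutional stage would typically be much deeper than the single layer assumed here.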

updated: Sun Dec 26 2021 07:31:03 GMT+0000 (UTC)

published: Sun Dec 26 2021 07:31:03 GMT+0000 (UTC)

arXiv
