HCR-Net: A deep learning based script independent handwritten character recognition network

Vinod Kumar Chauhan; Sukhdeep Singh; Anuj Sharma

Handwritten character recognition (HCR) is a challenging learning problem in pattern recognition, mainly because of structural similarity between characters, varied handwriting styles, noisy datasets, and the large variety of languages and scripts. The HCR problem has been studied extensively for a few decades, but there is very limited research on script-independent models. Contributing factors include the diversity of scripts; the focus of most conventional research on handcrafted feature extraction techniques, which are language/script specific and not always available; and the unavailability of public datasets and code to reproduce results. Deep learning, by contrast, has had huge success in different areas of pattern recognition, including HCR, and provides end-to-end learning, i.e., automated feature extraction and recognition. In this paper, we propose HCR-Net, a novel deep learning architecture that exploits transfer learning and image augmentation for end-to-end, script-independent handwritten character recognition. The network is based on a novel transfer learning approach for HCR in which some of the lower layers of a pre-trained VGG16 network are utilised. Owing to transfer learning and image augmentation, HCR-Net provides faster training, better performance and better generalisation. Experimental results on publicly available datasets of Bangla, Punjabi, Hindi, English, Swedish, Urdu, Farsi, Tibetan, Kannada, Malayalam, Telugu, Marathi, Nepali and Arabic languages prove the efficacy of HCR-Net and establish several new benchmarks. For reproducibility of the results and to advance HCR research, the complete code is publicly released at https://github.com/jmdvinodjmd/HCR-Net.
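
The abstract names image augmentation as one of the two ingredients of HCR-Net but does not say which transformations are applied. As a rough illustration of the general idea only, the sketch below assumes a simple random translation of a grayscale character image. The helper names `shift_image` and `augment`, the list-of-rows image representation, and the `max_shift` parameter are hypothetical and are not taken from the released code.

```python
import random
from typing import List

# Assumption: a grayscale image is stored as rows of pixel intensities.
Image = List[List[int]]


def shift_image(image: Image, dx: int, dy: int, fill: int = 0) -> Image:
    """Translate the image by (dx, dy) pixels, padding exposed pixels with `fill`."""
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Look up the source pixel that lands on (x, y) after the shift.
            src_y, src_x = y - dy, x - dx
            if 0 <= src_y < height and 0 <= src_x < width:
                shifted[y][x] = image[src_y][src_x]
    return shifted


def augment(image: Image, rng: random.Random, max_shift: int = 2) -> Image:
    """Return a randomly shifted copy of the image; its class label is unchanged."""
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    return shift_image(image, dx, dy)
```

Each call to `augment` yields a slightly different view of the same character, so a classifier sees more positional variation than the raw dataset contains, which is the usual motivation for augmentation when the goal is better generalisation.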

updated: Sun Aug 15 2021 05:48:07 GMT+0000 (UTC)

published: Sun Aug 15 2021 05:48:07 GMT+0000 (UTC)

arXiv
