TransDocs: Optical Character Recognition with word to word translation

Abhishek Bamotra; Phani Krishna Uppala

Although OCR is used in many applications, its output is not always accurate and can contain misfit words. This work focuses on improving optical character recognition (OCR) with ML techniques by integrating OCR with long short-term memory (LSTM) based sequence-to-sequence deep learning models to perform document translation. It is based on the ANKI dataset for English-to-Spanish translation. The work presents a comparative study of pre-trained OCR systems, paired with a deep learning model that uses an LSTM-based seq2seq architecture with attention for machine translation. End-to-end performance of the model is reported as a BLEU-4 score. This paper is aimed at researchers and practitioners interested in OCR and its applications in document translation.
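
Since the abstract reports end-to-end quality as BLEU-4, the following is a minimal sketch of how such a score can be computed for a single sentence pair. It is an illustration under stated assumptions, not the authors' evaluation code: the function names `ngrams` and `bleu4`, the whitespace tokenization, the single-reference setup, the absence of smoothing, and the example sentences are all hypothetical choices. Published results are usually computed at corpus level, which may differ from this sentence-level variant.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(reference, hypothesis):
    """Sentence-level BLEU-4 against one reference, unsmoothed (assumed setup)."""
    ref, hyp = reference.split(), hypothesis.split()
    if not hyp:
        return 0.0
    log_precisions = []
    for n in range(1, 5):
        hyp_counts = ngrams(hyp, n)
        ref_counts = ngrams(ref, n)
        # Clipped matches: a hypothesis n-gram is credited at most as many
        # times as it occurs in the reference.
        matches = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        if matches == 0 or total == 0:
            # Without smoothing, one empty n-gram order makes the score zero.
            return 0.0
        log_precisions.append(math.log(matches / total))
    # Brevity penalty: penalize hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    # Geometric mean of the four modified precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / 4)


if __name__ == "__main__":
    # Hypothetical reference translation and OCR-then-translate output.
    reference = "el gato está en la mesa"
    hypothesis = "el gato está en la silla"
    print(f"BLEU-4: {bleu4(reference, hypothesis):.4f}")
```

In an OCR-plus-translation pipeline, the hypothesis would be the translation produced from the recognized text, so recognition errors and translation errors both lower the score.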

updated: Sat Apr 15 2023 21:40:14 GMT+0000 (UTC)

published: Sat Apr 15 2023 21:40:14 GMT+0000 (UTC)

arXiv
