CSTR: A Classification Perspective on Scene Text Recognition

Hongxiang Cai; Jun Sun; Yichao Xiong

Scene text recognition is usually approached from one of two perspectives: sequence-to-sequence (seq2seq) or segmentation. In this paper, we propose a new perspective that models scene text recognition as an image classification problem. Based on this perspective, we propose a scene text recognition model named CSTR. The CSTR model consists of a series of convolutional layers and a final global average pooling layer, followed by independent multi-class classification heads, each of which predicts the corresponding character of the word sequence in the input image. The CSTR model is easy to train using parallel cross-entropy losses. CSTR is as simple as image classification models such as ResNet (He et al., 2016), which makes it easy to implement, and its fully convolutional architecture makes it efficient to train and deploy. We demonstrate the effectiveness of the classification perspective on scene text recognition with thorough experiments. Furthermore, CSTR achieves nearly state-of-the-art performance on six public benchmarks covering both regular and irregular text. The code will be available at https://github.com/Media-Smart/vedastr.
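The pipeline the abstract describes (global average pooling, one independent classification head per character position, and parallel cross-entropy losses) can be illustrated with a small framework-free sketch. This is not the authors' implementation. The convolutional backbone is omitted and its output is passed in as a plain feature map. All function names, the padding class used to end a word, and the choice to average the per-head losses are assumptions made for illustration only.

```python
import math
import random


def global_average_pool(feature_map):
    """Collapse a C x H x W feature map (nested lists) into C channel means."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def linear(weights, bias, x):
    """One classification head: a plain affine map from features to class logits."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def cross_entropy(logits, target):
    """Softmax cross-entropy for one head, using a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def make_heads(num_positions, num_classes, num_features, seed=0):
    """Create one independent (weights, bias) pair per character position."""
    rng = random.Random(seed)
    heads = []
    for _ in range(num_positions):
        weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(num_features)]
            for _ in range(num_classes)
        ]
        heads.append((weights, [0.0] * num_classes))
    return heads


def predict_logits(feature_map, heads):
    """Pool once, then let every head classify the same pooled vector."""
    pooled = global_average_pool(feature_map)
    return [linear(weights, bias, pooled) for weights, bias in heads]


def parallel_cross_entropy(all_logits, targets):
    """Average of the per-position losses (averaging is an assumption here)."""
    losses = [cross_entropy(l, t) for l, t in zip(all_logits, targets)]
    return sum(losses) / len(losses)


def decode(all_logits, charset, pad_index):
    """Take the argmax of each head; stop at the hypothetical padding class."""
    chars = []
    for logits in all_logits:
        idx = max(range(len(logits)), key=logits.__getitem__)
        if idx == pad_index:
            break
        chars.append(charset[idx])
    return "".join(chars)
```

Because every head reads the same pooled vector and has its own loss term, all character positions are predicted in a single forward pass with no recurrence or attention decoding, which is what makes the model resemble an ordinary image classifier.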

updated: Mon Feb 22 2021 10:32:07 GMT+0000 (UTC)

published: Mon Feb 22 2021 10:32:07 GMT+0000 (UTC)

arXiv
