Efficient and Accurate Scene Text Detection with Low-Rank Approximation Network

Yuchen Su

Regression-based methods, which predict parametric curves to localize text, have recently become popular in scene text detection. However, these methods struggle to combine a concise structure with fast post-processing, and existing parametric curves are still not ideal for modeling arbitrary-shaped text, which makes it hard to balance speed and accuracy. To tackle these challenges, we first propose a dual matching scheme for positive samples, which accelerates inference through a sparse matching scheme and accelerates model convergence through a dense matching scheme. We then propose a novel text contour representation based on low-rank approximation that exploits the shape correlation between different text contours and is complete, compact, simple, and robust. Based on these designs, we implement an efficient and accurate arbitrary-shaped text detector named LRANet. Extensive experiments on three challenging datasets demonstrate the accuracy and efficiency of LRANet compared with state-of-the-art methods. The code will be released soon.
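The abstract does not say how the low-rank approximation is computed, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each contour is resampled to a fixed number of points and flattened into a vector, and that a truncated SVD over many such vectors yields a small shared basis. The function names (`fit_contour_basis`, `encode`, `decode`, `make_toy_contours`) and the synthetic ellipse-like contours are hypothetical and exist only for illustration.

```python
import numpy as np


def make_toy_contours(num_contours=200, num_points=14, seed=0):
    """Generate synthetic closed contours (assumption: stand-ins for text outlines).

    Each row is [x_1..x_n, y_1..y_n] for a shifted, stretched, slightly bent ellipse.
    """
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 2.0 * np.pi, num_points, endpoint=False)
    rows = []
    for _ in range(num_contours):
        cx, cy = rng.uniform(-1.0, 1.0, size=2)
        a, b = rng.uniform(1.0, 4.0), rng.uniform(0.5, 1.5)
        bend = rng.uniform(-0.5, 0.5)
        x = cx + a * np.cos(t)
        y = cy + b * np.sin(t) + bend * np.cos(2.0 * t)
        rows.append(np.concatenate([x, y]))
    return np.stack(rows)


def fit_contour_basis(contours, rank):
    """Return a (rank, 2 * num_points) orthonormal basis from a truncated SVD."""
    # Rows of vt are orthonormal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(contours, full_matrices=False)
    return vt[:rank]


def encode(contour, basis):
    """Project one flattened contour onto the basis, giving `rank` coefficients."""
    return basis @ contour


def decode(coeffs, basis):
    """Rebuild a flattened contour from its coefficients."""
    return coeffs @ basis


if __name__ == "__main__":
    train = make_toy_contours()
    unseen = make_toy_contours(num_contours=1, seed=1)[0]
    for rank in (2, 5):
        basis = fit_contour_basis(train, rank)
        recon = decode(encode(unseen, basis), basis)
        print(rank, float(np.max(np.abs(recon - unseen))))
```

In this toy setup every contour is a linear combination of five fixed vectors, so a rank-5 basis reconstructs an unseen contour almost exactly, while a smaller rank loses detail. Under these assumptions, a regression head would only need to predict the few coefficients per text instance rather than every contour point, which is the sense in which such a representation is compact.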

updated: Tue Jun 27 2023 02:03:46 GMT+0000 (UTC)

published: Tue Jun 27 2023 02:03:46 GMT+0000 (UTC)

arXiv
