CM-Net: Concentric Mask based Arbitrary-Shaped Text Detection

Chuang Yang; Mulin Chen; Zhitong Xiong; Yuan Yuan; Qi Wang

Fast arbitrary-shaped text detection has recently become an attractive research topic. However, most existing methods are not real-time, which may make them fall short in intelligent systems. Although a few real-time text detection methods have been proposed, their detection accuracy lags far behind that of non-real-time methods. To improve detection accuracy and speed simultaneously, we propose a novel fast and accurate text detection framework, namely CM-Net, which is built on a new text representation method and a multi-perspective feature (MPF) module. The former fits arbitrary-shaped text contours with a concentric mask (CM) in an efficient and robust way. The latter encourages the network to learn more CM-related discriminative features from multiple perspectives and brings no extra computational cost. Benefiting from the advantages of CM and MPF, the proposed CM-Net only needs to predict one CM for a text instance to rebuild the text contour, and it achieves the best balance between detection accuracy and speed compared with previous works. Moreover, to ensure that multi-perspective features are effectively learned, a multi-factor constraints loss is proposed. Extensive experiments demonstrate that the proposed CM is efficient and robust in fitting arbitrary-shaped text instances, and also validate the effectiveness of MPF and the constraints loss for discriminative text feature recognition. Furthermore, experimental results show that the proposed CM-Net is superior to existing state-of-the-art (SOTA) real-time text detection methods in both detection speed and accuracy on the MSRA-TD500, CTW1500, Total-Text, and ICDAR2015 datasets.

updated: Sat Jan 22 2022 03:53:15 GMT+0000 (UTC)

published: Mon Nov 30 2020 11:54:14 GMT+0000 (UTC)

arXiv
