1st Place Solution to ICDAR 2021 RRC-ICTEXT End-to-end Text Spotting and Aesthetic Assessment on Integrated Circuit

Qiyao Wang; Pengfei Li; Li Zhu; Yi Niu

This paper presents our methods for the ICDAR 2021 Robust Reading Challenge - Integrated Circuit Text Spotting and Aesthetic Assessment (ICDAR RRC-ICTEXT 2021). For the text spotting task, we detect and classify the characters on integrated circuits with a detection model based on YOLOv5. We balance lowercase and non-lowercase characters using SynthText, generated data, and a data sampler. We adopt a semi-supervised algorithm and distillation to further improve the model's accuracy. For the aesthetic assessment task, we add a three-class classification branch to distinguish the aesthetic class of each character. Finally, we deploy the model with NVIDIA TensorRT to accelerate inference and reduce memory consumption. Our methods achieve 59.1 mAP on Task 3.1 at 31 FPS with 306M memory (rank 1), and a 78.7% F2 score on Task 3.2 at 30 FPS with 306M memory (rank 1).
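For readers unfamiliar with the F2 score reported for Task 3.2, the following is a minimal sketch that assumes the standard F-beta definition with beta = 2, which weights recall more heavily than precision. The function name `f_beta` is hypothetical, and the challenge's exact evaluation protocol (matching rules, averaging across classes) is not described in the abstract, so this illustrates only the formula and not the official scorer.

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """Standard F-beta score (assumed definition, not the official scorer).

    F_beta = (1 + beta^2) * P * R / (beta^2 * P + R)
    With beta = 2, recall counts more than precision.
    """
    if not (0.0 <= precision <= 1.0 and 0.0 <= recall <= 1.0):
        raise ValueError("precision and recall must lie in [0, 1]")
    denom = beta ** 2 * precision + recall
    if denom == 0.0:
        # Both precision and recall are zero: define the score as 0.
        return 0.0
    return (1.0 + beta ** 2) * precision * recall / denom


if __name__ == "__main__":
    # Illustrative inputs only; these are not results from the paper.
    print(f_beta(0.5, 1.0))  # high recall is rewarded
    print(f_beta(1.0, 0.5))  # high precision is rewarded less
```

Swapping precision and recall in the two calls shows the asymmetry: under this definition, a model with perfect recall and 50% precision scores higher than one with perfect precision and 50% recall.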

updated: Thu Apr 08 2021 06:52:49 GMT+0000 (UTC)

published: Thu Apr 08 2021 06:52:49 GMT+0000 (UTC)

arXiv
