ICDAR 2023 Video Text Reading Competition for Dense and Small Text

Weijia Wu; Yuzhong Zhao; Zhuang Li; Jiahong Li; Mike Zheng Shou; Umapada Pal; Dimosthenis Karatzas; Xiang Bai

Recently, video text detection, tracking, and recognition in natural scenes have become very popular in the computer vision community. However, most existing algorithms and benchmarks focus on common text cases (e.g., normal size and density) and single scenarios, while ignoring extreme video text challenges, i.e., dense and small text in various scenarios. In this competition report, we establish a video text reading benchmark, DSText, which focuses on the challenges of reading dense and small text in videos across various scenarios. Compared with previous datasets, the proposed dataset mainly includes three new challenges: 1) dense video texts, a new challenge for video text spotters; 2) a high proportion of small texts; 3) various new scenarios, e.g., games and sports. The proposed DSText includes 100 video clips from 12 open scenarios and supports two tasks: video text tracking (Task 1) and end-to-end video text spotting (Task 2). During the competition period (opened on 15th February 2023 and closed on 20th March 2023), a total of 24 teams participated in the proposed tasks, with around 30 valid submissions. In this article, we describe detailed statistical information about the dataset, the tasks, the evaluation protocols, and a summary of the results of the ICDAR 2023 DSText competition. Moreover, we hope the benchmark will promote video text research in the community.

updated: Mon Apr 10 2023 04:20:34 GMT+0000 (UTC)

published: Mon Apr 10 2023 04:20:34 GMT+0000 (UTC)

arXiv
