CONSENT: Context Sensitive Transformer for Bold Words Classification

Ionut-Catalin Sandu; Daniel Voinea; Alin-Ionut Popa

We present CONSENT, a simple yet effective CONtext SENsitive Transformer framework for context-dependent object classification within a fully trainable end-to-end deep learning pipeline. We exemplify the proposed framework on the task of bold word detection, where it achieves state-of-the-art results. Given an image containing text in unknown font types (e.g. Arial, Calibri, Helvetica) and an unknown language, taken under varying illumination, angle distortion and scale, we extract all the words and learn a context-dependent binary classification (i.e. bold versus non-bold) using an end-to-end transformer-based neural network ensemble. To demonstrate the extensibility of our framework, we show results competitive with the state of the art on the game of rock-paper-scissors by training the model to determine the winner given a sequence of two pictures depicting hand poses.
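The rock-paper-scissors task shows why the label is context-dependent: the same hand pose can win or lose depending on the other pose in the sequence. The sketch below gives only the target labeling rule for a two-picture sequence, assuming the poses are already available as strings. It is not the paper's model, and the names `BEATS` and `winner`, as well as the choice to return `None` for a draw, are hypothetical choices made for illustration.

```python
from typing import Optional

# Standard rock-paper-scissors rules: each key defeats its value.
BEATS = {"rock": "scissors", "scissors": "paper", "paper": "rock"}


def winner(first: str, second: str) -> Optional[int]:
    """Return 0 if the first pose wins, 1 if the second wins, None on a draw.

    Draw handling is an assumption; the abstract does not say how ties are treated.
    """
    if first not in BEATS or second not in BEATS:
        raise ValueError(f"unknown pose: {first!r} or {second!r}")
    if first == second:
        return None
    # The first pose wins exactly when it defeats the second one.
    return 0 if BEATS[first] == second else 1


# The same pose ("rock") receives a different label depending on its context.
print(winner("rock", "scissors"))
print(winner("rock", "paper"))
```

A classifier that looks at each picture in isolation cannot recover this label, which is the motivation for classifying each element jointly with its context.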

updated: Mon May 16 2022 13:50:33 GMT+0000 (UTC)

published: Mon May 16 2022 13:50:33 GMT+0000 (UTC)

arXiv
