BiNet: Degraded-Manuscript Binarization in Diverse Document Textures and Layouts using Deep Encoder-Decoder Networks

Maruf A. Dhali; Jan Willem de Wit; Lambert Schomaker

Handwritten document-image binarization is a semantic segmentation process that separates ink pixels from background pixels. It is an essential step towards character recognition, writer identification, and script-style evolution analysis. The task is challenging because of the vast diversity of writing styles, inks, and paper materials, and it is even harder for historical manuscripts, which have aged and degraded over time. One such collection is the Dead Sea Scrolls (DSS) image collection, which poses extreme challenges for existing binarization techniques. This article proposes a new binarization technique for the DSS images using deep encoder-decoder networks. Although the artificial neural network proposed here is primarily designed to binarize the DSS images, it can also be trained on other manuscript collections. Additionally, transfer learning makes the network readily usable for a wide range of handwritten documents, making it a multi-purpose tool for binarization. Qualitative results and several quantitative comparisons, using both historical manuscripts and datasets from the handwritten document image binarization competitions (H-DIBCO and DIBCO), demonstrate the robustness and effectiveness of the system. The best-performing network architecture proposed here is a variant of the U-Net encoder-decoder.
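To make the task concrete, the sketch below treats binarization as a per-pixel ink/background decision using Otsu's global threshold. This is a classical baseline of the kind that struggles on degraded manuscripts, not the encoder-decoder network proposed in the article. The function names `otsu_threshold` and `binarize`, the 8-bit grayscale input, and the assumption that ink is darker than the background are all illustrative choices, not details taken from the paper.

```python
def otsu_threshold(pixels):
    """Return the gray level t (0..255) that maximizes between-class variance.

    Pixels with value <= t form one class, the rest form the other.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels at or below the candidate threshold
    sum0 = 0    # sum of gray levels of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # no pixels in the dark class yet
        w1 = total - w0
        if w1 == 0:
            break             # no pixels left for the bright class
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(image):
    """Map an 8-bit grayscale image (list of rows) to a mask: 1 = ink, 0 = background.

    Assumes ink is darker than the writing surface (an assumption of this sketch).
    """
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat)
    return [[1 if p <= t else 0 for p in row] for row in image]


if __name__ == "__main__":
    # Tiny synthetic patch: three dark "ink" pixels on a bright background.
    patch = [
        [200, 210, 30],
        [205, 25, 220],
        [35, 215, 208],
    ]
    for row in binarize(patch):
        print(row)
```

A single global threshold like this works only when ink and background intensities separate cleanly. Stains, fading, and uneven material texture break that assumption, which is the motivation for learning the pixel-wise decision with a segmentation network instead.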

updated: Wed Nov 13 2019 20:12:35 GMT+0000 (UTC)

published: Wed Nov 13 2019 20:12:35 GMT+0000 (UTC)

arXiv
