Three-stage binarization of color document images based on discrete wavelet transform and generative adversarial networks

Yu-Shian Lin; Rui-Yang Ju; Chih-Chia Chen; Chun-Tse Chien; Jen-Shiun Chiang

Efficiently separating foreground text from the background in degraded color document images is a critical challenge in the preservation of ancient manuscripts. Imperfect preservation over time has left these manuscripts with various types of degradation, such as staining, yellowing, and ink seepage, which significantly affect image binarization results. This work proposes a three-stage method that uses Generative Adversarial Networks (GANs) together with the Discrete Wavelet Transform (DWT) to enhance and binarize degraded color document images. Stage-1 applies the DWT and retains the Low-Low (LL) subband images for image enhancement. In Stage-2, the original input image is split into four single-channel images (Red, Green, Blue, and Gray), and an independent adversarial network is trained on each to extract color foreground information. In Stage-3, the output image from Stage-2 and the original input image are used to train independent adversarial networks for document binarization, enabling the integration of global and local features. Experimental results show that the proposed method outperforms other classic and state-of-the-art (SOTA) methods on the Document Image Binarization Contest (DIBCO) datasets. The implementation code is available at https://github.com/abcpp12383/ThreeStageBinarization.
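The two preprocessing ideas named in the abstract, splitting an image into four single-channel images and keeping only the LL subband of a DWT, can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: the function names `split_channels` and `haar_ll`, the choice of a single-level orthonormal Haar wavelet, the ITU-R BT.601 gray weights, and the nested-list image representation are all assumptions made here for the example. The abstract does not state which wavelet or gray conversion is used, nor exactly how the two steps are combined.

```python
# Illustrative sketch only; names and parameter choices are assumptions,
# not taken from the paper or its released code.

def split_channels(image):
    """Split an RGB image (rows of (R, G, B) tuples) into four
    single-channel images: red, green, blue, and gray."""
    red = [[px[0] for px in row] for row in image]
    green = [[px[1] for px in row] for row in image]
    blue = [[px[2] for px in row] for row in image]
    # Assumed gray conversion: ITU-R BT.601 luma weights.
    gray = [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in image]
    return {"red": red, "green": green, "blue": blue, "gray": gray}


def haar_ll(channel):
    """Return the Low-Low (LL) subband of a single-level 2D orthonormal
    Haar DWT of one channel. Height and width must be even.

    Low-pass filtering rows and then columns with [1/sqrt(2), 1/sqrt(2)]
    reduces each 2x2 block to (a + b + c + d) / 2.
    """
    height, width = len(channel), len(channel[0])
    if height % 2 or width % 2:
        raise ValueError("haar_ll expects even height and width")
    return [
        [
            (channel[i][j] + channel[i][j + 1]
             + channel[i + 1][j] + channel[i + 1][j + 1]) / 2.0
            for j in range(0, width, 2)
        ]
        for i in range(0, height, 2)
    ]


if __name__ == "__main__":
    # A tiny 4x4 constant-color image, just to show the shapes involved.
    image = [[(10, 20, 30)] * 4 for _ in range(4)]
    channels = split_channels(image)
    ll_red = haar_ll(channels["red"])
    print(len(ll_red), len(ll_red[0]))  # LL subband is half the size per axis
```

The LL subband has half the height and half the width of the input channel and carries its low-frequency content, which is why it is a natural candidate for the enhancement step described for Stage-1. In practice a wavelet library and array types would replace the nested lists used here.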

updated: Mon Aug 28 2023 14:03:09 GMT+0000 (UTC)

published: Tue Nov 29 2022 11:17:34 GMT+0000 (UTC)

arXiv
