Three-stage binarization of color document images based on discrete wavelet transform and generative adversarial networks

Yu-Shian Lin; Rui-Yang Ju; Chih-Chia Chen; Ting-Yu Lin; Jen-Shiun Chiang

Efficiently segmenting foreground text from the background of degraded color document images is an active research topic. Because ancient documents have been imperfectly preserved over long periods, various types of degradation, including staining, yellowing, and ink seepage, seriously affect the results of image binarization. In this paper, we propose a three-stage method for image enhancement and binarization of degraded color document images using the discrete wavelet transform (DWT) and generative adversarial networks (GANs). In Stage-1, we apply the DWT and retain the LL subband images to achieve image enhancement. In Stage-2, the original input image is split into four single-channel images (Red, Green, Blue, and Gray), each of which is used to train an independent adversarial network. The trained adversarial network models are then used to extract the color foreground information from the images. In Stage-3, to combine global and local features, the output image from Stage-2 and the original input image are used to train independent adversarial networks for document binarization. Experimental results demonstrate that the proposed method outperforms many classical and state-of-the-art (SOTA) methods on the Document Image Binarization Contest (DIBCO) dataset. We release our implementation code at https://github.com/abcpp12383/ThreeStageBinarization.
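The two preprocessing ideas named in the abstract, splitting a color image into four single-channel images and keeping only the LL subband of a DWT, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository linked above for that). The abstract does not state which wavelet is used, exactly which images the DWT is applied to, or how the gray channel is computed, so the one-level Haar wavelet, the per-channel application, and the BT.601 luma weights below are all assumptions, and the function names `split_channels` and `haar_ll_subband` are hypothetical.

```python
import numpy as np


def split_channels(rgb: np.ndarray) -> dict:
    """Split an H x W x 3 RGB image into red, green, blue and gray planes.

    Assumption: gray uses ITU-R BT.601 luma weights; the abstract does not
    say how the gray channel is obtained.
    """
    r, g, b = (rgb[..., i].astype(np.float64) for i in range(3))
    gray = 0.299 * r + 0.587 * g + 0.114 * b
    return {"red": r, "green": g, "blue": b, "gray": gray}


def haar_ll_subband(channel: np.ndarray) -> np.ndarray:
    """Return the LL (approximation) subband of a one-level 2-D Haar DWT.

    Assumption: Haar is chosen only because it is the simplest wavelet;
    the abstract does not name the wavelet used in Stage-1.
    """
    h, w = channel.shape
    # Crop to even dimensions so the image tiles exactly into 2x2 blocks.
    x = channel[: h - h % 2, : w - w % 2].astype(np.float64)
    # Orthonormal Haar: low-pass along rows and columns equals the sum of
    # each 2x2 block divided by 2.
    return (x[0::2, 0::2] + x[0::2, 1::2] + x[1::2, 0::2] + x[1::2, 1::2]) / 2.0


# Usage: a small constant-color image stands in for a document scan.
rgb = np.full((4, 6, 3), 10, dtype=np.uint8)
ll_images = {name: haar_ll_subband(plane) for name, plane in split_channels(rgb).items()}
```

Each LL image has half the height and width of its input plane and discards the high-frequency detail subbands (LH, HL, HH), which is the sense in which retaining it acts as a smoothing step. In the method described above, each single-channel image would then feed its own adversarial network, which this sketch does not attempt to reproduce.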

updated: Sat Dec 17 2022 04:42:03 GMT+0000 (UTC)

published: Tue Nov 29 2022 11:17:34 GMT+0000 (UTC)

arXiv
