Three-stage Binarization of Color Document Images Based on Discrete Wavelet Transform and Generative Adversarial Networks
Academic year 113
Semester 1
Publication date 2024-11-01
Title Three-stage Binarization of Color Document Images Based on Discrete Wavelet Transform and Generative Adversarial Networks
Title (other language)
Authors [3] Rui-Yang Ju, Yu-Shian Lin, Yanlin Jin, Chih-Chia Chen, Chun-Tse Chien, Jen-Shiun Chiang
Department
Publisher
Journal name, volume/issue, pages Knowledge-Based Systems, vol. 304
Abstract The efficient extraction of text information from the background in degraded color document images is an important challenge in the preservation of ancient manuscripts. The imperfect preservation of ancient manuscripts has led to different types of degradation over time, such as page yellowing, staining, and ink bleeding, which seriously affect the results of document image binarization. This work proposes an effective three-stage network method for the image enhancement and binarization of degraded documents using generative adversarial networks (GANs). Specifically, in Stage-1, we first split the input images into multiple patches, and then split these patches into four single-channel patch images (gray, red, green, and blue). Then, three of the single-channel patch images (red, green, and blue) are processed by the discrete wavelet transform (DWT) with normalization. In Stage-2, we use four independent generators to train separate GAN models, one per channel, on the processed patch images to extract color foreground information. Finally, in Stage-3, we train two independent GAN models on the outputs of Stage-2 and on the resized original input images (512 × 512), as the local and global predictions respectively, to obtain the final outputs. The experimental results show that the Avg-Score metrics of the proposed method are 77.64, 77.95, 79.05, 76.38, 75.34, and 77.00 on the (H)-DIBCO 2011, 2013, 2014, 2016, 2017, and 2018 datasets, respectively, which are at the state-of-the-art level. The implementation code for this work is available at https://github.com/abcpp12383/ThreeStageBinarization
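The Stage-1 preprocessing described in the abstract (patch splitting, channel splitting, then DWT with normalization on the red, green, and blue channels) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the 224-pixel patch size, the Haar wavelet, keeping only the low-low subband, the BT.601 grayscale weights, and min-max normalization are all assumptions, and the function names are hypothetical. The linked repository is the reference for the actual method.

```python
import numpy as np

PATCH = 224  # assumed patch size; the abstract does not state it


def split_patches(img, size=PATCH):
    """Cut an H x W x 3 image into non-overlapping size x size patches.
    Edge regions smaller than a full patch are dropped in this sketch."""
    h, w = img.shape[:2]
    return [img[y:y + size, x:x + size]
            for y in range(0, h - size + 1, size)
            for x in range(0, w - size + 1, size)]


def split_channels(patch):
    """Return gray, red, green, blue single-channel images from an RGB patch."""
    rgb = patch.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    gray = 0.299 * r + 0.587 * g + 0.114 * b  # assumed BT.601 luma weights
    return gray, r, g, b


def haar_ll(channel):
    """One-level 2-D Haar DWT, keeping only the low-low (approximation) subband.
    Assumption: the abstract does not say which wavelet or subband is used."""
    c = channel[: channel.shape[0] // 2 * 2, : channel.shape[1] // 2 * 2]
    return (c[0::2, 0::2] + c[0::2, 1::2] + c[1::2, 0::2] + c[1::2, 1::2]) / 2.0


def normalize(band):
    """Min-max rescale to [0, 255]; a constant band maps to all zeros."""
    lo, hi = band.min(), band.max()
    if hi == lo:
        return np.zeros_like(band)
    return (band - lo) / (hi - lo) * 255.0


def stage1(img):
    """Per patch: the untouched gray channel plus DWT-normalized R, G, B channels."""
    out = []
    for patch in split_patches(img):
        gray, r, g, b = split_channels(patch)
        out.append((gray, *(normalize(haar_ll(c)) for c in (r, g, b))))
    return out
```

In this sketch the gray channel keeps the full patch resolution while each color channel is halved in both dimensions by the one-level transform; whether and how the sizes are reconciled before Stage-2 is not covered by the abstract.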
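Keywords Deep learning, Computer vision, Discrete wavelet transform, Generative adversarial networks, Document image processing, Document image enhancement, Document image binarization
Language en
ISSN
Journal type International
Indexed in SCI
Industry-academia collaboration
Corresponding author Jen-Shiun Chiang
Peer review 1
Country GBR
Open call for papers
Publication format Electronic
SDGs Quality Education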