Noise Characterization in Ancient Document Images Based on DCT Coefficient Distribution
DOI :
Date : 2015
Ancient document images dating back several hundred years commonly suffer from noise and degradation, such as ink seeping through from the back page, 'fox' (local brown discolorations of the paper), text fading, background spots, and uneven background. Noise reduction (or denoising) is an important step in document image processing because it can improve optical character recognition (OCR) performance. Before a noise reduction algorithm is applied, it is important to characterize the noise types that exist in the document. This paper proposes a method to characterize the noise types in ancient documents based on the DCT coefficient distribution of the image. The characterization is accomplished by analyzing the standard deviation of the distribution of the higher-frequency-band DCT coefficients of a cropped (localized) noise image. In simulations, three noise types found in Acehnese ancient documents, namely 'fox', spots, and uneven background, are characterized using the proposed method. The results suggest that DCT coefficient distributions can be used to characterize the noise in ancient documents. In addition, the proposed method is shown to be usable for document image classification.
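The core measurement described in the abstract, the standard deviation of higher-frequency DCT coefficients of a cropped patch, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the 8x8 patch size, and the band rule (keeping coefficients whose index sum u + v is at least `cutoff`) are all assumptions made for illustration, since the abstract does not specify how the higher frequency band is delimited.

```python
import math
import statistics


def dct_1d(values):
    """Orthonormal 1-D DCT-II of a sequence of numbers."""
    n = len(values)
    out = []
    for k in range(n):
        total = sum(
            x * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, x in enumerate(values)
        )
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * total)
    return out


def dct_2d(patch):
    """Separable 2-D DCT-II: transform each row, then each column."""
    rows = [dct_1d(row) for row in patch]
    cols = [dct_1d(col) for col in zip(*rows)]
    # cols is indexed [v][u]; transpose back to [u][v].
    return [list(r) for r in zip(*cols)]


def high_band_std(patch, cutoff):
    """Population std. dev. of DCT coefficients with u + v >= cutoff.

    The index-sum band rule is an assumption, not taken from the paper.
    """
    coeffs = dct_2d(patch)
    band = [
        c
        for u, row in enumerate(coeffs)
        for v, c in enumerate(row)
        if u + v >= cutoff
    ]
    return statistics.pstdev(band)


if __name__ == "__main__":
    # Hypothetical 8x8 grayscale patches: a flat background and a
    # checkerboard-like texture standing in for fine-grained noise.
    flat = [[128] * 8 for _ in range(8)]
    textured = [[128 + 20 * ((r + c) % 2) for c in range(8)] for r in range(8)]
    print("flat:", high_band_std(flat, cutoff=8))
    print("textured:", high_band_std(textured, cutoff=8))
```

A smooth patch concentrates its energy in the low-frequency coefficients, so its high-band standard deviation stays near zero, while a finely textured patch yields a noticeably larger value. Comparing this statistic across cropped regions is one plausible way to separate slowly varying degradations such as uneven background from more localized ones such as spots.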