The Binarization for Color Text Images Based on Graph-theoretical Clustering and Binary Texture Analysis[J]. Journal of Image and Graphics, 2004, 9(3): 290. DOI: 10.11834/jig.20040353.
The Binarization for Color Text Images Based on Graph-theoretical Clustering and Binary Texture Analysis
especially for information retrieval applications. In this paper, the authors develop a novel algorithm for text/background separation, or binarization, of color images with complicated backgrounds. In their algorithm, dimensionality reduction and graph-theoretical clustering are performed first. Each cluster yields a binary image, and additional binary images are obtained by combining these cluster-related binary images. Then two kinds of features capable of effectively characterizing binary texture images, namely run-length-histogram-based features and spatial-size-distribution-based features, are extracted from each of these binary images. Based on an analysis of these texture features with an LDA classifier, the binary image that gives the best text/background separation is selected as the final binarization result. Experiments on images collected from the Internet show that the method handles color text images with complex backgrounds effectively, and a comparison with existing techniques shows a notable improvement from the proposed method.
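
The candidate-generation and run-length steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a per-pixel cluster label map is already available from the clustering step, and it combines clusters by simple union. The abstract does not state the combination rule, the run direction, or the histogram size, so the names `candidate_binary_images`, `run_length_histogram`, `max_group`, and `max_len` are hypothetical, and only horizontal foreground runs are counted. The spatial size distribution features and the LDA classifier are not shown.

```python
from itertools import combinations

import numpy as np


def candidate_binary_images(labels, max_group=2):
    """Return one boolean mask per cluster, plus unions of up to max_group clusters.

    Assumption: candidates are formed by plain union of cluster masks;
    the abstract does not specify the combination rule.
    """
    ids = np.unique(labels)
    candidates = []
    for k in range(1, max_group + 1):
        for group in combinations(ids, k):
            candidates.append(np.isin(labels, group))
    return candidates


def run_length_histogram(binary, max_len=8):
    """Normalized histogram of horizontal foreground run lengths.

    Runs longer than max_len are counted in the last bin.
    Assumption: horizontal runs only, with a hypothetical bin count.
    """
    hist = np.zeros(max_len, dtype=float)
    for row in np.asarray(binary, dtype=bool):
        # Pad with background so every run has a detectable start and end.
        padded = np.concatenate(([False], row, [False])).astype(np.int8)
        diff = np.diff(padded)
        starts = np.flatnonzero(diff == 1)
        ends = np.flatnonzero(diff == -1)
        for length in ends - starts:
            hist[min(length, max_len) - 1] += 1
    total = hist.sum()
    return hist / total if total > 0 else hist


if __name__ == "__main__":
    # Toy label map with three clusters (hypothetical data).
    labels = np.array([[0, 0, 1],
                       [2, 1, 1]])
    masks = candidate_binary_images(labels, max_group=2)
    features = [run_length_histogram(m, max_len=4) for m in masks]
    print(len(masks), features[0])
```

In a fuller pipeline, a feature vector like this would be computed for every candidate and passed to a trained classifier that scores how text-like each binary image is; that selection step is left out because the abstract gives no details on training or scoring.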