Combination of regression and clustering for handwritten text line extraction
2018, Vol. 23, No. 8, pp. 1207-1217
Received: 2017-12-12
Revised: 2018-03-07
Published in print: 2018-08-16
DOI: 10.11834/jig.170624
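Objective
Handwritten text line extraction is an important basic step in document image processing. In unconstrained handwritten document images, text lines exhibit varying degrees of skew, curvature, crossing, and touching. Traditional geometric segmentation or clustering methods often cannot guarantee accurate segmentation at text line boundaries. To address these problems, a handwritten text line extraction method based on a joint text line regression-clustering framework is proposed.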
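Method
First, an anisotropic Gaussian filter bank is used to analyze the image at multiple scales and orientations. The smearing effect is exploited to detect ridge structures and extract the main body area of each text line, which is then skeletonized to obtain the text line regression model. Next, a superpixel representation is built with connected components as the basic image units. To cluster the superpixels, a pixel-superpixel-text line associative hierarchical random field model is established, and energy function optimization yields the superpixel clustering and the text line label of each superpixel. On this basis, all character blocks that touch across lines are detected, and a regression-line-based k-means clustering algorithm, guided by the regression model, clusters the pixels of the touching characters, which segments them and labels each pixel with its text line. Finally, text line label switches allow text line pixels to be displayed selectively and extracted directly, so geometric segmentation is no longer needed.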
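The first step above relies on a bank of oriented, elongated Gaussian kernels. The following is a minimal sketch of how such a bank could be built, not the authors' implementation: the function names, the `elongation` ratio, and the 3-sigma kernel radius are all assumptions chosen for illustration, and the abstract does not state the scales or angles actually used.

```python
import math

def anisotropic_gaussian_kernel(sigma_long, sigma_short, theta, radius):
    """Normalized (2*radius+1) x (2*radius+1) Gaussian kernel.

    The kernel is elongated along the direction `theta` (radians).
    Indexing is kernel[y + radius][x + radius].
    """
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    kernel, total = [], 0.0
    for y in range(-radius, radius + 1):
        row = []
        for x in range(-radius, radius + 1):
            # Rotate image coordinates into the kernel's own axes.
            u = x * cos_t + y * sin_t    # along the elongated axis
            v = -x * sin_t + y * cos_t   # across the elongated axis
            w = math.exp(-0.5 * ((u / sigma_long) ** 2 + (v / sigma_short) ** 2))
            row.append(w)
            total += w
        kernel.append(row)
    # Normalize so the weights sum to 1.
    return [[w / total for w in row] for row in kernel]

def build_filter_bank(scales, angles_deg, elongation=4.0):
    """Hypothetical bank: one kernel per (scale, angle) pair.

    `elongation` (assumed value) is the ratio of the long to the short sigma.
    """
    bank = {}
    for s in scales:
        radius = int(math.ceil(3 * s * elongation))  # assumed 3-sigma support
        for a in angles_deg:
            bank[(s, a)] = anisotropic_gaussian_kernel(
                s * elongation, s, math.radians(a), radius)
    return bank
```

Convolving a binarized page with each kernel and keeping the strongest response per pixel would smear the characters of a line into one ridge-like band; a wide, flat kernel near 0 degrees favors roughly horizontal lines, while small nonzero angles accommodate skew.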
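Result
Text line extraction was tested on the HIT-MW offline handwritten Chinese document dataset. The detection rate (DR) is 99.83% and the recognition accuracy (RA) is 99.92%.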
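Conclusion
Experiments show that, compared with traditional methods such as piecewise projection analysis, minimum spanning tree clustering, and seam carving, the proposed text line regression-clustering joint analysis framework improves the controllability of text line boundaries and the segmentation accuracy. It extracts handwritten text lines efficiently while avoiding interference from adjacent text lines as far as possible, and it achieves high accuracy and robustness.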
Objective
Handwritten text line extraction is a fundamental step in document image processing. Because of unconstrained page layout and free writing styles, text lines may be skewed, curved, crossing, or touching. Traditional text line segmentation or clustering methods cannot guarantee accurate classification of the pixels between text lines. This study proposes a joint text line regression-clustering framework for handwritten text line extraction.
Method
First, an anisotropic Gaussian filter bank is used to filter the handwritten document image at multiple scales and orientations. The main body area (MBA) of each text line is extracted by smearing, and the text line regression model is then obtained by extracting the skeleton of the MBA. Second, a superpixel representation is constructed with connected components as the basic image elements. For superpixel classification and clustering, an approach based on associative hierarchical random fields is presented. A higher-order energy model is established by constructing a hierarchical network over pixels, connected components, and text lines. On the basis of this model, an energy function is built whose minimization yields the text line labels of the connected components. Using these instance labels, touching characters, in which a single connected component spans adjacent text lines, are detected. Third, the pixels of the touching characters are re-clustered with the k-means algorithm under the constraint of the text line regression model. With the instance labels of the text lines, individual text lines can be displayed or extracted by switching labels. Geometric segmentation of the document image is therefore no longer needed, and a bounding box can be used to extract each text line directly.
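The re-clustering of touching-character pixels can be illustrated with a k-means-style loop whose cluster prototypes are the regression lines themselves. This is a simplified sketch under stated assumptions, not the paper's algorithm: each regression line is approximated here by a straight line `y = slope * x + intercept` (the paper derives its regression model from a skeleton, which may be curved), distance is measured vertically, and the function name `cluster_touching_pixels` is hypothetical.

```python
def cluster_touching_pixels(pixels, lines, max_iter=20):
    """Assign each pixel of a touching component to one text line.

    pixels: list of (x, y) coordinates of the touching component.
    lines:  list of (slope, intercept) pairs, one per candidate text line
            (assumed straight-line stand-ins for the regression model).
    Returns (labels, refined_lines).
    """
    lines = list(lines)
    labels = [None] * len(pixels)
    for _ in range(max_iter):
        # Assignment step: nearest line by vertical distance.
        new_labels = [
            min(range(len(lines)),
                key=lambda j: abs(y - (lines[j][0] * x + lines[j][1])))
            for x, y in pixels
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Update step: keep each slope fixed, move the intercept to the
        # mean residual of the pixels currently assigned to that line.
        for j, (slope, _) in enumerate(lines):
            residuals = [y - slope * x
                         for (x, y), lab in zip(pixels, labels) if lab == j]
            if residuals:
                lines[j] = (slope, sum(residuals) / len(residuals))
    return labels, lines
```

Initializing the prototypes from the regression lines, instead of at random as in plain k-means, is what ties each resulting cluster to a specific text line label.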
Result
Experiments were performed on the HIT-MW document-level dataset. The proposed framework achieved an overall detection rate of 99.83% and a recognition accuracy of 99.92%, which reaches state-of-the-art performance for Chinese handwritten text line extraction.
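For readers unfamiliar with the two scores, the sketch below shows how a detection rate and a recognition accuracy are commonly computed in handwriting segmentation evaluations. It assumes the convention in which both are ratios of one-to-one matches between detected and ground-truth lines; the abstract does not define the metrics, so this convention, the function name, and the counts in the usage line are assumptions for illustration only.

```python
def line_extraction_scores(one_to_one, n_ground_truth, n_detected):
    """Detection rate, recognition accuracy, and their harmonic mean.

    one_to_one:     number of detected lines matched one-to-one with a
                    ground-truth line (matching criterion not modeled here).
    n_ground_truth: number of ground-truth text lines.
    n_detected:     number of text lines output by the method.
    """
    dr = one_to_one / n_ground_truth   # detection rate
    ra = one_to_one / n_detected       # recognition accuracy
    fm = 2 * dr * ra / (dr + ra) if (dr + ra) > 0 else 0.0
    return dr, ra, fm

# Hypothetical counts, not taken from the paper.
dr, ra, fm = line_extraction_scores(one_to_one=98, n_ground_truth=100, n_detected=99)
```

Under this convention, a DR below RA means some ground-truth lines were missed or merged, while an RA below DR means the method produced extra or fragmented lines.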
Conclusion
Experimental results show that the proposed text line regression-clustering joint framework improves pixel-level segmentation accuracy and makes text line boundaries more controllable than traditional algorithms such as piecewise projection, minimum spanning tree-based clustering, and seam carving. The proposed system performs well on Chinese handwritten text line extraction, with improved robustness and accuracy and without interference from adjacent text lines.