Survey on text analysis and recognition for multiethnic scripts
Vol. 29, Issue 6, Pages: 1685-1713 (2024)
Published: 16 June 2024
DOI: 10.11834/jig.240015
王维兰, 胡金水, 魏宏喜, 库尔班·吾布力 , 邵文苑, 毕晓君, 贺建军, 李振江, 丁凯, 金连文, 高良才. 2024. 少数民族文字文本分析与识别的研究进展. 中国图象图形学报, 29(06):1685-1713
Wang Weilan, Hu Jinshui, Wei Hongxi, Kurban Ubul, Shao Wenyuan, Bi Xiaojun, He Jianjun, Li Zhenjiang, Ding Kai, Jin Lianwen, Gao Liangcai. 2024. Survey on text analysis and recognition for multiethnic scripts. Journal of Image and Graphics, 29(06):1685-1713
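China attaches great importance to the protection and inheritance of ethnic minority ancient books and has stressed the importance of thoroughly digitizing these non-renewable cultural resources. With the continuous progress of document image analysis and recognition technology, research on text analysis and recognition for ethnic minority scripts has attracted wide attention and achieved notable results, becoming a hotspot in applied artificial intelligence research. However, owing to the large number of minority scripts, the diversity of application scenarios, and the scarcity of datasets, the field still faces many challenges. This paper aims to summarize previous work and support future research. It focuses on printed text, online handwriting, historical document, and scene text recognition tasks, and outlines developments and the latest results in minority script recognition in China and abroad. It first explains the importance and value of text analysis and recognition for minority scripts and introduces the characteristics of specific minority scripts and their historical documents. It then reviews the history and current state of the field, analyzes and summarizes representative results and applications of traditional methods, and discusses in detail the comprehensive shift of research focus to deep neural network models and deep learning methods, a shift that has markedly improved recognition performance for each script. Finally, based on this analysis, the paper points out shortcomings in accuracy and generalization in document analysis and recognition for different scripts, as well as differences from Chinese text analysis and recognition; facing the main difficulties and challenges of minority script text recognition, it looks ahead to future research trends and technical development goals.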
China’s ethnic scripts differ in structure type, period of creation, and region and scope of use. The historical documents and literary materials written, recorded, and printed in these scripts are voluminous and constitute an invaluable resource for exploring the civilization and development history of different ethnic groups. Compared with mainstream languages, the study of ethnic minority scripts often faces low-resource conditions. In recent years, the country has paid increasing attention to the protection and inheritance of the intangible cultural heritage of ethnic minorities, which is of great importance and application value for protecting diverse, irreplaceable cultural resources. Traditional image processing, pattern recognition, and machine learning methods have produced some results in text recognition and document recognition for Mongolian, Tibetan, Uyghur, Kazakh, Korean, and other major languages. Compared with mainstream languages such as English and Chinese, however, research on character recognition, document image analysis, and application system development for minority languages lags behind. Since the beginning of the 21st century, driven by the continuous development and application of document image analysis and recognition technology, research on and application of ethnic script text analysis and recognition have received extensive attention and made remarkable progress, becoming a research hotspot in document analysis and recognition and in artificial intelligence. Nevertheless, many problems in minority script text analysis and recognition remain unsolved because of the large number of minority scripts, the wide range of application scenarios, and the scarcity of datasets.
To summarize previous work and support subsequent research, this study reviews the development history and recent progress of the field in China and abroad. It focuses on four subtasks for several minority scripts: printed text recognition, handwriting recognition, historical document recognition, and scene text recognition. The scripts covered are mainly Tibetan, Mongolian, Uyghur, Yi, Manchu, and Dongba. The studies reviewed mainly concern the following areas. 1) Document image preprocessing: the system performs a series of operations on the input image, such as binarization, noise removal, skew correction, and image enhancement, with the goal of improving the accuracy of subsequent analysis and recognition. 2) Layout analysis: layout segmentation, text line segmentation, and character segmentation help reveal the organizational structure of a document and extract useful information. 3) Text recognition: one of the core tasks of document image analysis, it identifies the text in a document through various technical approaches, ranging from traditional methods based on single-character classifiers to end-to-end text line recognition with deep learning. 4) Dataset construction: various datasets are built for training and evaluating algorithms, such as document image binarization datasets, layout analysis datasets, text line datasets, and character datasets. Among these tasks, the analysis and recognition of historical documents is especially difficult: the paper of historical books is often rough, degraded, and damaged, which leads to severe background noise, touching text strokes, unclear handwriting, and damage in the document images. At present, a practical recognition system for historical documents is lacking.
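As a rough illustration of the preprocessing and layout-analysis stages listed above, the following Python sketch binarizes a grayscale page with Otsu's global threshold and then splits it into text lines with a projection profile. These are textbook baselines chosen for illustration, not methods attributed to any surveyed work; the function names (`otsu_threshold`, `binarize`, `segment_lines`), the `axis` and `min_height` parameters, and the synthetic test page are all hypothetical.

```python
import numpy as np


def otsu_threshold(gray: np.ndarray) -> int:
    """Return Otsu's threshold for an 8-bit (uint8) grayscale image."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                  # probability of class "<= t"
    mu = np.cumsum(prob * np.arange(256))    # cumulative mean up to t
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    # Between-class variance; defined as 0 where one class is empty.
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = np.where(denom > 0, (mu_total * omega - mu) ** 2 / denom, 0.0)
    return int(np.argmax(sigma_b))


def binarize(gray: np.ndarray) -> np.ndarray:
    """Return a boolean mask where True marks ink (dark pixels)."""
    return gray <= otsu_threshold(gray)


def segment_lines(ink: np.ndarray, axis: int = 1, min_height: int = 2):
    """Split a binary page into runs of rows (axis=1) or columns (axis=0)
    that contain ink. Returns (start, end) pairs with end exclusive."""
    has_ink = ink.sum(axis=axis) > 0
    runs, start = [], None
    for pos, flag in enumerate(has_ink):
        if flag and start is None:
            start = pos
        elif not flag and start is not None:
            if pos - start >= min_height:
                runs.append((start, pos))
            start = None
    if start is not None and len(has_ink) - start >= min_height:
        runs.append((start, len(has_ink)))
    return runs


# Synthetic page: light background with two dark horizontal "text lines".
page = np.full((40, 60), 200, dtype=np.uint8)
page[5:12, 10:50] = 30
page[20:28, 10:50] = 30

ink = binarize(page)
lines = segment_lines(ink)
print(lines)  # [(5, 12), (20, 28)]
```

For vertically written scripts such as traditional Mongolian and Manchu, the profile would be taken along the other axis (`axis=0`). On degraded historical pages with heavy background noise or touching strokes, a global threshold and a plain projection profile are likely to break down, which is one reason such documents are harder than clean printed text.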
First, the importance and value of minority script text analysis and recognition are explained, and several minority scripts and their documents, especially historical documents, are introduced together with their characteristics. Then, the development history and current state of research in the field are reviewed, and the representative results of traditional methods and the progress of deep learning methods are analyzed and summarized. Current research objects are expanding in depth and breadth, and processing methods have shifted comprehensively to deep neural network models and deep learning. Recognition performance has improved greatly, and application scenarios continue to expand. One of the surveyed studies achieves effective modeling under low-resource conditions and further proposes a unified multilingual joint modeling technology that recognizes multiple languages with a single model, which greatly reduces hardware overhead and significantly improves text-in-image recognition accuracy and generalization in multilingual scenarios. At present, it can recognize text in images for 18 key languages or ethnic languages, including English, French, German, Japanese, Russian, Korean, Arabic, Uyghur, Kazakh, and Inner Mongolian. Based on the relevant analyses, ethnic script text recognition still shows obvious deficiencies in recognition accuracy and generalization ability, and it differs from Chinese text recognition in several respects. The characteristics of the characters and documents of each language are completely different from those of Chinese characters and Chinese documents. For example, variant characters became particularly abundant during the development of the Yi script, and “one-to-many, many-to-one” correspondences between characters and interpretations are the norm. The arbitrariness and diversity of historical Yi handwriting pose great challenges for recognizing historical Yi script.
Moreover, the Tibetan Ume (dbu-med, “headless”) script is written in an ornate cursive style: the letter shapes are complex, strokes intertwine, some strokes span several neighboring characters, and the connections between letters are distinctive. Thus, highly complex and difficult multi-style Tibetan recognition must be solved to achieve true multi-font text recognition. Finally, the main difficulties and challenges faced in minority text recognition are discussed, and future research trends and technical development goals are outlined. For example, research and application system development should be carried out in light of the characteristics of different languages, layout formats, and application scenarios. A gap still exists between the recognition of most ethnic languages and that of Chinese, especially in applications related to education, security, and people’s livelihood. This gap can be narrowed by actively expanding into new application directions. Opportunities for expansion are abundant, such as adapting large language models to ethnic minority scripts and text recognition and developing unified multilingual joint modeling and application systems.
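The idea of recognizing several scripts with one end-to-end text line model can be pictured with a small sketch. It assumes a CTC-style output layer over a single label table shared by all scripts; the abstract does not state which decoder the surveyed systems use, so this is only one plausible reading. The helper names (`build_joint_vocab`, `ctc_greedy_decode`), the two tiny alphabets, and the hand-made frame scores are hypothetical.

```python
import numpy as np

BLANK = 0  # index reserved for the CTC blank label


def build_joint_vocab(*alphabets):
    """Merge per-script alphabets into one label table shared by one model.
    Index 0 is the CTC blank; characters shared by scripts are stored once."""
    vocab = ["<blank>"]
    for alphabet in alphabets:
        for ch in alphabet:
            if ch not in vocab:
                vocab.append(ch)
    return vocab


def ctc_greedy_decode(scores: np.ndarray, vocab) -> str:
    """Best-path CTC decoding for a (frames, labels) score matrix:
    take the argmax per frame, collapse repeats, then drop blanks."""
    best = np.argmax(scores, axis=1)
    out, prev = [], BLANK
    for idx in best:
        if idx != prev and idx != BLANK:
            out.append(vocab[idx])
        prev = idx
    return "".join(out)


# Illustrative alphabets only: three Latin and three Tibetan letters.
vocab = build_joint_vocab("abc", "ཀཁག")

# Fake per-frame scores (one-hot) standing in for a network's output.
path = [1, 1, 0, 4, 4, 0, 4, 5]
scores = np.eye(len(vocab))[path]

text = ctc_greedy_decode(scores, vocab)
print(text)
```

Because every script maps into the same label table, a single set of model weights can emit any of them, which is the resource saving the joint-modeling approach aims for; the cost is a larger output layer and the need for balanced training data across scripts.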
Keywords: multiethnic scripts; document analysis and recognition; print recognition; handwriting recognition; historical document recognition; scene image text recognition
Baek J, Matsui Y and Aizawa K. 2021. What if we only use real datasets for scene text recognition? Toward scene text recognition with fewer labels [EB/OL]. [2024-01-25]. https://arxiv.org/pdf/2103.04400.pdfhttps://arxiv.org/pdf/2103.04400.pdf
Cai R D Z, Hua Q C R and Huang H M. 2022. Tibetan printed recognition based on syllable segmentation. Computer Engineering and Design, 43(9): 2594-2600
才让当知, 华却才让, 黄鹤鸣. 2022. 基于音节切分的藏文印刷体识别. 计算机工程与设计, 43(9): 2594-2600 [DOI: 10.16208/j.issn1000-7024.2022.09.025http://dx.doi.org/10.16208/j.issn1000-7024.2022.09.025]
Cai X J and Huang H M. 2016. Feature extraction method of off-line handwritten Tibetan character based on multiple projection. Computer Technology and Development, 26(3): 93-96
蔡晓娟, 黄鹤鸣. 2016. 基于多投影的脱机手写藏文字符特征提取方法. 计算机技术与发展, 26(3): 93-96 [DOI: 10.3969/j.issn.1673-629X.2016.03.022http://dx.doi.org/10.3969/j.issn.1673-629X.2016.03.022]
Cai Z Q and Wang W L. 2018. Online handwriting Tibetan character recognition based on two-dimensional discriminant locality alignment//Proceedings of the 1st Chinese Conference on Pattern Recognition and Computer Vision. Guangzhou, China: Springer: 88-98 [DOI: 10.1007/978-3-030- 03338- 5_8http://dx.doi.org/10.1007/978-3-030-03338-5_8]
Chen M, Wu X and Ma D J. 2006. Design of character set encoding for ancient Yi script in Guizhou. Science and Technology Market, (7): #222
陈敏, 吴勰, 马德江. 2006. 贵州古彝文字符集编码设计.科技经济市场, 7): #222 [DOI: 10.3969/j.issn.1009-3788.2006.07.216http://dx.doi.org/10.3969/j.issn.1009-3788.2006.07.216]
Chen S X, Han X, Lin X Y, Liu Y and Wang M G. 2020. MSER and CNN-based method for character detection in ancient Yi books. Journal of South China University of Technology (Natural Science Edition), 48(6): 123-133
陈善雄, 韩旭, 林小渝, 刘云, 王明贵. 2020. 基于MSER和CNN的彝文古籍文献的字符检测方法. 华南理工大学学报(自然科学版), 48(6): 123-133 [DOI: 10.12141/j.issn.1000- 565X.190812http://dx.doi.org/10.12141/j.issn.1000-565X.190812]
Chen S X, Wang X L, Han X, Liu Y and Wang M G. 2019. A recognition method of ancient Yi character based on deep learning. Journal of Zhejiang University (Science Edition), 46(3): 261-269
陈善雄, 王小龙, 韩旭, 刘云, 王明贵. 2019. 一种基于深度学习的古彝文识别方法. 浙江大学学报(理学版),46(3):261-269 [DOI:10.3785/j.issn.1008- 9497.2019.03.001http://dx.doi.org/10.3785/j.issn.1008-9497.2019.03.001]
Cui S D, Su Y L, Ren Q D E J, and Ji Y T. 2022. An end-to-end network for irregular printed Mongolian recognition. International Journal on Document Analysis and Recognition, 25(1): 41-50 [DOI: 10.1007/s10032-021- 00388-yhttp://dx.doi.org/10.1007/s10032-021-00388-y]
Da M J, Zhao J Y, Suo G J and Guo H. 2011. Online handwritten Naxi pictograph digits recognition system using coarse grid//Computer Science for Environmental Engineering and EcoInformatics. Kunming, China: Springer: 390-396 [DOI: 10.1007/978-3-642-22694-6_55http://dx.doi.org/10.1007/978-3-642-22694-6_55]
Dey A, Mitra S and Das N. 2020. Handwritten Tibetan character recognition based on ELM using modified HOG features//Proceedings of 2020 IEEE Calcutta Conference (CALCON). Kolkata, India: IEEE: 451-456 [DOI: 10.1109/ CALCON49167. 2020.9106541http://dx.doi.org/10.1109/CALCON49167.2020.9106541]
Dhondrub R, Tsering T and Tashi N. 2021. Study on a synthesis method for training data of ancient Tibetan book character recognition. Plateau Science Research, 5(3): 84-91
仁青东主, 头旦才让, 尼玛扎西. 2021. 藏文古籍文字识别训练数据的合成方法研究. 高原科学研究, 5(3): 84-91 [DOI: 10.16249/j.cnki.2096-4617.2021.03.011http://dx.doi.org/10.16249/j.cnki.2096-4617.2021.03.011]
Ding X Q, Wang Y W. 2017. Character Recognition: Principles, Methods and Practice. Beijing: Tsinghua University Press
丁晓青, 王言伟.2017.文字识别: 原理、方法和实践. 北京: 清华大学出版社
Fan D E J and Gao G L. 2016. DNN-HMM for large vocabulary Mongolian offline handwriting recognition//Proceedings of the 15th International Conference on Frontiers in Handwriting Recognition. Shenzhen, China: IEEE: 72-77 [DOI: 10.1109/ICFHR.2016.0026http://dx.doi.org/10.1109/ICFHR.2016.0026]
Fan D E J, Gao G L and Wu H J. 2018. MHW Mongolian offline handwritten dataset and its application. Journal of Chinese Information Processing, 32(1): 89-95
范道尔吉, 高光来, 武慧娟. 2018. MHW蒙古文脱机手写数据库及其应用. 中文信息学报, 32(1): 89-95 [DOI: 10.3969/ j.issn.1003-0077.2018.01.012http://dx.doi.org/10.3969/j.issn.1003-0077.2018.01.012]
Fan D E J, Gao G L and Wu H J. 2019. Sub-word based Mongolian offline handwriting recognition//Proceedings of 2019 International Conference on Document Analysis and Recognition. Sydney, Australia: Springer: 246-253 [DOI: 10.1109/ICDAR.2019.00048http://dx.doi.org/10.1109/ICDAR.2019.00048]
Gao G L, Su X D, Wei H X and Gong Y Y. 2011. Classical Mongolian words recognition in historical document//Proceedings of the 11th International Conference on Document Analysis and Recognition. Beijing, China: Springer: 692-697 [DOI: 10.1109/ICDAR.2011.145http://dx.doi.org/10.1109/ICDAR.2011.145]
Gao D G, Hou Y, Gao H M and Suolang Q Z. 2023. Study on detection and recognition of Tibetan Wumei printing multi-fonts text. Plateau Science Research, 7(1): 92-100
高定国, 侯闫, 高红梅, 索朗曲珍. 2023. 乌梅印刷多字体藏文文本的检测与识别. 高原科学研究, 7(1): 92-100 [DOI: 10.16249/j.cnki.2096-4617.2023.01.011http://dx.doi.org/10.16249/j.cnki.2096-4617.2023.01.011]
Gao L C, Li Y B, Du L, Zhang X P, Zhu Z Y, Lu N, Jin L W, Huang Y S and Tang Z. 2022. A survey on table recognition technology. Journal of Image and Graphics, 27(6): 1898-1917
高良才, 李一博, 都林, 张新鹏, 朱子仪, 卢宁, 金连文, 黄永帅, 汤帜. 2022. 表格识别技术研究进展. 中国图象图形学报, 27(6): 1898-1917 [DOI: 10.11834/jig.220152http://dx.doi.org/10.11834/jig.220152]
Gheni Z, Mahpirat, Yadikar N, Aysa Y and Ubul K. 2018. Gray level co-occurrence matrix feature weighting fusion for Uyghur signature verification. Computer Engineering and Design, 39(4): 1195-1201
祖丽皮亚·艾尼, 麦合甫热提, 努尔毕亚·亚地卡尔, 尤努斯·艾沙, 库尔班·吾布力. 2018. 灰度共生矩阵特征加权融合的维文签名鉴别. 计算机工程与设计, 39(4): 1195-1201 [DOI: 10.16208/j. issn1000-7024.2018.04.052http://dx.doi.org/10.16208/j.issn1000-7024.2018.04.052]
Guo H, Yin J H and Zhao J Y. 2012. Feature dimension reduction of NaXi pictographs characters recognition based on LDA. International Journal of Computer Science Issues, 9(6): 90-96
Guo H and Zhao J Y. 2010. Segmentation method for NaXi pictograph character recognition. Journal of Convergence Information Technology, 5(6): 87-98 [DOI: 10.4156/ jcit. vol5.issue6.9http://dx.doi.org/10.4156/jcit.vol5.issue6.9]
Guo H, Zhao J Y and Da M J. 2010a. A preprocessing method for NaXi pictograph character recognition. Journal of Convergence Information Technology, 5(2): 59-66 [DOI: 10.4156/jcit.vol5.issue2.7http://dx.doi.org/10.4156/jcit.vol5.issue2.7]
Guo H, Zhao J Y, Da M J and Li X N. 2010b. NaXi pictographs edge detection using lifting wavelet transform. Journal of Convergence Information Technology, 5(5): 203-210 [DOI: 10.4156/jcit.vol5.issue5.23http://dx.doi.org/10.4156/jcit.vol5.issue5.23]
Han X. 2020. Research and Implementation of Character Detection and Recognition in Ancient Yi Books. Chongqing: Southwest University
韩旭. 2020. 彝文古籍字符检测和识别的研究与实现. 重庆: 西南大学 [DOI: 10.27684/d.cnki.gxndx. 2020. 001060http://dx.doi.org/10.27684/d.cnki.gxndx.2020.001060]
Han Y H, Wang W L, Liu H M and Wang Y Q. 2019. A combined approach for the binarization of historical Tibetan document images. International Journal of Pattern Recognition and Artificial Intelligence, 33(14): #1954038 [DOI: 10.1142/S0218001419540387http://dx.doi.org/10.1142/S0218001419540387]
Hao Y S, Wang W L, Li J C and Lin Q. 2022. A method for bilingual Tibetan-Chinese scene image dataset synthesis and text detection. Journal of Computer-Aided Design and Computer Graphics, 34(4): 592-604
郝玉胜, 王维兰, 李金成, 林强. 2022. 藏汉双语场景图像数据集合成及文本检测方法. 计算机辅助设计与图形学学报, 34(4): 592-604 [DOI: 10.3724/SP.J.1089.2022.18954http://dx.doi.org/10.3724/SP.J.1089.2022.18954]
He X, Li H J, Zhou Y, Zheng R R and He J J. 2022. APSENet: a text line detection method of Manchu archives based on instance segmentation network. Journal of Minzu University of China(Natural Sciences Edition), 31(1): 19-27
赫欣, 李厚杰, 周瑜, 郑蕊蕊, 贺建军. 2022. APSENet: 一种基于实例分割网络的满文档案文本行检测方法. 中央民族大学学报(自然科学版), 31(1): 19-27 [DOI: 10.3969/j.issn.1005-8036.2022.01.003http://dx.doi.org/10.3969/j.issn.1005-8036.2022.01.003]
Hedayati F, Chong J K and Keutzer K. 2011. Recognition of Tibetan Wood Block Prints with generalized hidden Markov and kernelized modified quadratic distance function//Proceedings of 2011 Joint Workshop on Multilingual OCR and Analytics for Noisy Unstructured Text Data. Beijing, China: ACM: #12 [DOI: 10.1145/ 2034617.2034631http://dx.doi.org/10.1145/2034617.2034631]
Hou Y, Gao D G and Gao H M. 2023. Wujin printing multi-fonts text detection and recognition in Tibetan. Computer Engineering and Design, 44(4): 1058-1065
侯闫, 高定国, 高红梅. 2023. 乌金印刷多字体藏文的文本检测与识别. 计算机工程与设计, 44(4): 1058-1065 [DOI: 10.16208/j.issn1000-7024.2023.04.014http://dx.doi.org/10.16208/j.issn1000-7024.2023.04.014]
Hu P F, Wang W L, Li Q Q and Wang T J. 2021. Touching text line segmentation combined local baseline and connected component for Uchen Tibetan historical documents. Information Processing and Management, 58(6): #102689 [DOI: 10.1016/j.ipm.2021.102689http://dx.doi.org/10.1016/j.ipm.2021.102689]
Hua R X and Xu X L. 2019. Intelligent classification on images of Dongba ancient books. The Journal of Engineering, 2019(23): 9039-9042 [DOI: 10.1049/joe.2018.9177http://dx.doi.org/10.1049/joe.2018.9177]
Huang D, Li M, Zheng R R, Xu S and Bi J J. 2017. Synthetic data and DAG-SVM classifier for segmentation-free Manchu word recognition//Proceedings of 2017 International Conference on Computing Intelligence and Information System. Nanjing, China: IEEE: 46-50 [DOI: 10.1109/CIIS.2017.15http://dx.doi.org/10.1109/CIIS.2017.15]
Huang H M and Da F P. 2014. Sparse representation-based classification algorithm for optical Tibetan character recognition. Optik, 125(3): 1034-1037 [DOI: 10.1016/ j. ijleo.2013.07.101http://dx.doi.org/10.1016/j.ijleo.2013.07.101]
Huang H M, Da F P and Han X X. 2014. Wavelet transform and gradient direction based feature extraction method for off-line handwritten Tibetan letter recognition. Journal of Southeast University (English Edition), 30(1): 27-31 [DOI: 10.3969/j.issn.1003-7985.2014.01.006http://dx.doi.org/10.3969/j.issn.1003-7985.2014.01.006]
Huang X and Belongie S. 2017. Arbitrary style transfer in real-time with adaptive instance normalization//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 1510-1519 [DOI: 10.1109/ICCV.2017.167http://dx.doi.org/10.1109/ICCV.2017.167]
Isola P, Zhu J Y, Zhou T H and Efros A A. 2017. Image-to-image translation with conditional adversarial networks//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 5967-5976 [DOI: 10.1109/CVPR.2017.632http://dx.doi.org/10.1109/CVPR.2017.632]
Jia X D. 2017. Application Research of Handwritten Yi Character Recognition Technology Based on Deep Learning. Beijing: Minzu University of China
贾晓栋. 2017. 基于深度学习的手写彝文识别技术应用研究. 北京: 中央民族大学
Jiang W, Lu Z Y and Li J. 2013. Recognition of handwritten Uyghur character based on directional element feature. Microelectronics and Computer, 30(10): 97-100
姜文, 卢朝阳, 李静. 2013. 基于方向线素特征的手写体维文字符识别.微电子学与计算机, 30(10): 97-100 [DOI: 10.19304/ j.cnki.issn1000-7180.2013.10.025http://dx.doi.org/10.19304/j.cnki.issn1000-7180.2013.10.025]
Jiang Z W, Ding X Q and Peng L R. 2015. Character model optimization for segmentation-free Uyghur text line recognition. Journal of Tsinghua University (Science and Technology), 55(8): 873-877, 883
姜志威, 丁晓青, 彭良瑞. 2015. 针对无切分维吾尔文文本行识别的字符模型优化. 清华大学学报(自然科学版), 55(8): 873-877, 883 [DOI: 10.16511/j.cnki.qhdxxb.2015.08.010http://dx.doi.org/10.16511/j.cnki.qhdxxb.2015.08.010]
Kadier N, Peng L R and Halimulati. 2015. Uyghur and Arabic recognition methods based on hmm and statistical language model. Computer Applications and Software, 32(1): 171-174
努尔艾力·喀迪尔, 彭良瑞, 哈力木拉提. 2015. 一种基于HMM和统计语言模型的维吾尔文及阿拉伯文识别方法. 计算机应用与软件, 32(1): 171-174 [DOI: 10.3969/j.issn.1000-386x.2015.01.044http://dx.doi.org/10.3969/j.issn.1000-386x.2015.01.044]
Kaldan T and Vijayalakshmi A. 2021. TenzinNet for handwritten Tibetan numeral recognition. International Journal of Information Technology,13(4):1679-1682[DOI: 10.1007/ s41870-021-00711-0http://dx.doi.org/10.1007/s41870-021-00711-0]
Kang Y K, Wei H X, Zhang H and Gao G L. 2019. Woodblock-printing Mongolian words recognition by bi-LSTM with attention mechanism//Proceedings of 2019 International Conference on Document Analysis and Recognition. Sydney, Australia: Springer: 910-915 [DOI: 10.1109/ICDAR.2019.00150http://dx.doi.org/10.1109/ICDAR.2019.00150]
Kojima M, Nunomiya C, Kawamura T, Akiyama Y, Kawazoe Y and Klmura M. 1995.Recognition of printed tibetan characters by object oriented designing//Proceedings of Annual Conference Japan Society of Information and Knowledge, 3(78): 53-60 [DOI:10.2964/jsikproc.3.0_53http://dx.doi.org/10.2964/jsikproc.3.0_53]
Li J C, Hao Y S, Wang W L, Wang T J and Li Q Q. 2021a. Scene text detection based on expanding the text center region for bilingual Tibetan-Chinese. International Journal of Pattern Recognition and Artificial Intelligence, 2021, 35(13): #2153007 [DOI: 10.1142/S0218001421530074http://dx.doi.org/10.1142/S0218001421530074]
Li J C, Wang X J, Wang W L, Lin Q and Hu P F. 2021. Text line segmentation of Tibetan historical documents based on text core regions combined with expansion growth. Laser and Optoelectronics Progress, 58(2): #0210008
李金成, 王筱娟, 王维兰, 林强, 胡鹏飞. 2021. 结合文字核心区域和扩展生长的藏文古籍文本行切分. 激光与光电子学进展, 58(2): #0210008 [DOI: 10.3788/LOP202158. 0210008http://dx.doi.org/10.3788/LOP202158.0210008]
Li J M. 2019. Exploration of Manchu archives informatization. China Archives, (3): 44-45
李健民. 2019. 满文档案信息化工作探索. 中国档案, (3): 44-45
Li M. 2019. Research on Online Mongolian Handwriting Recognition System based on Deep Learning. Hohhot: Inner Mongolia University
李敏. 2019. 基于深度学习的联机蒙古文手写识别系统研究. 呼和浩特: 内蒙古大学
Li M, Zheng R R, Xu S, Fu Y and Huang D. 2018. Manchu word recognition based on convolutional neural network with spatial pyramid pooling//Proceedings of the 11th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics. Beijing, China: IEEE: 1-6 [DOI: 10.1109/CISP-BMEI. 2018.8633131http://dx.doi.org/10.1109/CISP-BMEI.2018.8633131]
Li P C, Zhu J D, Peng L R and Guo Y B. 2016. RNN based Uyghur text line recognition and its training strategy//2016 12th IAPR Workshop on Document Analysis Systems (DAS). Santorini, Greece: IEEE: 19-24 [DOI: 10.1109/ DAS.2016.20http://dx.doi.org/10.1109/DAS.2016.20]
Li W Y, Mahpirat, Kang W X, Aysa A and Ubul K. 2021b. Multi-lingual hybrid handwritten signature recognition based on deep residual attention network//Proceedings of the 15th Chinese Conference on Biometric Recognition. Shanghai, China: Springer: 148-156 [DOI: 10.1007/978-3-030-86608-2_17http://dx.doi.org/10.1007/978-3-030-86608-2_17]
Li W Y, Mahpirat, Xu X B, Aysa A and Ubul K. 2022a. A simple convolutional neural network for small sample multi-lingual offline handwritten signature recognition//Proceedings of the 16th Chinese Conference on Biometric Recognition. Beijing, China: Springer: 393-403 [DOI: 10.1007/978- 3-031-20233-9_40http://dx.doi.org/10.1007/978-3-031-20233-9_40]
Li Y Z, Wang Y L and Liu Z Z. 2012. Study on printed Tibetan character recognition technology. Journal of Nanjing University (Natural Sciences), 48(1): 55-62
李永忠, 王玉雷, 刘真真. 2012. 藏文印刷体字符识别技术研究. 南京大学学报(自然科学), 48(1): 55-62 [DOI: 10.13232/ j.cnki.jnju.2012.01.006http://dx.doi.org/10.13232/j.cnki.jnju.2012.01.006]
Li Z H and Gao G L. 2003. Extraction of features of Mongolian printed character recognition. Computer Technology and Development, 13(11): 117-119
李振宏, 高光来. 2003. 印刷体蒙古文文字识别中常用特征的获取. 微机发展, 13(11): 117-119 [DOI: 10.3969/j.issn.1673-629X.2003. 11.042http://dx.doi.org/10.3969/j.issn.1673-629X.2003.11.042]
Li Z H, Gao G L, Hou H X and Li W. 2003. The research of printed Mongolian character recognition. Acta Scientiarum Naturalium Universitatis Neimongol, 34(4): 454-457
李振宏, 高光来, 侯宏旭, 李伟. 2003. 印刷体蒙古文文字识别的研究. 内蒙古大学学报(自然科学版), 34(4): 454-457 [DOI: 10.3969/j.issn.1000-1638.2003.04.020http://dx.doi.org/10.3969/j.issn.1000-1638.2003.04.020]
Li Z J, Wang W L and Cai Z Q. 2019a. Historical document image binarization based on edge contrast information//Proceedings of 2019 Computer Vision Conference on Advances in Computer Vision. Las Vegas, USA: Springer: 614-628 [DOI: 10.1007/978-3-030- 17795-9_44http://dx.doi.org/10.1007/978-3-030-17795-9_44]
Li Z J, Wang W L, Chen Y and Hao Y S. 2019b. A novel method of text line segmentation for historical document image of the uchen Tibetan. Journal of Visual Communication and Image Representation, 61:23-32 [DOI:10.1016/j.jvcir. 2019.01.021http://dx.doi.org/10.1016/j.jvcir.2019.01.021]
Li Z J, Wang W L, Wang Y Q and Zhang Q X. 2022b. Character recognition of Tibetan historical document in Uchen font: dataset and bench mark. Journal of Computational Methods in Sciences and Engineering, 22(5): 1779-1794 [DOI: 10.3233/JCM-226167http://dx.doi.org/10.3233/JCM-226167]
Liu C L, Jin L W, Bai X, Li X H and Yin F. 2023. Frontiers of intelligent document analysis and recognition: review and prospects. Journal of Image and Graphics, 28(8): 2223-2252
刘成林, 金连文, 白翔, 李晓辉, 殷飞. 2023. 文档智能分析与识别前沿: 回顾与展望. 中国图象图形学报, 28(8): 2223-2252 [DOI: 10.11834/jig. 221112http://dx.doi.org/10.11834/jig.221112]
Liu F and Tashi N. 2020. Study on the extraction method on characteristics of Tibetan characters based on strokes. Plateau Science Research, 4(3): 105-110
刘芳, 尼玛扎西. 2020. 基于笔划的藏文字符特征提取方法研究. 高原科学研究, 4(3): 105-110 [DOI: 10.16249/j.cnki.2096- 4617.2020.03.014http://dx.doi.org/10.16249/j.cnki.2096-4617.2020.03.014]
Liu H M, Bi X H and Wang W L. 2019. Layout analysis of historical Tibetan documents//Proceedings of the 2nd International Conference on Artificial Intelligence and Big Data (ICAIBD). Chengdu, China: IEEE: 74-78 [DOI: 10.1109/ ICAIBD.2019.8837040http://dx.doi.org/10.1109/ICAIBD.2019.8837040]
Liu H M, Shi R M, Bi X H, Wang X Y and Wang W L. 2022. Line segmentation of Tibetan ancient books based on A* algorithm. Journal of Physics: Conference Series, 2356(1): #012046 [DOI: 10.1088/1742-6596/2356/1/012046http://dx.doi.org/10.1088/1742-6596/2356/1/012046]
Liu J, Ma L L and Wu J. 2016. Online handwritten Mongolian word recognition using MWRCNN and position maps//Proceedings of the 15th International Conference on Frontiers in Handwriting Recognition. Shenzhen, China: IEEE: 60-65 [DOI: 10.1109/ICFHR. 2016.0024http://dx.doi.org/10.1109/ICFHR.2016.0024]
Liu J, Ma L L and Wu J. 2017. Online handwritten Mongolian word recognition using a novel sliding window method with recurrent neural networks//Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition. Kyoto, Japan: IEEE: 189-194 [DOI: 10.1109/ICDAR.2017.39http://dx.doi.org/10.1109/ICDAR.2017.39]
Liu L L and Zhao H. 2012. Comparing study of two language models of online Uyghur handwritten recognition. Computer Applications and Software, 29(9): 151-153
柳玲玲, 赵晖. 2012. 联机手写维吾尔文单词识别中两种语言模型的比较研究. 计算机应用与软件, 29(9): 151-153 [DOI: 10.3969/j.issn.1000-386x.2012.09.040http://dx.doi.org/10.3969/j.issn.1000-386x.2012.09.040]
Liu S, Zhu Z X, Ma Z Q, Tong F and Xie R. 2009. Design and realization on character segmentation method for Yi language based on connected components. Journal of South-Central University for Nationalities (Natural Science Edition), 28(2): 86-89
刘赛, 朱宗晓, 马志强, 童飞, 谢蓉. 2009. 基于连通域的彝文文字切分算法的设计与实现. 中南民族大学学报(自然科学版), 28(2): 86-89 [DOI: 10.3969/j.issn.1672-4321.2009.02.021http://dx.doi.org/10.3969/j.issn.1672-4321.2009.02.021]
Liu S Q, Jin L B and Miao F. 2020. Textual restoration of occluded Tibetan document pages based on side-enhanced U-Net. Journal of Electronic Imaging, 29(6): #063006 [DOI: 10.1117/1.JEI.29.6.063006http://dx.doi.org/10.1117/1.JEI.29.6.063006]
Lu H T, Wu L, Zhou J Y, Zheng R R and He J J. 2018. Seal detection on Manchu document based on faster R-CNN and data augmentation. Journal of Dalian Minzu University, 20(5): 455-459
卢海涛, 吴磊, 周建云, 郑蕊蕊, 贺建军. 2018. 基于Faster R-CNN及数据增广的满文文档印章检测. 大连民族大学学报, 20(5): 455-459 [DOI: 10.3969/j.issn.1009-315X.2018.05.016http://dx.doi.org/10.3969/j.issn.1009-315X.2018.05.016]
Luo H N, Xu D, Yang B and Zhang H Y. 2020. Multi-scale feature fusion based Dongba character recognition//Proceedings of the 5th International Conference on Mechanical, Control and Computer Engineering (ICMCCE). Harbin, China: IEEE: 1571-1575 [DOI: 10.1109/ICMCCE51767.2020.00344http://dx.doi.org/10.1109/ICMCCE51767.2020.00344]
Luo Y L, Bi X J, Wu L C and Li X L. 2022. Dongba pictographs recognition based on improved residual learning. CAAI Transactions on Intelligent Systems, 17(1): 79-87
骆彦龙, 毕晓君, 吴立成, 李霞丽. 2022. 基于改进残差学习的东巴象形文字识别. 智能系统学报, 17(1): 79-87 [DOI: 10.11992/tis.202112009http://dx.doi.org/10.11992/tis.202112009]
Luo Y L, Sun Y W and Bi X J. 2023. Multiple attentional aggregation network for handwritten Dongba character recognition. Expert Systems with Applications, 213: #118865 [DOI: 10.1016/j.eswa.2022.118865http://dx.doi.org/10.1016/j.eswa.2022.118865]
Ma L L, Liu H D and Wu J. 2011. MRG-OHTC database for online handwritten Tibetan character recognition//Proceedings of 2011 International Conference on Document Analysis and Recognition. Beijing, China: IEEE: 207-211 [DOI: 10. 1109/ICDAR.2011.50http://dx.doi.org/10.1109/ICDAR.2011.50]
Ma L L, Liu J and Wu J. 2016. A new database for online handwritten Mongolian word recognition//Proceedings of the 23rd International Conference on Pattern Recognition. Cancun, Mexico: IEEE: 1131-1136 [DOI: 10.1109/ICPR.2016.7899788http://dx.doi.org/10.1109/ICPR.2016.7899788]
Ma L L and Wu J. 2013. A recognition system for online handwritten Tibetan characters//Graphics Recognition. New Trends and Challenges. Seoul, Korea(South): Springer: 99-107 [DOI: 10.1007/978-3-642-36824-0_10http://dx.doi.org/10.1007/978-3-642-36824-0_10]
Ma L L and Wu J. 2016. Online unconstrained handwritten Tibetan character recognition using statistical recognition method. Himalayan Linguistics,15(1):31-40 [DOI: 10. 5070/H915130066http://dx.doi.org/10.5070/H915130066]
Mo L F, Mahpirat, Zhu Y L, Mamat H and Ubul K. 2020. Off-line signature recognition based on non-downsampled contourlet transform. Computer Engineering and Design, 41(12): 3472-3478
莫龙飞, 麦合甫热提, 朱亚俐, 吾尔尼沙·买买提, 库尔班·吾布力. 2020. 基于非下采样轮廓波变换的离线签名识别. 计算机工程与设计, 41(12): 3472-3478 [DOI: 10.16208/j.issn1000-7024.2020.12.026http://dx.doi.org/10.16208/j.issn1000-7024.2020.12.026]
Ngodrup, Putsesern, Daluosanglangjie, Zhao D C, Liu F and Bianbawangdui. 2009. Study on printed Tibetan character recognition . Computer Engineering and Applications, 45(24): 165-169, 172
欧珠, 普次仁, 大罗桑朗杰, 赵栋才, 刘芳, 边巴旺堆. 2009. 印刷体藏文文字识别技术研究. 计算机工程与应用, 45(24): 165-169, 172 [DOI: 10.3778/j.issn.1002-8331.2009.24.049http://dx.doi.org/10.3778/j.issn.1002-8331.2009.24.049]
Qiao L, Li Z S, Cheng Z Z and Li X. 2023. SCID: a Chinese characters invoice-scanned dataset in relevant to key information extraction derived of visually-rich document images. Journal of Image and Graphics, 28(8): 2298-2313
乔梁, 李再升, 程战战, 李玺. 2023. SCID: 用于富含视觉信息文档图像中信息提取任务的扫描中文票据数据集. 中国图象图形学报, 28(8): 2298-2313 [DOI: 0.11834/jig. 220911http://dx.doi.org/0.11834/jig.220911]
Qin H, Li Y J, Liang Q K and Wang Y N. 2023. AsymcNet: a document images-relevant asymmetric geometry correction network. Journal of Image and Graphics, 28(8): 2314-2329
秦海, 李艺杰, 梁桥康, 王耀南. 2023. 针对文档图像的非对称式几何校正网络. 中国图象图形学报, 28(8): 2314-2329 [DOI: 0.11834/jighttp://dx.doi.org/0.11834/jig,220426]
Rexit A, Muhammat M, Xu X B, Kang W X, Aysa A and Ubul K. 2022. Multilingual handwritten signature recognition based on high-dimensional feature fusion. Information, 13(10): #496 [DOI: 10.3390/info13100496http://dx.doi.org/10.3390/info13100496]
Rowinski Z and Keutzer K. 2016. Namsel: an optical character recognition system for Tibetan text. Himalayan Linguistics, 15(1): 12-30 [DOI: 10.5070/H915129937http://dx.doi.org/10.5070/H915129937]
Shao W Y. 2021. Unheard Voices and Alternative Pasts: Deciphering Chronicles of Southwest Yi and Its Layered Ranges of Signification. Columbus: Ohio State University
Shen T, Zhuang J J, Li W S, Wang Y M, Xia Y F, Zhang Z J, Zhang X and Yang J Q. 2020. Research on recognition of Dongba script by a combination of HOG feature extraction and support vector machine. Journal of Nanjing University (Natural Science), 56(6): 870-876
申彤, 庄建军, 黎文斯, 王昀牧, 夏一飞, 张志俭, 张鑫, 杨继琼. 2020. 基于HOG特征提取和支持向量机的东巴文识别. 南京大学学报(自然科学), 56(6): 870-876 [DOI: 10.13232/j.cnki.jnju.2020.06.009]
Shi B G, Bai X and Yao C. 2017. An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(11): 2298-2304 [DOI: 10.1109/TPAMI.2016.2646371]
Simayi W, Ibrayim M and Hamdulla A. 2019. A study of RNN based online handwritten Uyghur word recognition using different word transcriptions//Proceedings of the 11th International Conference on Simulation Tools and Techniques. Chengdu, China: Springer: 518-527 [DOI: 10.1007/978-3-030-32216-8_50]
Simayi W, Ibrayim M, Tursun D and Hamdulla A. 2013. Research on on-line Uyghur character recognition technology based on center distance feature//IEEE International Symposium on Signal Processing and Information Technology. Athens, Greece: IEEE: 293-298 [DOI: 10.1109/ISSPIT.2013.6781896]
Simayi W, Ibrayim M, Tursun D and Hamdulla A. 2016. A survey on the classifiers in on-line handwritten Uyghur character recognition system. International Journal of Hybrid Information Technology, 9(3): 189-198 [DOI: 10.14257/ijhit.2016.9.3.18]
Su X D, Gao G L, Wei H X and Bao F L. 2015. Enhancing the Mongolian historical document recognition system with multiple knowledge-based strategies//Proceedings of the 22nd International Conference on Neural Information Processing. Istanbul, Turkey: Springer: 536-544 [DOI: 10.1007/978-3-319-26535-3_61]
Su X D, Gao G L, Wei H X and Bao F L. 2016. A knowledge-based recognition system for historical Mongolian documents. International Journal on Document Analysis and Recognition, 19(3): 221-235 [DOI: 10.1007/s10032-016-0267-1]
Sun S W and Wei H X. 2022. A Mongolian handwritten word images generation approach based on generative adversarial networks//Proceedings of 2022 International Joint Conference on Neural Networks. Padua, Italy: IEEE: 1-8 [DOI: 10.1109/IJCNN55064.2022.9892917]
Tang J, Silamu W, Xu M M, Xiong L J and Wang M H. 2021. Uyghur scanning body recognition based on deep learning. Journal of Northeast Normal University (Natural Science Edition), 53(1): 71-76
汤敬, 吾守尔·斯拉木, 许苗苗, 熊黎剑, 王明辉. 2021. 基于深度学习的维吾尔文扫描体识别. 东北师大学报(自然科学版), 53(1): 71-76 [DOI: 10.16163/j.cnki.22-1123/n.2021.01.012]
Ubul K, Ablikim R, Yadikar N, Aysa A and Yibulayin T. 2016. Off-line Uyghur signature recognition technology based on density feature. Computer Engineering and Design, 37(8): 2200-2205
库尔班·吾布力, 热依买·阿不力克木, 努尔毕亚·亚地卡尔, 阿力木江·艾沙, 吐尔根·依布拉音. 2016. 基于密度特征的维吾尔文离线签名识别. 计算机工程与设计, 37(8): 2200-2205 [DOI: 10.16208/j.issn1000-7024.2016.08.041]
Ubul K, Adler A, Abliz G, Yasheng M and Hamdulla A. 2012. Off-line Uyghur signature recognition based on modified grid information features//Proceedings of the 11th International Conference on Information Science, Signal Processing and their Applications (ISSPA). Montreal, Canada: IEEE: 1056-1061 [DOI: 10.1109/ISSPA.2012.6310446]
Ubul K, Wang X L, Yimin A, Zhang S J and Yibulayin T. 2018. Multilingual offline handwritten signature recognition based on statistical features//Proceedings of the 13th Chinese Conference on Biometric Recognition. Urumqi, China: Springer: 727-735 [DOI: 10.1007/978-3-319-97909-0_77]
Van den Oord A, Li Y Z and Vinyals O. 2019. Representation learning with contrastive predictive coding [EB/OL]. [2024-01-15]. https://arxiv.org/pdf/1807.03748.pdf
Wang D H, Wang W L and Qian J J. 2010. 2DPCA and IMLDA method of feature extraction for online handwritten Tibetan recognition//Proceedings of 2010 International Conference on Networking and Digital Society. Wenzhou, China: IEEE: 563-566 [DOI: 10.1109/ICNDS.2010.5479269]
Wang D W. 2021. Research and Application of Online Handwriting Recognition in Yi. Chongqing: Southwest University
王定旺. 2021. 彝文联机手写体识别的研究与应用. 重庆: 西南大学 [DOI: 10.27684/d.cnki.gxndx.2021.003069]
Wang H and Ding X Q. 2004a. A normalization method of multi-font printed Tibetan characters. Application Research of Computers, 21(6): 41-43
王华, 丁晓青. 2004a. 一种多字体印刷藏文字符的归一化方法. 计算机应用研究, 21(6): 41-43 [DOI: 10.3969/j.issn.1001-3695.2004.06.014]
Wang H and Ding X Q. 2004b. An algorithm for multi-font printed Tibetan character recognition. Computer Engineering, 30(13): 18-20
王华, 丁晓青. 2004b. 一种多字体印刷藏文字符识别方法. 计算机工程, 30(13): 18-20 [DOI: 10.3969/j.issn.1000-3428.2004.13.008]
Wang H Y, Wang H J and Xu X L. 2016. Recognition of Naxi Dongba pictographs based on support vector machine. Journal of Yunnan University(Natural Sciences Edition), 38(5): 730-736
王海燕, 王红军, 徐小力. 2016. 基于支持向量机的纳西东巴象形文字符识别. 云南大学学报(自然科学版), 38(5): 730-736 [DOI: 10.7540/j.ynu.20150757]
Wang J M, Wen Y H, Li Y Q and Gao Y L. 2008. The recognition system of Old-Yi character based on the image segmentation. Journal of Yunnan Nationalities University (Natural Sciences Edition), 17(1): 76-79
王嘉梅, 文永华, 李燕青, 高雅莉. 2008. 基于图像分割的古彝文字识别系统研究. 云南民族大学学报(自然科学版), 17(1): 76-79 [DOI: 10.3969/j.issn.1672-8513.2008.01.019]
Wang W L, Ding X Q, Chen L and Wang H. 2003. Study on printed Tibetan character recognition. Computer Engineering, 29(3): 37-38, 94
王维兰, 丁晓青, 陈力, 王华. 2003. 印刷体现代藏文识别研究. 计算机工程, 29(3): 37-38, 94 [DOI: 10.3969/j.issn.1003028.2003.03.014]
Wang W L, Li Z J, Cai Z Q, Lv X B, Zhaxi C and Han Y H. 2019. Online Tibetan handwriting recognition for large character set on new databases. International Journal of Pattern Recognition and Artificial Intelligence, 33(10): #1953003 [DOI: 10.1142/S0218001419530033]
Wang W L, Lu X B, Cai Z Q, Shen W T, Fu J and Caike Z X. 2017. Online handwritten sample generated based on component combination for Tibetan-Sanskrit. Journal of Chinese Information Processing, 31(5): 64-73
王维兰, 卢小宝, 蔡正琦, 沈文韬, 付吉, 才科扎西. 2017. 基于部件组合的联机手写“藏文—梵文”样本生成. 中文信息学报, 31(5): 64-73 [DOI: 10.3969/j.issn.1003-0077.2017.05.009]
Wang Y N, Huaque C R, Cairang D Z and Huan K Y. 2021. Application of Tibetan image text recognition system in android—Based on mixed attention mechanism neural network model. Journal of Qinghai Normal University (Natural Science), 37(4): 26-33
王悦凝, 华却才让, 才让当知, 环科尤. 2021. 藏文图像文本识别在安卓系统中的应用——基于混合注意力机制神经网络模型. 青海师范大学学报(自然科学版), 37(4): 26-33 [DOI: j.cnki.ssn1001-7542.2021.04.013]
Wang Y Q, Wang W L and Cai Z Q. 2022. Text region extraction method for historical Tibetan document based on border detection//Proceedings Volume 12172, International Conference on Electronic Information Engineering and Computer Communication (EIECC 2021). Nanchang, China: SPIE: 65-72 [DOI: 10.1117/12.2634657]
Wang Y W, Ao N X, Guo R, Mamat H and Ubul K. 2022. Scene Uyghur recognition with embedded coordinate attention//Proceedings of the 3rd International Conference on Pattern Recognition and Machine Learning (PRML). Chengdu, China: IEEE: 253-260 [DOI: 10.1109/PRML56267.2022.9882248]
Wang Z W, Lu S Y, Wang M Q, Wei X and Qi Y J. 2023. AMRE: an attention-based CRNN for Manchu word recognition on a woodblock-printed dataset//Proceedings of the 29th International Conference on Neural Information Processing. Virtual Event: Springer: 267-278 [DOI: 10.1007/978-3-031-30108-7_23]
Wang Z X, Xie H T, Wang Y X and Zhang Y D. 2023. Hierarchical semantics-fused scene text detection. Journal of Image and Graphics, 28(8): 2343-2355
王紫霄, 谢洪涛, 王裕鑫, 张勇东. 2023. 层级语义融合的场景文本检测. 中国图象图形学报, 28(8): 2343-2355 [DOI: 10.11834/jig.220902]
Wei H X and Gao G L. 2006. Feature selection of Mongolian characters in the recognition of printed Mongolian characters. Acta Scientiarum Naturalium Universitatis Neimongol, 37(6): 694-697
魏宏喜, 高光来. 2006. 印刷体蒙古文字识别中蒙古文字特征的选择. 内蒙古大学学报(自然科学版), 37(6): 694-697 [DOI: 10.3969/j.issn.1000-1638.2006.06.021]
Wei H X and Gao G L. 2007a. A method of layout analysis for Mongolian document images based on connected components. Journal of Inner Mongolia University, 38(5): 586-590
魏宏喜, 高光来. 2007a. 一种基于连通域的蒙古文文档图像版面分析方法. 内蒙古大学学报(自然科学版), 38(5): 586-590 [DOI: 10.3969/j.issn.1000-1638.2007.05.022]
Wei H X and Gao G L. 2007b. A skew detection method of Mongolian document images. Journal of Inner Mongolia University, 38(4): 458-462
魏宏喜, 高光来. 2007b. 蒙文文档图像的倾斜检测方法. 内蒙古大学学报(自然科学版), 38(4): 458-462 [DOI: 10.3969/j.issn.1000-1638.2007.04.018]
Wei H X and Gao G L. 2019. A holistic recognition approach for woodblock-print Mongolian words based on convolutional neural network//Proceedings of 2019 IEEE International Conference on Image Processing. Taipei, China: IEEE: 2726-2730 [DOI: 10.1109/ICIP.2019.8803226]
Wei H X, Liu C, Zhang H, Bao F L and Gao G L. 2019. End-to-end model for offline handwritten Mongolian word recognition//Proceedings of the 8th CCF International Conference on Natural Language Processing and Chinese Computing. Dunhuang, China: Springer: 220-230 [DOI: 10.1007/978-3-030-32236-6_19]
Wei H X, Liu K X, Zhang J and Fan D E J. 2021a. Data augmentation based on CycleGAN for improving woodblock-printing Mongolian words recognition//Proceedings of the 16th International Conference on Document Analysis and Recognition. Lausanne, Switzerland: Springer: 526-537 [DOI: 10.1007/978-3-030-86337-1_35]
Wei H X, Zhang H, Zhang J and Liu K X. 2021b. Multi-task learning based traditional Mongolian words recognition//Proceedings of the 25th International Conference on Pattern Recognition. Milan, Italy: IEEE: 1275-1281 [DOI: 10.1109/ICPR48806.2021.9412326]
Wu G X, Liu X L, Jiang Z L and Hua R X. 2019. Dongba classical ancient books image classification method based on ReN-Soft plus convolution residual neural network//Proceedings of the 14th IEEE International Conference on Electronic Measurement and Instruments (ICEMI). Changsha, China: IEEE: 398-404 [DOI: 10.1109/ICEMI46757.2019.9101450]
Wu J J, Zhao K, Yang Z Y, Yin B, Liu C and Dai L R. 2023. End-to-end multilingual text recognition based on byte modeling//Proceedings of the 12th International Conference on Image and Graphics. Nanjing, China: Springer: 128-137 [DOI: 10.1007/978-3-031-46311-2_11]
Wu Y F. 2015. Manchu Archives and Historical Exploration. Shenyang: Liaoning Minzu Press
吴元丰. 2015. 满文档案与历史探究. 沈阳: 辽宁民族出版社
Xamxidin N, Mahpirat, Yao Z X, Aysa A and Ubul K. 2022. Multilingual offline signature verification based on improved inverse discriminator network. Information, 13(6): #293 [DOI: 10.3390/info13060293]
Xie Y R and Dong J E. 2021. Research on Dongba hieroglyph recognition using ResNet network. Computer Era, (1): 6-10
谢裕睿, 董建娥. 2021. 基于ResNet网络的东巴象形文字识别研究. 计算机时代, (1): 6-10 [DOI: 10.16644/j.cnki.cn33-1094/tp.2021.01.002]
Xiong L J, Silamu W and Xu M M. 2021. Design and implementation of printed Uyghur recognition system based on Django. Journal of Zhengzhou University(Natural Science Edition), 53(3): 9-14
熊黎剑, 吾守尔·斯拉木, 许苗苗. 2021. 基于Django印刷体维吾尔文识别系统的设计与实现. 郑州大学学报(理学版), 53(3): 9-14 [DOI: 10.13705/j.issn.1671-6841.2020277]
Xu S, Li M, Zheng R R and Michael S. 2017. Manchu character segmentation and recognition method. Journal of Discrete Mathematical Sciences and Cryptography, 20(1): 43-53 [DOI: 10.1080/09720529.2016.1177965]
Xu X L, Jiang Z L, Wu G X, Wang H J and Wang N. 2017. Identification method of Dongba pictograph based on topological characteristic and projection method. Journal of Electronic Measurement and Instrumentation, 31(1): 150-154
徐小力, 蒋章雷, 吴国新, 王红军, 王宁. 2017. 基于拓扑特征和投影法的东巴象形文识别方法研究. 电子测量与仪器学报, 31(1): 150-154 [DOI: 10.13382/j.jemi.2017.01.022]
Xu Y M and Du P P. 2017. Offline handwritten Uighur character recognition based on grapheme analysis//Proceedings of the 8th IEEE International Conference on Software Engineering and Service Science (ICSESS). Beijing, China: IEEE: 832-835 [DOI: 10.1109/ICSESS.2017.8343040]
Xu Y M, Lu Z Y and Li J. 2013. Handwritten Uyghur character recognition based on radical dictionary and time division direction feature. Journal of Jilin University(Engineering and Technology Edition), 43(3): 740-746
许亚美, 卢朝阳, 李静. 2013. 部件字典结合时分方向特征的手写维吾尔字符识别. 吉林大学学报(工学版), 43(3): 740-746 [DOI: 10.7964/jdxbgxb201303030]
Xu Y M and Xue J L. 2019. Offline handwritten Uighur word recognition based on segmentation-driven and two-level DTW//Proceedings of the 2nd IEEE International Conference on Computer and Communication Engineering Technology (CCET). Beijing, China: IEEE: 182-186 [DOI: 10.1109/CCET48361.2019.8989253]
Yang C, Du J, Xue M B and Zhang J S. 2023. An encoder-decoder based generation model for online handwritten mathematical expressions. Journal of Image and Graphics, 28(8): 2356-2369
杨晨, 杜俊, 薛莫白, 张建树. 2023. 用于在线手写公式合成的编解码网络. 中国图象图形学报, 28(8): 2356-2369 [DOI: 10.11834/jig.220894]
Yang C, Liu C, Fang Z Y, Han Z, Liu C L and Yin X C. 2023. Open set text recognition technology. Journal of Image and Graphics, 28(6): 1767-1791
杨春, 刘畅, 方治屿, 韩铮, 刘成林, 殷绪成. 2023. 开放集文字识别技术. 中国图象图形学报, 28(6): 1767-1791 [DOI: 10.11834/jig.230018]
Yang Y T and Kang H L. 2018. A novel algorithm of contour tracking and partition for Dongba hieroglyph//Proceedings of the 13th Conference on Image and Graphics Technologies and Applications. Beijing, China: Springer: 157-167 [DOI: 10.1007/978-981-13-1702-6_16]
Yang Y T and Kang H L. 2019a. Research on the extracting algorithm of Dongba hieroglyphic feature curves. Journal of Graphics, 40(3): 591-599
杨玉婷, 康厚良. 2019a. 东巴象形文字特征曲线提取算法研究. 图学学报, 40(3): 591-599 [DOI: 10.11996/JG.j.2095-302X.2019030591]
Yang Y T and Kang H L. 2019b. Dongba hieroglyphic classification algorithm based on grid resolution. Software Guide, 18(9): 196-198
杨玉婷, 康厚良. 2019b. 基于网格分解的东巴象形文字分类算法研究. 软件导刊, 18(9): 196-198 [DOI: 10.11907/rjdk.181810]
Yimiti A, Liu J C and Sulaiman D. 2014. Study on character segmentation and recognition technology of Uyghur in image with complex background. Journal of Xinjiang Normal University (Natural Sciences Edition), 33(1): 65-68
阿地力·依米提, 刘吉超, 杜力坤·苏来曼. 2014. 复杂背景图像中维吾尔文字切分与识别技术的研究. 新疆师范大学学报(自然科学版), 33(1): 65-68 [DOI: 10.14100/j.cnki.1008-9659.2014.01.015]
Zhang C and Wang W L. 2021. Character segmentation for historical Uchen Tibetan document based on structure attributes. Laser & Optoelectronics Progress, 58(20): #2010020
张策, 王维兰. 2021. 基于结构属性的乌金体藏文古籍字符切分. 激光与光电子学进展, 58(20): #2010020 [DOI: 10.3788/LOP202158.2010020]
Zhang C, Wang W L and Zhang G W. 2022a. Construction of a character dataset for historical Uchen Tibetan documents under low-resource conditions. Electronics, 11(23): #3919 [DOI: 10.3390/electronics11233919]
Zhang C, Wang W L, Liu H M, Zhang G W and Lin Q. 2022b. Character detection and segmentation of historical Uchen Tibetan documents in complex situations. IEEE Access, 10: 25376-25391 [DOI: 10.1109/ACCESS.2022.3151886]
Zhang G W, Wang W L, Zhang C, Zhao P H and Zhang M K. 2023. HUTNet: an efficient convolutional neural network for handwritten Uchen Tibetan character recognition. Big Data, 11(5): 387-398 [DOI: 10.1089/big.2021.033]
Zhang G Y, Li J J, He R W and Wang A X. 2004. An offline recognition method of handwritten primitive Manchu characters based on strokes//Proceedings of the 9th International Workshop on Frontiers in Handwriting Recognition. Kokubunji, Japan: IEEE: 432-437 [DOI: 10.1109/IWFHR.2004.16]
Zhang G Y, Li J J and Wang A X. 2006. A new recognition method for the handwritten Manchu character unit//Proceedings of 2006 International Conference on Machine Learning and Cybernetics. Dalian, China: IEEE: 3339-3344 [DOI: 10.1109/ICMLC.2006.258471]
Zhang H, Wei H X, Bao F L and Gao G L. 2017. Segmentation-free printed traditional Mongolian OCR using sequence to sequence with attention model//Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition. Kyoto, Japan: IEEE: 585-590 [DOI: 10.1109/ICDAR.2017.101]
Zhang H R, Chen W, Su X D, Guo H and Xu H L. 2021. An efficient local word augment approach for Mongolian handwritten script recognition//Proceedings of the 16th International Conference on Document Analysis and Recognition. Lausanne, Switzerland: Springer: 429-443 [DOI: 10.1007/978-3-030-86337-1_29]
Zhang J, Xu S, He J J, Li M and Zheng R R. 2019. Segmentation and extraction method for Manchu words based on seam carving. Journal of Chinese Information Processing, 33(2): 81-88
张晶, 许爽, 贺建军, 李敏, 郑蕊蕊. 2019. 基于缝隙裁剪的满文单词分割和提取方法研究. 中文信息学报, 33(2): 81-88 [DOI: 10.3969/j.issn.1003-0077.2019.02.011]
Zhang J X, Luo C J, Jin L W, Guo F J and Ding K. 2022. Marior: margin removal and iterative content rectification for document dewarping in the wild [EB/OL]. [2024-01-15]. https://arxiv.org/pdf/2207.11515.pdf
Zhang S J, Mahpirat, Mamat H, Zhu Y L and Ubul K. 2020. Uyghur offline handwritten signature verification based on texture features. Computer Engineering and Design, 41(3): 770-776
张淑婧, 麦合甫热提, 吾尔尼沙·买买提, 朱亚俐, 库尔班·吾布力. 2020. 基于纹理特征的维吾尔文离线手写签名鉴别. 计算机工程与设计, 41(3): 770-776 [DOI: 10.16208/j.issn1000-7024.2020.03.027]
Zhang X Q, Ma L L, Duan L J, Liu Z Y and Wu J. 2018. Layout analysis for historical Tibetan documents based on convolutional denoising autoencoder. Journal of Chinese Information Processing, 32(7): 67-73, 81
张西群, 马龙龙, 段立娟, 刘泽宇, 吴健. 2018. 基于卷积降噪自编码器的藏文历史文献版面分析方法. 中文信息学报, 32(7): 67-73, 81 [DOI: 10.3969/j.issn.1003-0077.2018.07.009]
Zhang Y, Li Q, Shen H W, Zeng G Y, Zhou Y, Ma C, Zhang Y and Wang W P. 2023. Text-centric image analysis techniques: a critical review. Journal of Image and Graphics, 28(8): 2253-2275
张言, 李强, 申化文, 曾港艳, 周宇, 马灿, 张远, 王伟平. 2023. 以文字为中心的图像理解技术综述. 中国图象图形学报, 28(8): 2253-2275 [DOI: 10.11834/jig.220968]
Zhao P H, Wang W L, Zhang G W and Lu Y Q. 2021a. Alleviating pseudo-touching in attention U-Net-based binarization approach for the historical Tibetan document images. Neural Computing and Applications, 2021,3519: 13791-13802 [DOI: 10.1007/S00521-021-06512-7]
Zhao P H, Wang W L, Cai Z Q, Zhang G W and Lu Y Q. 2021b. Accurate fine-grained layout analysis for the historical Tibetan document based on the instance segmentation. IEEE Access, 9: 154435-154447 [DOI: 10.1109/ACCESS.2021.3128536]
Zhao J, Li J J, Zhang G Y and Wang J. 2006a. Design and implementation of off-line handwritten document recognition system of Manchu manuscript. Pattern Recognition and Artificial Intelligence, 19(6): 801-805
赵骥, 李晶皎, 张广渊, 王杰. 2006a. 脱机手写体满文文本识别系统的设计与实现. 模式识别与人工智能, 19(6): 801-805 [DOI: 10.3969/j.issn.1003-60592006.06.020]
Zhao J, Li J J, Wang L J and Zhang J S. 2006b. Research on the post-processing of Manchu character recognition based on hidden Markov model. Journal of Chinese Information Processing, 20(4): 63-67
赵骥, 李晶皎, 王丽君, 张继生. 2006b. 基于HMM的满文文本识别后处理的研究. 中文信息学报, 20(4): 63-67 [DOI: 10.3969/j.issn.1003-0077.2006.04.009]
Zhao Q H and Wang W L. 2023. Zero-RADCE: zero-reference residual attention deep curve estimation for low-light historical Tibetan document image enhancement. Visual Communications and Image Processing, 2(1): 1-8 [DOI: 10.23977/vcip.2023.020101]
Zhao Q H, Wang W L and Yu Y Y. 2022. Retinex-LTNet: low-light historical Tibetan document image enhancement based on improved Retinex-Net//Proceedings of the 4th International Conference on Robotics, Intelligent Control and Artificial Intelligence (RICAI’22). Dongguan, China: Association for Computing Machinery: 785-791 [DOI: 10.1145/3584376.3584516]
Zhao Y C. 2012. Research on Manchu Archives. Shanghai: World Publishing Corporation
赵彦昌. 2012. 满文档案研究. 上海: 世界图书出版公司
Zhao Y C and Su Y Y. 2017. A review of the arrangement and research of Manchu archives in the 21st century. Manzu Minority Research, (3): 55-61, 73
赵彦昌, 苏亚云. 2017. 21世纪以来满文档案整理与研究述评. 满族研究, (3): 55-61, 73 [DOI: 10.3969/j.issn.1006-365X.2010.04.011]
Zhao Y C and Wang H J. 2010. Research on the development and utilization of Manchu archives. Manzu Minority Research, (4): 47-52
赵彦昌, 王红娟. 2010. 满文档案开发利用研究. 满族研究, (4): 47-52
Zheng R R, Li M, He J J, Bi J J and Wu B C. 2018. Segmentation-free multi-font printed Manchu word recognition using deep convolutional features and data augmentation//Proceedings of the 11th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics. Beijing, China: IEEE: 1-6 [DOI: 10.1109/CISP-BMEI.2018.8633208]
Zheng R R, Xin S Y, Zhou Y, Liu W P, Dang J W and He J J. 2021. A K-shot Manchu recognition method for large category based on N-ary ECOC. Journal of Zhengzhou University (Natural Science Edition), 53(4): 53-60
郑蕊蕊, 辛守宇, 周瑜, 刘文鹏, 党佳伟, 贺建军. 2021. 一种基于N元ECOC的大类别K-shot满文识别方法. 郑州大学学报(理学版), 53(4): 53-60 [DOI: 10.13705/j.issn.1671-6841.2021178]
Zheng Y T, Li X L, Yin Z X, Gao G and Weng Y. 2023. Multi-feature fusion based automatic reconstruction in related to Chinese ancient manuscript fragments of Dunhuang. Journal of Image and Graphics, 28(8): 2330-2342
郑玉彤, 李雪龙, 殷梓轩, 高歌, 翁彧. 2023. 多特征融合的敦煌古籍残片自动缀合. 中国图象图形学报, 28(8): 2330-2342 [DOI: 10.11834/jig.220896]
Zhi X X, Gao D G, Zhao Q J, Li S W and Qu C. 2021. Text detection in Tibetan ancient books: a benchmark//Proceedings of the 2nd IEEE International Conference on Pattern Recognition and Machine Learning (PRML). Chengdu, China: IEEE: 254-259 [DOI: 10.1109/PRML52754.2021.9520727]
Zhou F M, Wang W L and Lin Q. 2018. A novel text line segmentation method based on contour curve tracking for Tibetan historical documents. International Journal of Pattern Recognition and Artificial Intelligence, 32(10): #1854025 [DOI: 10.1142/S0218001418540253]
Zhu L H and Wang J M. 2010. Off-line handwritten Yi character recognition based on the multi-classifier ensemble with combination features. Journal of Yunnan University of Nationalities (Natural Sciences Edition), 19(5): 329-333
朱龙华, 王嘉梅. 2010. 基于组合特征的多分类器集成的脱机手写体彝文字识别. 云南民族大学学报(自然科学版), 19(5): 329-333 [DOI: 10.3969/j.issn.1672-8513.2010.05.005]
Zhu Z X and Wu X L. 2012. Principles and implementation of an off-line printed Yi character recognition system. Computer Technology and Development, 22(2): 85-88, 92
朱宗晓, 吴显礼. 2012. 脱机印刷体彝族文字识别系统的原理与实现. 计算机技术与发展, 22(2): 85-88, 92 [DOI: 10.3969/j.issn.1673-629X.2012.02.022]