Open set text recognition technology

Yang Chun; Liu Chang; Fang Zhiyu; Han Zheng; Liu Chenglin; Yin Xucheng

doi:10.11834/jig.230018

Pattern Recognition and Intelligent Visualization

2023, Vol. 28, No. 6, pp. 1767-1791
Print publication date: 2023-06-16
DOI: 10.11834/jig.230018

Yang Chun, Liu Chang, Fang Zhiyu, Han Zheng, Liu Chenglin, Yin Xucheng. 2023. Open set text recognition technology. Journal of Image and Graphics, 28(06):1767-1791. DOI: 10.11834/jig.230018.

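Summary

In pattern recognition and text recognition applications in open environments, new data, new patterns, and new categories keep emerging, so algorithms must be able to handle patterns from novel categories. To address this, researchers have begun to focus on the open-set text recognition (OSTR) task. The task requires that, at test (inference) time, an algorithm recognize the text categories seen in the training set and also recognize, reject, or discover novel characters not seen in the training set. Open-set text recognition is gradually becoming one of the research hotspots in the text recognition field. This paper first briefly summarizes open-set pattern recognition techniques, then focuses on the research background, task definition, basic concepts, research emphases, and technical difficulties of open-set text recognition. For the three major problems of open-set text recognition (unknown sample discovery, novel category recognition, and contextual information bias), related work is reviewed from the perspectives of model structure, characteristics and advantages, and application scenarios. Finally, the development trends and research directions of open-set text recognition are analyzed and discussed.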

Abstract

Text recognition transcribes the text in images and underpins applications such as document digitization, content moderation, scene text translation, autonomous driving, and scene understanding. Conventional text recognition methods mostly focus on recognizing characters seen during training. However, two kinds of data are not well covered by the training sets of these methods: novel character categories and out-of-vocabulary (OOV) samples. Samples that contain novel characters are OOV samples, but OOV samples may also consist solely of seen characters in novel combinations or contexts. Novel character categories arise in internet environments (e.g., unseen ligatures, emoticons, and unfamiliar languages), in scene text recognition (e.g., characters from foreign and region-specific languages), and in digitization, where previously undiscovered characters may appear. Meanwhile, because language usage is heterogeneous, linguistic statistics (e.g., n-gram, context, etc.) can deviate from those of the training data, which challenges text recognition methods that rely heavily on vocabulary.

These two factors give rise to three key scientific problems that affect cost or efficiency in open-world applications. First, novel characters call for a novel-character spotting capability, so that unseen characters are rejected rather than silently replaced with seen characters; this relates to the popular open-set recognition problem. Second, novel characters keep emerging in many cases, and re-training upon each occurrence is costly, so an incremental recognition capability is needed; this has received some attention as the generalized zero-shot text recognition task. Third, OOV samples call for robustness to linguistic bias. Because they predict character by character, most popular methods can handle OOV samples composed of seen characters to some extent. However, this capability is limited, and such methods show strong vocabulary reliance owing to the capacity of their language models. The open-set text recognition (OSTR) task is motivated by the fact that existing tasks such as zero-shot text recognition and OOV recognition each model only individual aspects of these problems. The task aims to spot and recognize novel characters while remaining robust to linguistic skew. As an extension of conventional text recognition, the OSTR task also requires retaining decent recognition capability on seen content. In recent years, the OSTR task has drawn growing attention in the text recognition field. This paper reviews the OSTR task and its related domains in five aspects: background, genericity, concept, implementation, and summary.
For the background, we introduce the application background of the OSTR task and analyze specific cases that call for it. For genericity, generic open-set recognition is briefly introduced as a preliminary, since it is less familiar to some researchers in the text recognition field. For the concept, the definition of the OSTR task is given, followed by a discussion of its relationship with existing text recognition tasks, e.g., the conventional close-set text recognition task and the zero-shot text recognition task. For implementation, common text recognition frameworks are introduced first; OSTR methods can then be viewed as derivations of these frameworks, organized around the three key scientific problems: new category spotting, incremental recognition of novel classes, and linguistic bias robustness. Specifically, new category spotting refers to rejecting samples that come from a class absent from a given label set. Slightly different from the generic open-set recognition task, the given label set is not directly tied to the training data. Incremental recognition refers to recognizing new categories from the side information of the corresponding categories without re-training. This definition is slightly different from the common zero-shot learning definition in that it excludes some generative adversarial network (GAN)-based transductive approaches. Linguistic bias robustness keeps its original definition, with more stress on unseen characters. For each scientific problem, solutions from text recognition and from related fields with similar formulations are covered.
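
The first two problems above can be illustrated with a minimal sketch. It is not a method from the surveyed literature: it assumes that each class in the label set is represented by a prototype vector (in practice derived from side information such as a glyph image), that matching uses cosine similarity, and that a fixed threshold decides rejection. The names `recognize`, `prototypes`, `threshold`, and the `[UNK]` label are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recognize(feature, prototypes, threshold=0.8, unknown="[UNK]"):
    """Return the best-matching label from the given label set, or `unknown`
    when no prototype is similar enough (new category spotting)."""
    best_label, best_score = unknown, -1.0
    for label, proto in prototypes.items():
        score = cosine(feature, proto)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else unknown


# Hypothetical 3-d prototypes for two seen classes (assumed values).
prototypes = {"a": [1.0, 0.0, 0.0], "b": [0.0, 1.0, 0.0]}

# A character feature from a class outside the label set is rejected.
novel_feature = [0.1, 0.1, 1.0]
before = recognize(novel_feature, prototypes)

# Incremental recognition: extend the label set with a new prototype,
# without re-training anything.
prototypes["c"] = [0.0, 0.0, 1.0]
after = recognize(novel_feature, prototypes)
```

In this sketch the label set is just the keys of `prototypes`, so it can be changed at test time independently of whatever data produced the feature extractor.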
The evaluation part covers the datasets and protocols used in the OSTR task and related tasks: 1) publicly available datasets used by multiple protocols, 2) commonly used metrics for measuring model performance, and 3) several popular protocols together with typical methods and their performance. Here, a protocol refers to the composition of training sets, testing sets, and evaluation metrics. For the summary, the development and technical preferences of existing methods are analyzed comparatively. Finally, trends and future research directions are discussed.
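
The notion of a protocol as a composition of training sets, testing sets, and evaluation metrics can be written down as a small data structure. The sketch below is only an illustration: the dataset names are placeholders, and line accuracy (exact match of the whole transcription) is an assumed example metric, not one named in this abstract.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


def line_accuracy(predictions: List[str], targets: List[str]) -> float:
    """Fraction of samples whose predicted string matches the target exactly."""
    if not targets:
        return 0.0
    return sum(p == t for p, t in zip(predictions, targets)) / len(targets)


@dataclass
class Protocol:
    """A protocol bundles training sets, testing sets, and metrics."""
    name: str
    train_sets: List[str]
    test_sets: List[str]
    metrics: Dict[str, Callable[[List[str], List[str]], float]]

    def evaluate(self, predictions: List[str], targets: List[str]) -> Dict[str, float]:
        """Apply every metric of the protocol to the same predictions."""
        return {name: fn(predictions, targets) for name, fn in self.metrics.items()}


# Hypothetical protocol; the dataset names are placeholders.
protocol = Protocol(
    name="toy-open-set",
    train_sets=["hypothetical_seen_script_train"],
    test_sets=["hypothetical_novel_script_test"],
    metrics={"line_accuracy": line_accuracy},
)
scores = protocol.evaluate(["abc", "[UNK]x"], ["abc", "yx"])
```

Keeping the metrics inside the protocol object makes it explicit that results obtained under different protocols are not directly comparable.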

Keywords

character recognition; open set recognition; open-set text recognition (OSTR); close-set text recognition; zero-shot text recognition

references

Almaz􀅡n J， Gordo A， Fornés A and Valveny E. 2014. Word spotting and recognition with embedded attributes. IEEE Transactions on Pattern Analysis and Machine Intelligence， 36（12）： 2552-2566 ［DOI： 10.1109/TPAMI.2014.2339814http://dx.doi.org/10.1109/TPAMI.2014.2339814］

Ao X， Zhang X Y， Yang H M， Yin F and Liu C L. 2019. Cross-modal prototype learning for zero-shot handwriting recognition//Proceedings of 2019 International Conference on Document Analysis and Recognition. Sydney， Australia： IEEE： 589-594 ［DOI： 10.1109/ICDAR.2019.00100http://dx.doi.org/10.1109/ICDAR.2019.00100］

Atienza R. 2021. Vision transformer for fast and efficient scene text recognition//Proceedings of the 16th International Conference on Document Analysis and Recognition. Lausanne， Switzerland： Springer： 319-334 ［DOI： 10.1007/978-3-030-86549-8_21http://dx.doi.org/10.1007/978-3-030-86549-8_21］

Baek J， Kim G， Lee J， Park S， Han D， Yun S， Oh S J and Lee H. 2019. What is wrong with scene text recognition model comparisons？ Dataset and model analysis//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul， Korea （South）： IEEE： 4714-4722 ［DOI： 10.1109/ICCV.2019.00481http://dx.doi.org/10.1109/ICCV.2019.00481］

Bao W T， Yu Q and Kong Y. 2022. OpenTAL： towards open set temporal action localization//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans， USA： IEEE： 2969-2979 ［DOI： 10.1109/CVPR52688.2022.00299http://dx.doi.org/10.1109/CVPR52688.2022.00299］

Bendale A and Boult T. 2015. Towards open world recognition//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston， USA： IEEE： 1893-1902 ［DOI： 10.1109/CVPR.2015.7298799http://dx.doi.org/10.1109/CVPR.2015.7298799］

Bertinetto L， Henriques J F， Valmadre J， Torr P H S and Vedaldi A. 2016. Learning feed-forward one-shot learners//Proceedings of the 30th International Conference on Neural Information Processing Systems. Barcelona， Spain： Curran Associates Inc.： 523-531

Borisyuk F， Gordo A and Sivakumar V. 2018. Rosetta： large scale system for text detection and recognition in images//Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. London， UK： ACM： 71-79 ［DOI： 10.1145/3219819.3219861http://dx.doi.org/10.1145/3219819.3219861］

Cao Z， Lu J， Cui S and Zhang C S. 2020. Zero-shot handwritten Chinese character recognition with hierarchical decomposition embedding. Pattern Recognition， 107： #107488 ［DOI： 10.1016/j.patcog.2020.107488http://dx.doi.org/10.1016/j.patcog.2020.107488］

Chanda S， Baas J， Haitink D， Hamel S， Stutzmann D and Schomaker L. 2018. Zero-shot learning based approach for medieval word recognition using deep-learned features//Proceedings of the 16th International Conference on Frontiers in Handwriting Recognition. Niagara Falls， USA： IEEE： 345-350 ［DOI： 10.1109/ICFHR-2018.2018.00067http://dx.doi.org/10.1109/ICFHR-2018.2018.00067］

Chanda S， Haitink D， Prasad P K， Baas J， Pal U and Schomaker L. 2021. Recognizing bengali word images——A zero-shot learning perspective//Proceedings of the 25th International Conference on Pattern Recognition. Milan， Italy： IEEE： 5603-5610 ［DOI： 10.1109/ICPR48806.2021.9412607http://dx.doi.org/10.1109/ICPR48806.2021.9412607］

Chen C F， Yang X S， Xu C S， Huang X H and Ma Z. 2021a. ECKPN： explicit class knowledge propagation network for transductive few-shot learning//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville， USA： IEEE： 6596-6605 ［DOI： 10.1109/cvpr46437.2021.00653http://dx.doi.org/10.1109/cvpr46437.2021.00653］

Chen G Y， Qiao L M， Shi Y M， Peng P X， Li J， Huang T J， Pu S L and Tian Y H. 2020. Learning open set network with discriminative reciprocal points//Proceedings of the 16th European Conference on Computer Vision. Glasgow， UK： Springer： 507-522 ［DOI： 10.1007/978-3-030-58580-8_30http://dx.doi.org/10.1007/978-3-030-58580-8_30］

Chen J Y， Li B and Xue X Y. 2021b. Zero-shot Chinese character recognition with stroke-level decomposition//Proceedings of the 30th International Joint Conference on Artificial Intelligence. Montreal， Canada： IJCAI.org： 615-621 ［DOI： 10.24963/ijcai.2021/85http://dx.doi.org/10.24963/ijcai.2021/85］

Chen J Y， Li B and Xue X Y. 2021c. Scene text telescope： text-focused scene image super-resolution//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville， USA： IEEE： 12021-12030 ［DOI： 10.1109/cvpr46437.2021.01185http://dx.doi.org/10.1109/cvpr46437.2021.01185］

Chen X X， Jin L W， Zhu Y Z， Luo C J and Wang T W. 2022. Text recognition in the wild： a survey. ACM Computing Surveys， 54（2）： #42 ［DOI： 10.1145/3440756http://dx.doi.org/10.1145/3440756］

Chen Z T， Fu Y W， Zhang Y D， Jiang Y G， Xue X Y and Sigal L. 2019. Multi-level semantic feature augmentation for one-shot learning. IEEE Transactions on Image Processing， 28（9）： 4594-4605 ［DOI： 10.1109/TIP.2019.2910052http://dx.doi.org/10.1109/TIP.2019.2910052］

Cheng Z Z， Xu Y L， Bai F， Niu Y， Pu S L and Zhou S G. 2018. AON： towards arbitrarily-oriented text recognition//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City， USA： IEEE： 5571-5579 ［DOI： 10.1109/cvpr.2018.00584http://dx.doi.org/10.1109/cvpr.2018.00584］

Chng C K， Liu Y L， Sun Y P， Ng C C， Luo C J， Ni Z H， Fang C M， Zhang S T， Han J Y， Ding E R， Liu J T， Karatzas D， Chan C S and Jin L W. 2019. ICDAR2019 robust reading challenge on arbitrary-shaped text——RRC-ArT//Proceedings of 2019 International Conference on Document Analysis and Recognition. Sydney， Australia： IEEE： 1571-1576 ［DOI： 10.1109/icdar.2019.00252http://dx.doi.org/10.1109/icdar.2019.00252］

Devlin J， Chang M W， Lee K and Toutanova K. 2019. BERT： pre-training of deep bidirectional transformers for language understanding//Proceedings of 2019 Conference of the North American Chapter of the Association for Computational Linguistics： Human Language Technologies. Minneapolis， USA： ACL： 4171-4186 ［DOI： 10.18653/v1/n19-1423http://dx.doi.org/10.18653/v1/n19-1423］

Diao X L， Shi D Q， Tang H， Wu L， Li Y Z and Xu H. 2022. REZCR： a zero-shot character recognition method via radical extraction ［EB/OL］. ［2022-08-17］. https://arxiv.org/pdf/2207.05842.pdfhttps://arxiv.org/pdf/2207.05842.pdf

Ding C B， Pang G S and Shen C H. 2022. Catching both gray and black swans： open-set supervised anomaly detection//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. New Orleans， USA： IEEE： 7378-7388 ［DOI： 10.1109/CVPR52688.2022.00724http://dx.doi.org/10.1109/CVPR52688.2022.00724］

Doan T and Kalita J. 2017. Overcoming the challenge for text classification in the open world//Proceedings of the 7th IEEE Annual Computing and Communication Workshop and Conference （CCWC）. Las Vegas， USA： IEEE： 1-7 ［DOI： 10.1109/CCWC.2017.7868366http://dx.doi.org/10.1109/CCWC.2017.7868366］

Du Y， Wei F Y， Zhang Z H， Shi M J， Gao Y and Li G Q. 2022. Learning to prompt for open-vocabulary object detection with vision-language model//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. New Orleans， USA： IEEE： 14064-14073 ［DOI： 10.1109/CVPR52688.2022.01369http://dx.doi.org/10.1109/CVPR52688.2022.01369］

Egglin T K and Feinstein A R. 1996. Context bias. A problem in diagnostic radiology. JAMA， 276（21）： 1752-1755 ［DOI： 10.1001/jama.276.21.1752http://dx.doi.org/10.1001/jama.276.21.1752］

Fang S C， Xie H T， Wang Y X， Mao Z D and Zhang Y D. 2021. Read like humans： autonomous， bidirectional and iterative language modeling for scene text recognition//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Nashville， USA： IEEE： 7094-7103 ［DOI： 10.1109/cvpr46437.2021.00702http://dx.doi.org/10.1109/cvpr46437.2021.00702］

Fei G L and Liu B. 2016. Breaking the closed world assumption in text classification//Proceedings of 2016 Conference of the North American Chapter of the Association for Computational Linguistics： Human Language Technologies. San Diego， USA： ACL： 506-514 ［DOI： 10.18653/v1/n16-1061http://dx.doi.org/10.18653/v1/n16-1061］

Fu Y W and Sigal L. 2016. Semi-supervised vocabulary-informed learning//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas， USA： IEEE： 5337-5346 ［DOI： 10.1109/CVPR.2016.576http://dx.doi.org/10.1109/CVPR.2016.576］

Fu Y W， Wang X M， Dong H Z， Jiang Y G， Wang M， Xue X Y and Sigal L. 2020. Vocabulary-informed zero-shot and open-set learning. IEEE Transactions on Pattern Analysis and Machine Intelligence， 42（12）： 3136-3152 ［DOI： 10.1109/TPAMI.2019.2922175http://dx.doi.org/10.1109/TPAMI.2019.2922175］

Fu Y W， Xiang T， Jiang Y G， Xue X Y， Sigal L and Gong S G. 2018. Recent advances in zero-shot recognition： toward data-efficient understanding of visual content. IEEE Signal Processing Magazine， 35（1）： 112-125 ［DOI： 10.1109/msp.2017.2763441http://dx.doi.org/10.1109/msp.2017.2763441］

Garcia-Bordils S， Mafla A， Biten A F， Nuriel O， Aberdam A， Mazor S， Litman R and Karatzas D. 2023. Out-of-vocabulary challenge report//Proceedings of Computer Vision —— ECCV 2022 Workshops. Tel Aviv， Israel： Springer： 359-375

Ge Z Y， Demyanov S and Garnavi R. 2017. Generative openmax for multi-class open set classification//Proceedings of 2017 British Machine Vision Conference. London， UK： BMVA Press： #42 ［DOI： 10.5244/c.31.42http://dx.doi.org/10.5244/c.31.42］

Geng C X and Chen S C. 2022. Collective decision for open set recognition. IEEE Transactions on Knowledge and Data Engineering， 34（1）： 192-204 ［DOI： 10.1109/TKDE.2020.2978199http://dx.doi.org/10.1109/TKDE.2020.2978199］

Geng C X， Huang S J and Chen S C. 2021. Recent advances in open set recognition： a survey. IEEE Transactions on Pattern Analysis and Machine Intelligence， 43（10）： 3614-3631 ［DOI： 10.1109/TPAMI.2020.2981604http://dx.doi.org/10.1109/TPAMI.2020.2981604］

Guo X Q， Liu J， Liu T L and Yuan Y X. 2022. SimT： handling open-set noise for domain adaptive semantic segmentation//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans， USA： IEEE： 7022-7031 ［DOI： 10.1109/CVPR52688.2022.00690http://dx.doi.org/10.1109/CVPR52688.2022.00690］

Gupta A， Narayan S， Joseph K J， Khan S， Khan F S and Shah M. 2022. OW-DETR： open-world detection transformer//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans， USA： IEEE： 9225-9234 ［DOI： 10.1109/CVPR52688.2022.00902http://dx.doi.org/10.1109/CVPR52688.2022.00902］

Gupta A， Vedaldi A and Zisserman A. 2016. Synthetic data for text localisation in natural images//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas， USA： IEEE： 2315-2324 ［DOI： 10.1109/cvpr.2016.254http://dx.doi.org/10.1109/cvpr.2016.254］

Han J M， Ren Y Q， Ding J， Pan X J， Yan K and Xia G S. 2022. Expanding low-density latent regions for open-set object detection//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans， USA： IEEE： 9581-9590 ［DOI： 10.1109/CVPR52688.2022.00937http://dx.doi.org/10.1109/CVPR52688.2022.00937］

He M C， Liu Y L， Yang Z B， Zhang S， Luo C J， Gao F Y， Zheng Q， Wang Y P， Zhang X and Jin L W. 2018. ICPR2018 contest on robust reading for multi-type web images//Proceedings of the 24th International Conference on Pattern Recognition. Beijing， China： IEEE： 7-12 ［DOI： 10.1109/ICPR.2018.8546143http://dx.doi.org/10.1109/ICPR.2018.8546143］

He S and Schomaker L. 2018. Open set Chinese character recognition using multi-typed attributes ［EB/OL］. ［2023-01-11］. https://arxiv.org/pdf/1808.08993.pdfhttps://arxiv.org/pdf/1808.08993.pdf

Hou R B， Chang H， Ma B P， Shan S G and Chen X L. 2019. Cross attention network for few-shot classification//Proceedings of the 33rd International Conference on Neural Information Processing Systems. Vancouver， Canada： Curran Associates Inc.： #360

Hu J S， Liu C Y， Yan Q D， Zhu X Y， Yu F L， Wu J J and Yin B. 2022. Vision-language adaptive mutual decoder for OOV-STR ［EB/OL］. ［2022-09-02］. https://arxiv.org/pdf/2209.00859.pdfhttps://arxiv.org/pdf/2209.00859.pdf

Huang G J， Luo X Y， Wang S W， Gu T L and Su K L. 2022a. Hippocampus-heuristic character recognition network for zero-shot learning in Chinese character recognition. Pattern Recognition， 130： #108818 ［DOI： 10.1016/j.patcog.2022.108818http://dx.doi.org/10.1016/j.patcog.2022.108818］

Huang S P， Wang H B， Liu Y G， Shi X S and Jin L W. 2019. OBC306： a large-scale oracle bone character recognition dataset//Proceedings of 2019 International Conference on Document Analysis and Recognition. Sydney， Australia： IEEE： 681-688 ［DOI： 10.1109/icdar.2019.00114http://dx.doi.org/10.1109/icdar.2019.00114］

Huang S Y， Ma J W， Han G X and Chang S F. 2022b. Task-adaptive negative envision for few-shot open-set recognition//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans， USA： IEEE： 7161-7170 ［DOI： 10.1109/CVPR52688.2022.00703http://dx.doi.org/10.1109/CVPR52688.2022.00703］

Huang Y H， Jin L W and Peng D Z. 2021. Zero-shot Chinese text recognition via matching class embedding//Proceedings of the 16th International Conference on Document Analysis and Recognition. Lausanne， Switzerland： Springer： 127-141 ［DOI： 10.1007/978-3-030-86334-0_9http://dx.doi.org/10.1007/978-3-030-86334-0_9］

Jaderberg M， Simonyan K， Vedaldi A and Zisserman A. 2014. Synthetic data and artificial neural networks for natural scene text recognition ［EB/OL］. ［2022-12-09］. https://arxiv.org/pdf/1406.2227.pdfhttps://arxiv.org/pdf/1406.2227.pdf

Jaderberg M， Simonyan K， Vedaldi A and Zisserman A. 2016. Reading text in the wild with convolutional neural networks. International Journal of Computer Vision， 116（1）： 1-20 ［DOI： 10.1007/s11263-015-0823-zhttp://dx.doi.org/10.1007/s11263-015-0823-z］

Jaderberg M， Simonyan K， Zisserman A and Kavukcuoglu K. 2015. Spatial transformer networks//Proceedings of the 28th International Conference on Neural Information Processing Systems. Montreal， Canada： MIT Press： 2017-2025

Joseph K J， Khan S， Khan F S and Balasubramanian V N. 2021. Towards open world object detection//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville， USA： IEEE： 5826-5836 ［DOI： 10.1109/cvpr46437.2021.00577http://dx.doi.org/10.1109/cvpr46437.2021.00577］

Karatzas D， Gomez-Bigorda L， Nicolaou A， Ghosh S， Bagdanov A， Iwamura M， Matas J， Neumann L， Chandrasekhar V R， Lu S J， Shafait F， Uchida S and Valveny E. 2015. ICDAR 2015 competition on robust reading//Proceedings of the 13th International Conference on Document Analysis and Recognition. Tunis， Tunisia： IEEE： 1156-1160 ［DOI： 10.1109/icdar.2015.7333942http://dx.doi.org/10.1109/icdar.2015.7333942］

Karatzas D， Shafait F， Uchida S， Iwamura M， i Bigorda L G， Mestre S R， Mas J， Mota D F， Almaz􀅣n J A and de las Heras L P. 2013. ICDAR 2013 robust reading competition//Proceedings of the 12th International Conference on Document Analysis and Recognition. Washington， USA： IEEE： 1484-1493 ［DOI： 10.1109/icdar.2013.221http://dx.doi.org/10.1109/icdar.2013.221］

Kim J， Kim T， Kim S and Yoo C D. 2019. Edge-labeling graph neural network for few-shot learning//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach， USA： IEEE： 11-20 ［DOI： 10.1109/cvpr.2019.00010http://dx.doi.org/10.1109/cvpr.2019.00010］

Kumar P， Pathania K and Raman B. 2022. Zero-shot learning based cross-lingual sentiment analysis for sanskrit text with insufficient labeled data. Applied Intelligence：#6 ［DOI： 10.1007/s10489-022-04046-6http://dx.doi.org/10.1007/s10489-022-04046-6］

Li B C， Tang X， Qi X B， Chen Y H and Xiao R. 2020. Hamming OCR： a locality sensitive hashing neural network for scene text recognition ［EB/OL］. ［2020-09-23］. https://arxiv.org/pdf/2209.10874.pdfhttps://arxiv.org/pdf/2209.10874.pdf

Li H， Wang P， Shen C H and Zhang G Y. 2019a. Show， attend and read： a simple and strong baseline for irregular text recognition//Proceedings of the 33rd AAAI Conference on Artificial Intelligence， AAAI 2019， the 31st Innovative Applications of Artificial Intelligence Conference， IAAI 2019， the 9th AAAI Symposium on Educational Advances in Artificial Intelligence， EAAI 2019. Honolulu， USA： AAAI： 8610-8617 ［DOI： 10.1609/aaai.v33i01.33018610http://dx.doi.org/10.1609/aaai.v33i01.33018610］

Li H Y， Eigen D， Dodge S， Zeiler M and Wang X G. 2019b. Finding task-relevant features for few-shot learning by category traversal//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach， USA： IEEE： 1-10 ［DOI： 10.1109/cvpr.2019.00009http://dx.doi.org/10.1109/cvpr.2019.00009］

Li W B， Wang L， Xu J L， Huo J， Gao Y and Luo J B. 2019c. Revisiting local descriptor based image-to-class measure for few-shot learning//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach， USA： IEEE： 7253-7260 ［DOI： 10.1109/cvpr.2019.00743http://dx.doi.org/10.1109/cvpr.2019.00743］

Liao M H， Zhang J， Wan Z Y， Xie F M， Liang J J， Lyu P Y， Yao C and Bai X. 2019. Scene text recognition from two-dimensional perspective//Proceedings of the 33rd AAAI Conference on Artificial Intelligence， AAAI 2019， the 31st Innovative Applications of Artificial Intelligence Conference， IAAI 2019， the 9th AAAI Symposium on Educational Advances in Artificial Intelligence， EAAI 2019. Honolulu， USA： AAAI： 8714-8721 ［DOI： 10.1609/aaai.v33i01.33018714http://dx.doi.org/10.1609/aaai.v33i01.33018714］

Lin W W， Ma T， Zhang Z Q， Li X F and Xue X S. 2022. Variational autoencoder for zero-shot recognition of bai characters. Wireless Communications and Mobile Computing， 2022： #2717322 ［DOI： 10.1155/2022/2717322http://dx.doi.org/10.1155/2022/2717322］

Liu C， Yang C and Yin X C. 2022a. Open-set text recognition via character-context decoupling//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans， USA： IEEE/CVF： 4513-4522 ［DOI： 10.1109/cvpr52688.2022.00448http://dx.doi.org/10.1109/cvpr52688.2022.00448］

Liu C， Yang C， Qin H B， Zhu X B， Liu C L and Yin X C. 2023. Towards open-set text recognition via label-to-prototype learning. Pattern Recognition， 134： #109109 ［DOI： 10.1016/j.patcog.2022.109109http://dx.doi.org/10.1016/j.patcog.2022.109109］

Liu C L， Yin F， Wang D H and Wang Q F. 2011. CASIA online and offline Chinese handwriting databases//Proceedings of 2011 International Conference on Document Analysis and Recognition. Beijing， China： IEEE： 37-41 ［DOI： 10.1109/icdar.2011.17http://dx.doi.org/10.1109/icdar.2011.17］

Liu C Y， Chen X X， Luo C J， Jin L W， Xue Y and Liu Y L. 2021. Deep learning methods for scene text detection and recognition. Journal of Image and Graphics， 26（6）： 1330-1367

刘崇宇，陈晓雪，罗灿杰，金连文，薛洋，刘禹良. 2021. 自然场景文本检测与识别的深度学习方法. 中国图象图形学报， 26（6）： 1330-1367 ［DOI： 10.11834/jig.210044http://dx.doi.org/10.11834/jig.210044］

Liu R Y， Liu H， Li G， Hou H D， Yu T H and Yang T. 2022b. Contextual debiasing for visual recognition with causal mechanisms//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans， USA： IEEE： 12745-12755 ［DOI： 10.1109/CVPR52688.2022.01242http://dx.doi.org/10.1109/CVPR52688.2022.01242］

Liu Y L， Jin L W， Zhang S T and Zhang S. 2017. Detecting curve text in the wild： new dataset and new solution ［EB/OL］. ［2017-12-06］. https://arxiv.org/pdf/1712.02170.pdfhttps://arxiv.org/pdf/1712.02170.pdf

Lucas S M， Panaretos A， Sosa L， Tang A， Wong S， Young R， Ashida K， Nagai H， Okamoto M， Yamamoto H， Miyao H， Zhu J M， Ou W W， Wolf C， Jolion J M， Todoran L， Worring M and Lin X F. 2005. ICDAR 2003 robust reading competitions： entries， results， and future directions. International Journal of Document Analysis and Recognition （IJDAR）， 7（2/3）： 105-122 ［DOI： 10.1007/s10032-004-0134-3http://dx.doi.org/10.1007/s10032-004-0134-3］

Luo C J， Jin L W and Sun Z H. 2019. MORAN： a multi-object rectified attention network for scene text recognition. Pattern Recognition， 90： 109-118 ［DOI： 10.1016/j.patcog.2019.01.020http://dx.doi.org/10.1016/j.patcog.2019.01.020］

Ma Y Q， Bai S H， An S， Liu W， Liu A S， Zhen X T and Liu X L. 2020. Transductive relation-propagation network for few-shot learning//Proceedings of the 29th International Joint Conference on Artificial Intelligence. ［s.l.］： IJCAI.org： 804-810 ［DOI： 10.24963/ijcai.2020/112http://dx.doi.org/10.24963/ijcai.2020/112］

Manmatha R， Han C F and Riseman E M. 1996. Word spotting： a new approach to indexing handwriting//Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Francisco， USA： IEEE： 631-637 ［DOI： 10.1109/CVPR.1996.517139http://dx.doi.org/10.1109/CVPR.1996.517139］

Mendes Júnior P R， de Souza R M， de O. Werneck R， Stein B V， Pazinato D V， de Almeida W R， Penatti O A B， da S. Torres R and Rocha A. 2017. Nearest neighbors distance ratio open-set classifier. Machine Learning， 106（3）： 359-386 ［DOI： 10.1007/s10994-016-5610-8http://dx.doi.org/10.1007/s10994-016-5610-8］

Mishra A， Alahari K and Jawahar C. 2012. Scene text recognition using higher order language priors//Proceedings of 2012 British Machine Vision Conference. Surrey， UK： BMVA Press： 127.1-127.11 ［DOI： 10.5244/C.26.127］

Mishra S， Zhu P and Saligrama V. 2022. Learning compositional representations for effective low-shot generalization ［EB/OL］. ［2022-04-17］. https://arxiv.org/pdf/2204.08090.pdfhttps://arxiv.org/pdf/2204.08090.pdf

Nayef N， Patel Y， Busta M， Chowdhury P N， Karatzas D， Khlif W， Matas J， Pal U， Burie J C， Liu C L and Ogier J M. 2019. ICDAR2019 robust reading challenge on multi-lingual scene text detection and recognition——RRC-MLT-2019//Proceedings of 2019 International Conference on Document Analysis and Recognition. Sydney， Australia： IEEE： 1582-1587 ［DOI： 10.1109/ICDAR.2019.00254http://dx.doi.org/10.1109/ICDAR.2019.00254］

Naylor A R. 2010. Known knowns， known unknowns and unknown unknowns： a 2010 update on carotid artery disease. The Surgeon， 8（2）： 79-86 ［DOI： 10.1016/j.surge.2010.01.006http://dx.doi.org/10.1016/j.surge.2010.01.006］

Neal L， Olson M， Fern X， Wong W K and Li F X. 2018. Open set learning with counterfactual images//Proceedings of the 15th European Conference on Computer Vision. Munich， Germany： Springer： 620-635 ［DOI： 10.1007/978-3-030-01231-1_38http://dx.doi.org/10.1007/978-3-030-01231-1_38］

Patel V M， Gopalan R， Li R N and Chellappa R. 2015. Visual domain adaptation： a survey of recent advances. IEEE Signal Processing Magazine， 32（3）： 53-69 ［DOI： 10.1109/msp.2014.2347059http://dx.doi.org/10.1109/msp.2014.2347059］

Phan T Q， Shivakumara P， Tian S X and Tan C L. 2013. Recognizing text with perspective distortion in natural scenes//Proceedings of 2013 IEEE International Conference on Computer Vision. Sydney， Australia： IEEE： 569-576 ［DOI： 10.1109/iccv.2013.76http://dx.doi.org/10.1109/iccv.2013.76］

Pourpanah F， Abdar M， Luo Y X， Zhou X L， Wang R， Lim C P， Wang X Z and Wu Q M J. 2022. A review of generalized zero-shot learning methods. IEEE Transactions on Pattern Analysis and Machine Intelligence：#3191696 ［DOI： 10.1109/TPAMI.2022.3191696http://dx.doi.org/10.1109/TPAMI.2022.3191696］

Prakhya S， Venkataram V and Kalita J. 2017. Open set text classification using CNNs//Proceedings of the 14th International Conference on Natural Language Processing. Kolkata， India： NLP Association of India： 466-475

Qi H， Brown M and Lowe D G. 2018. Low-shot learning with imprinted weights//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City， USA： IEEE： 5822-5830 ［DOI： 10.1109/cvpr.2018.00610http://dx.doi.org/10.1109/cvpr.2018.00610］

Qiao L M， Shi Y M， Li J， Wang Y H， Huang T J and Wang Y W. 2019. Transductive episodic-wise adaptive metric for few-shot learning//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul， Korea （South）： IEEE： 3603-3612 ［DOI： 10.1109/iccv.2019.00370http://dx.doi.org/10.1109/iccv.2019.00370］

Qiao Z， Zhou Y， Yang D B， Zhou Y C and Wang W P. 2020. SEED： semantics enhanced encoder-decoder framework for scene text recognition//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle， USA： IEEE： 13525-13534 ［DOI： 10.1109/cvpr42600.2020.01354http://dx.doi.org/10.1109/cvpr42600.2020.01354］

Rai A， Krishnan N C and Chanda S. 2021. Pho（SC）Net： an approach towards zero-shot word image recognition in historical documents//Proceedings of the 16th International Conference on Document Analysis and Recognition. Lausanne， Switzerland： Springer： 19-33 ［DOI： 10.1007/978-3-030-86549-8_2http://dx.doi.org/10.1007/978-3-030-86549-8_2］

Risnumawan A， Shivakumara P， Chan C S and Tan C L. 2014. A robust arbitrary text detection system for natural scene images. Expert Systems with Applications， 41（18）： 8027-8048 ［DOI： 10.1016/j.eswa.2014.07.008http://dx.doi.org/10.1016/j.eswa.2014.07.008］

Scheirer W J， de Rezende Rocha A， Sapkota A and Boult T E. 2013. Toward open set recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence， 35（7）： 1757-1772 ［DOI： 10.1109/TPAMI.2012.256http://dx.doi.org/10.1109/TPAMI.2012.256］

Scheirer W J， Jain L P and Boult T E. 2014. probability models for open set recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence， 36（11）： 2317-2324 ［DOI： 10.1109/TPAMI.2014.2321392http://dx.doi.org/10.1109/TPAMI.2014.2321392］

Scherreik M D and Rigling B D. 2016. Open set recognition for automatic target classification with rejection. IEEE Transactions on Aerospace and Electronic Systems， 52（2）： 632-642 ［DOI： 10.1109/taes.2015.150027］

Shao L， Zhu F and Li X L. 2015. Transfer learning for visual categorization： a survey. IEEE Transactions on Neural Networks and Learning Systems， 26（5）： 1019-1034 ［DOI： 10.1109/TNNLS.2014.2330900］

Sheng F F， Chen Z N and Xu B. 2019. NRTR： a no-recurrence sequence-to-sequence model for scene text recognition//Proceedings of 2019 International Conference on Document Analysis and Recognition. Sydney， Australia： IEEE： 781-786 ［DOI： 10.1109/icdar.2019.00130］

Shi B G， Bai X and Yao C. 2017a. An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence， 39（11）： 2298-2304 ［DOI： 10.1109/TPAMI.2016.2646371］

Shi B G， Yang M K， Wang X G， Lyu P Y， Yao C and Bai X. 2019. ASTER： an attentional scene text recognizer with flexible rectification. IEEE Transactions on Pattern Analysis and Machine Intelligence， 41（9）： 2035-2048 ［DOI： 10.1109/TPAMI.2018.2848939］

Shi B G， Yao C， Liao M H， Yang M K， Xu P， Cui L Y， Belongie S J， Lu S J and Bai X. 2017b. ICDAR2017 competition on reading Chinese text in the wild （RCTW-17）//Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition. Kyoto， Japan： IEEE： 1429-1434 ［DOI： 10.1109/icdar.2017.233］

Shu L， Xu H and Liu B. 2017. DOC： deep open classification of text documents//Proceedings of 2017 Conference on Empirical Methods in Natural Language Processing. Copenhagen， Denmark： ACL： 2911-2916 ［DOI： 10.18653/v1/d17-1314］

Shu Y， Shi Y M， Wang Y W， Huang T J and Tian Y H. 2020. P-ODN： prototype-based open deep network for open set recognition. Scientific Reports， 10（1）： #7146 ［DOI： 10.1038/s41598-020-63649-6］

Simon C， Koniusz P， Nock R and Harandi M. 2020. Adaptive subspaces for few-shot learning//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle， USA： IEEE： 4136-4145 ［DOI： 10.1109/cvpr42600.2020.00419］

Snell J， Swersky K and Zemel R. 2017. Prototypical networks for few-shot learning//Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach， USA： Curran Associates Inc.： 4080-4090

Song N， Zhang C and Lin G S. 2022. Few-shot open-set recognition using background as unknowns//Proceedings of the 30th ACM International Conference on Multimedia. Lisboa， Portugal： ACM： 5970-5979 ［DOI： 10.1145/3503161.3547933］

Souibgui M A， Fornés A， Kessentini Y and Megyesi B. 2022. Few shots are all you need： a progressive learning approach for low resource handwritten text recognition. Pattern Recognition Letters， 160： 43-49 ［DOI： 10.1016/j.patrec.2022.06.003］

Su Y K， Sun R Z， Lin G S and Wu Q Y. 2021. Context decoupling augmentation for weakly supervised semantic segmentation//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal， Canada： IEEE： 6984-6994 ［DOI： 10.1109/iccv48922.2021.00692］

Sun Y P， Ni Z H， Chng C K， Liu Y L， Luo C J， Ng C C， Han J Y， Ding E R， Liu J T， Karatzas D， Chan C S and Jin L W. 2019. ICDAR 2019 competition on large-scale street view text with partial labeling - RRC-LSVT//Proceedings of 2019 International Conference on Document Analysis and Recognition. Sydney， Australia： IEEE： 1557-1562 ［DOI： 10.1109/icdar.2019.00250］

Veit A， Matera T， Neumann L， Matas J and Belongie S. 2016. COCO-text： dataset and benchmark for text detection and recognition in natural images ［EB/OL］. ［2023-01-11］. https://arxiv.org/pdf/1601.07140.pdf

Vinyals O， Blundell C， Lillicrap T， Kavukcuoglu K and Wierstra D. 2016. Matching networks for one shot learning//Proceedings of the 30th International Conference on Neural Information Processing Systems. Barcelona， Spain： Curran Associates Inc.： 3637-3645

Wan Z Y， Zhang J L， Zhang L， Luo J B and Yao C. 2020. On vocabulary reliance in scene text recognition//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle， USA： IEEE： 11422-11431 ［DOI： 10.1109/cvpr42600.2020.01144］

Wang J D， Lan C L， Liu C， Ouyang Y D and Qin T. 2021. Generalizing to unseen domains： a survey on domain generalization//Proceedings of the 30th International Joint Conference on Artificial Intelligence. Montreal， Canada： IJCAI.org： 4627-4635 ［DOI： 10.24963/ijcai.2021/628］

Wang K and Belongie S. 2010. Word spotting in the wild//Proceedings of the 11th European Conference on Computer Vision. Heraklion， Greece： Springer： 591-604 ［DOI： 10.1007/978-3-642-15549-9_43］

Wang K， Babenko B and Belongie S. 2011. End-to-end scene text recognition//Proceedings of 2011 International Conference on Computer Vision. Barcelona， Spain： IEEE： 1457-1464 ［DOI： 10.1109/iccv.2011.6126402］

Wang T， Huang J Q， Zhang H W and Sun Q R. 2020a. Visual commonsense R-CNN//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle， USA： IEEE： 10757-10767 ［DOI： 10.1109/cvpr42600.2020.01077］

Wang T W， Xie Z C， Li Z， Jin L W and Chen X L. 2019. Radical aggregation network for few-shot offline handwritten Chinese character recognition. Pattern Recognition Letters， 125： 821-827 ［DOI： 10.1016/j.patrec.2019.08.005］

Wang T W， Zhu Y Z， Jin L W， Luo C J， Chen X X， Wu Y Q， Wang Q Y and Cai M X. 2020b. Decoupled attention network for text recognition//The 34th AAAI Conference on Artificial Intelligence， AAAI 2020， the 32nd Innovative Applications of Artificial Intelligence Conference， IAAI 2020， the 10th AAAI Symposium on Educational Advances in Artificial Intelligence， EAAI 2020. New York， USA： AAAI： 12216-12224 ［DOI： 10.1609/aaai.v34i07.6903］

Wang W C， Zhang J S， Du J， Wang Z R and Zhu Y X. 2018. DenseRAN for offline handwritten Chinese character recognition//Proceedings of the 16th International Conference on Frontiers in Handwriting Recognition. Niagara Falls， USA： IEEE： 104-109 ［DOI： 10.1109/icfhr-2018.2018.00027］

Wei X S， Song Y Z， Mac Aodha O， Wu J X， Peng Y X， Tang J H， Yang J and Belongie S. 2022. Fine-grained image analysis with deep learning： a survey. IEEE Transactions on Pattern Analysis and Machine Intelligence， 44（12）： 8927-8948 ［DOI： 10.1109/TPAMI.2021.3126648］

Weiss K， Khoshgoftaar T M and Wang D D. 2016. A survey of transfer learning. Journal of Big Data， 3（1）： #9 ［DOI： 10.1186/s40537-016-0043-6］

Xia C Y， Yin W P， Feng Y H and Yu P. 2021. Incremental few-shot text classification with multi-round new classes： formulation， dataset and system//Proceedings of 2021 Conference of the North American Chapter of the Association for Computational Linguistics： Human Language Technologies. ［s.l.］： ACL： 1351-1360 ［DOI： 10.18653/v1/2021.naacl-main.106］

Xie Z C， Huang Y X， Zhu Y Z， Jin L W， Liu Y L and Xie L L. 2019. Aggregation cross-entropy for sequence recognition//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach， USA： IEEE： 6538-6547 ［DOI： 10.1109/cvpr.2019.00670］

Yang J K， Zhou K Y， Li Y X and Liu Z W. 2021. Generalized out-of-distribution detection： a survey ［EB/OL］. ［2023-01-11］. https://arxiv.org/pdf/2110.11334.pdf

Yang M K， Guan Y S， Liao M H， He X， Bian K G， Bai S， Yao C and Bai X. 2019. Symmetry-constrained rectification network for scene text recognition//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul， Korea （South）： IEEE： 9146-9155 ［DOI： 10.1109/iccv.2019.00924］

Ye H J， Hu H X， Zhan D C and Sha F. 2020. Few-shot learning via embedding adaptation with set-to-set functions//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle， USA： IEEE： 8805-8814 ［DOI： 10.1109/cvpr42600.2020.00883］

Yoshihashi R， Shao W， Kawakami R， You S， Iida M and Naemura T. 2019. Classification-reconstruction learning for open-set recognition//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach， USA： IEEE： 4016-4025 ［DOI： 10.1109/CVPR.2019.00414］

Yu D L， Li X， Zhang C Q， Liu T， Han J Y， Liu J T and Ding E R. 2020. Towards accurate scene text recognition with semantic reasoning networks//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle， USA： IEEE： 12110-12119 ［DOI： 10.1109/cvpr42600.2020.01213］

Yu H Y， Chen J Y， Li B， Ma J Q， Guan M N， Xu X X， Wang X C， Qu S B and Xue X Y. 2021. Benchmarking Chinese text recognition： datasets， baselines， and an empirical study ［EB/OL］. ［2021-12-30］. https://arxiv.org/pdf/2112.15093.pdf

Yu Y， Qu W Y， Li N and Guo Z M. 2017. Open category classification by adversarial sample generation//Proceedings of the 26th International Joint Conference on Artificial Intelligence. Melbourne， Australia： IJCAI.org： 3357-3363 ［DOI： 10.24963/ijcai.2017/469］

Yuan T L， Zhu Z， Xu K， Li C J， Mu T J and Hu S M. 2019. A large Chinese text dataset in the wild. Journal of Computer Science and Technology， 34（3）： 509-521 ［DOI： 10.1007/s11390-019-1923-y］

Yue Z Q， Zhang H W， Sun Q R and Hua X S. 2020. Interventional few-shot learning//Proceedings of the 34th International Conference on Neural Information Processing Systems. Vancouver， Canada： Curran Associates Inc.： #230

Zhang C H， Gupta A and Zisserman A. 2020a. Adaptive text recognition through visual matching//Proceedings of the 16th European Conference on Computer Vision. Glasgow， UK： Springer： 51-67 ［DOI： 10.1007/978-3-030-58517-4_4］

Zhang H and Ding H H. 2021. Prototypical matching and open set rejection for zero-shot semantic segmentation//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal， Canada： IEEE： 6954-6963 ［DOI： 10.1109/ICCV48922.2021.00689］

Zhang H and Patel V M. 2017. Sparse representation-based open set recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence， 39（8）： 1690-1696 ［DOI： 10.1109/TPAMI.2016.2613924］

Zhang H L， Xu H and Lin T E. 2021. Deep open intent classification with adaptive decision boundary//Proceedings of the 35th AAAI Conference on Artificial Intelligence， AAAI 2021， the 33rd Conference on Innovative Applications of Artificial Intelligence， IAAI 2021， the 11th Symposium on Educational Advances in Artificial Intelligence， EAAI 2021. ［s.l.］： AAAI： 14374-14382 ［DOI： 10.1609/aaai.v35i16.17690］

Zhang J Q， Lertvittayakumjorn P and Guo Y K. 2019. Integrating semantic knowledge to tackle zero-shot text classification//Proceedings of 2019 Conference of the North American Chapter of the Association for Computational Linguistics： Human Language Technologies. Minneapolis， USA： ACL： 1031-1040 ［DOI： 10.18653/v1/n19-1108］

Zhang J S， Du J and Dai L R. 2020b. Radical analysis network for learning hierarchies of Chinese characters. Pattern Recognition， 103： #107305 ［DOI： 10.1016/j.patcog.2020.107305］

Zhang J S， Zhu Y X， Du J and Dai L R. 2018. Radical analysis network for zero-shot learning in printed Chinese character recognition//Proceedings of 2018 IEEE International Conference on Multimedia and Expo. San Diego， USA： IEEE： 1-6 ［DOI： 10.1109/ICME.2018.8486456］

Zhang X Y， Liu C L and Suen C Y. 2020c. Towards robust pattern recognition： a review. Proceedings of the IEEE， 108（6）： 894-922 ［DOI： 10.1109/jproc.2020.2989782］

Zhang Y S. 2021. A survey of unsupervised domain adaptation for visual recognition ［EB/OL］. ［2021-12-13］. https://arxiv.org/pdf/2112.06745.pdf

Zhou D W， Ye H J and Zhan D C. 2021. Learning placeholders for open-set recognition//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville， USA： IEEE： 4401-4410 ［DOI： 10.1109/cvpr46437.2021.00438］

Zhou Z H. 2022. Open-environment machine learning. National Science Review， 9（8）： #123 ［DOI： 10.1093/nsr/nwac123］

Zu X Y， Yu H Y， Li B and Xue X Y. 2022. Chinese character recognition with augmented character profile matching//Proceedings of the 30th ACM International Conference on Multimedia. Lisboa， Portugal： ACM： 6094-6102 ［DOI： 10.1145/3503161.3547827］
