An iris feature-encoding method by fusion of graph neural networks and convolutional neural networks
2024, Vol. 29, No. 9: 2764-2779
Print publication date: 2024-09-16
DOI: 10.11834/jig.230688
Sun Jintong, Shen Wenzhong. 2024. An iris feature-encoding method by fusion of graph neural networks and convolutional neural networks. Journal of Image and Graphics, 29(09):2764-2779
Objective
A more interpretable iris feature-encoding method has long been a key problem in iris recognition, and low-quality iris samples are difficult to recognize; the development of graph neural networks offers new ideas for encoding the features of such iris images. This paper proposes IrisFusionNet, an iris feature-encoding network that fuses graph neural networks with convolutional neural networks (CNNs).
Method
A pixel-level enhancement module is added before the backbone network to remove uncertainty in the input image, and a dual-branch backbone network extracts fused micro- and macro-level iris features. During training, a dedicated joint loss function optimizes the network parameters; during inference, a fused-feature matching strategy performs feature matching.
Result
Experimental results show that the feature extractor trained with IrisFusionNet, tested on several public low-quality iris datasets, achieves best values of 0.27% for EER (equal error rate) and 0.84% for FRR@FAR = 0.01%, and improves the discriminating index (DI) by more than 30%; its recognition accuracy and class compactness far surpass leading iris recognition algorithms based on convolutional neural networks or other graph-neural-network models.
Conclusion
The proposed IrisFusionNet is highly feasible and advantageous for iris recognition tasks.
Objective
Iris recognition is a prevalent biometric modality in identity recognition technology owing to its inherent advantages, including stability, uniqueness, noncontact acquisition, and live-body authentication. The complete iris recognition workflow comprises four main steps: iris image acquisition, image preprocessing, feature encoding, and feature matching. Feature encoding is the core component of iris recognition algorithms. Improving interpretable iris feature-encoding methods has become a pivotal concern in the field of iris recognition. Moreover, recognition of low-quality iris samples often relies on feature encoders that depend on dataset-specific parameters, which leads to poor generalization performance. The graph structure is a data form with an irregular topological arrangement, and graph neural networks (GNNs) effectively update and aggregate features within such structures. The advancement of GNNs has led to new approaches for feature encoding of these types of iris images. In this paper, a pioneering iris feature-fusion encoding network called IrisFusionNet, which integrates a GNN with a convolutional neural network (CNN), is proposed. This network eliminates the need for complex parameter-tuning steps and exhibits excellent generalization performance across various iris datasets.
Method
A pixel-level enhancement module inserted before the backbone network alleviated local uncertainty in the input image through median filtering, and global uncertainty was mitigated via Gaussian normalization. A dual-branch backbone network was proposed, in which the head comprised a shared stack of CONV modules and the neck was divided into two branches. The primary branch constructed a graph structure from the image using a graph converter. We designed a hard graph attention network that introduces an efficient channel attention mechanism to aggregate and update features by utilizing edge-associated information within the graph structure; this step extracted the microfeatures of iris textures. The auxiliary branch, on the other hand, used conventional CNN pipeline components, such as simple convolutional layers, pooling layers, and fully connected layers, to capture the macrostructural information of the iris. During the training phase, the fused features from the primary and auxiliary branches were optimized using a unified loss function, the graph triplet and additive angular margin unified loss (GTAU-Loss). The primary branch mapped iris images into a graph feature space, using cosine similarity to measure the semantic information in node feature vectors, the L2 norm to measure the spatial relationship information within the adjacency matrix, and a graph triplet loss to constrain feature distances within the feature space. The auxiliary branch applied an additive angular margin loss, which normalized the input image feature vectors and introduced an additional angular margin to constrain feature angle intervals, improving intraclass feature compactness and interclass separation. Ultimately, a dynamic learning method based on an exponential model was used to fuse the features extracted from the primary and auxiliary branches and obtain the GTAU-Loss.
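As a concrete illustration, the pixel-level enhancement step described above can be sketched as follows. The kernel size `ksize` and the zero-mean, unit-variance form of the Gaussian normalization are assumptions for illustration; this summary does not state the exact settings:

```python
import numpy as np

def enhance_iris(img: np.ndarray, ksize: int = 3) -> np.ndarray:
    """Pixel-level enhancement: median filtering to suppress local noise,
    followed by Gaussian (zero-mean, unit-variance) normalization.
    `ksize` is a hypothetical kernel size; the paper's value is not given here."""
    h, w = img.shape
    pad = ksize // 2
    padded = np.pad(img.astype(np.float64), pad, mode="edge")
    # Median filter: each pixel becomes the median of its ksize x ksize window
    filtered = np.empty((h, w), dtype=np.float64)
    for i in range(h):
        for j in range(w):
            filtered[i, j] = np.median(padded[i:i + ksize, j:j + ksize])
    # Gaussian normalization over the whole image (global uncertainty)
    return (filtered - filtered.mean()) / (filtered.std() + 1e-8)
```

The median filter handles local uncertainty (e.g., salt-and-pepper sensor noise), while the global normalization removes illumination-level shifts across images.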
The hyperparameter settings during training included the following: The optimization of network parameters involved the use of stochastic gradient descent (SGD) with a Nesterov momentum set to 0.9, an initial learning rate of 0.001, and a warm-up strategy adjusting the learning rate with a warm-up rate set to 0.1, conducted over 200 epochs. The iteration process of SGD was accelerated using NVIDIA RTX 3060 12 GB GPU devices, with 100 iterations lasting approximately one day. For feature matching concerning two distinct graph structures, the auxiliary branch calculated the cosine similarity between the output node features. Meanwhile, the primary branch applied a gate-based method and initially calculated the mean cosine similarity of all node pairs as the threshold for the gate, removed node pairs below this threshold, and retained node features above it to compute their cosine similarity. The similarity between these graph structures was represented as the weighted sum of cosine similarities from the primary and auxiliary branches. The similarity weights of the feature pairs computed using the primary and auxiliary branches were both set to 0.5. All experiments were conducted on a Windows 11 operating system, with PyTorch as the deep learning framework.
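The gate-based fused matching strategy above can be sketched as follows. Pairing corresponding nodes of the two graphs and the fallback when no pair survives the gate are assumptions not specified in this summary:

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two feature vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def graph_similarity(nodes_a, nodes_b, global_a, global_b, w_primary=0.5):
    """Fused matching: gate-based node similarity (primary branch) combined
    with global-feature cosine similarity (auxiliary branch) by a weighted sum."""
    # Primary branch: cosine similarity of node pairs; their mean acts as the gate
    sims = np.array([cosine(a, b) for a, b in zip(nodes_a, nodes_b)])
    gate = sims.mean()
    kept = sims[sims >= gate]          # discard node pairs below the gate threshold
    primary = kept.mean() if kept.size else gate
    # Auxiliary branch: cosine similarity of the global feature vectors
    auxiliary = cosine(global_a, global_b)
    # Weighted sum; both weights are set to 0.5 in the paper
    return w_primary * primary + (1.0 - w_primary) * auxiliary
```

Matching a graph against itself yields a similarity close to 1, and lowering either branch's similarity lowers the fused score, reflecting the equal 0.5/0.5 weighting.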
Result
To validate the effectiveness of integrating GNNs into the framework, this study conducted iris recognition experiments using a single-branch CNN framework and a dual-branch framework. The experimental outcomes substantiated the superior recognition performance of the structural design incorporating the GNN branch. Furthermore, to determine the optimal values of two crucial parameters, namely, the number of nearest neighbors (k) and the global feature dimension within the IrisFusionNet framework, we conducted detailed parameter experiments: k was set to 8, and the optimal global feature dimension was 256. We compared the present method with several state-of-the-art (SOTA) iris recognition methods, including CNN-based methods, such as ResNet, MobileNet, EfficientNet, and ConvNeXt, and GNN-based methods, such as dynamic graph representation. Comparative experimental results indicate that the feature extractor trained using IrisFusionNet, tested on three publicly available low-quality iris datasets (CASIA-Iris-V4-Distance, CASIA-Iris-V4-Lamp, and CASIA-Iris-Mobile-V1.0-S2), achieved equal error rates of 1.06%, 0.71%, and 0.27% and false rejection rates at a false acceptance rate of 0.01% (FRR@FAR = 0.01%) of 7.49%, 4.21%, and 0.84%, respectively. In addition, the discriminant index reached 6.102, 6.574, and 8.451, an improvement of over 30% compared with the baseline algorithm. The accuracy and clustering capability of iris recognition using the feature extractor derived from IrisFusionNet substantially outperformed SOTA iris recognition algorithms based on convolutional neural networks and other GNN models. Furthermore, the graph structures derived from the graph converter were visualized: the graph structures generated from similar iris images exhibited high similarity, whereas those of dissimilar iris images presented remarkable differences. This intuitive visualization explains the excellent iris recognition performance achieved by constructing graph structures and applying GNN methods.
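The discriminant index reported above is conventionally computed as the decidability index between the genuine (intraclass) and impostor (interclass) match-score distributions; the sketch below implements that standard formula, not code taken from this paper:

```python
import numpy as np

def discriminating_index(genuine: np.ndarray, impostor: np.ndarray) -> float:
    """Decidability index d' = |mu1 - mu2| / sqrt((var1 + var2) / 2).
    Larger values mean the genuine and impostor score distributions
    are better separated, i.e., better clustering of identities."""
    mu1, mu2 = genuine.mean(), impostor.mean()
    v1, v2 = genuine.var(), impostor.var()
    return float(abs(mu1 - mu2) / np.sqrt((v1 + v2) / 2.0))
```

For example, genuine scores with mean 0.9 and impostor scores with mean 0.2, each with variance 0.01, give d' = 0.7 / 0.1 = 7.0, on the order of the values reported above.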
Conclusion
In this paper, we proposed IrisFusionNet, a feature-fusion encoding method based on GNNs and CNNs. The macrofeatures of iris images were extracted using a CNN and the microfeatures using a GNN, yielding fused features that encompass comprehensive texture characteristics. The experimental results indicate that our method considerably improved the accuracy and clustering of iris recognition and achieved high feasibility and generalizability without requiring complex parameter tuning specific to particular datasets.
iris feature coding; graph neural network (GNN); hard graph attention operators; feature fusion; unified loss function
Chen Y, Wu W Q, Xu L and Guo S B. 2023. Iris-periocular features-fused non-collaborative authentication. Journal of Image and Graphics, 28(5): 1462-1476 [DOI: 10.11834/jig.220649]
Daugman J. 2009. How iris recognition works//Bovik A, ed. The Essential Guide to Image Processing. 2nd ed. Burlington, USA: Academic Press: 715-739 [DOI: 10.1016/B978-0-12-374457-9.00025-1]
Deng J K, Guo J, Xue N N and Zafeiriou S. 2019. ArcFace: additive angular margin loss for deep face recognition//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 4685-4694 [DOI: 10.1109/CVPR.2019.00482]
Gangwar A and Joshi A. 2016. DeepIrisNet: deep iris representation with applications in iris recognition and cross-sensor iris recognition//Proceedings of 2016 IEEE International Conference on Image Processing (ICIP). Phoenix, USA: IEEE: 2301-2305 [DOI: 10.1109/ICIP.2016.7532769]
Gangwar A, Joshi A, Joshi P and Ramachandra R. 2019. DeepIrisNet2: learning deep-IrisCodes from scratch for segmentation-robust visible wavelength and near infrared iris recognition [EB/OL]. [2023-09-21]. https://arxiv.org/ftp/arxiv/papers/1902/1902.05390.pdf
Gao H Y and Ji S W. 2019. Graph representation learning via hard and channel-wise attention networks//Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Anchorage, USA: Association for Computing Machinery: 741-749 [DOI: 10.1145/3292500.3330897]
Hamilton W L, Ying R and Leskovec J. 2018. Inductive representation learning on large graphs [EB/OL]. [2023-09-21]. https://arxiv.org/pdf/1706.02216.pdf
Han K, Wang Y H, Guo J Y, Tang Y H and Wu E H. 2022. Vision GNN: an image is worth graph of nodes [EB/OL]. [2023-09-21]. https://arxiv.org/pdf/2206.00272.pdf
Han Y, Wang P H, Kundu S, Ding Y and Wang Z Y. 2023. Vision HGNN: an image is more than a graph of nodes//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision. Paris, France: IEEE: 19821-19831 [DOI: 10.1109/ICCV51070.2023.01820]
He K M, Zhang X Y, Ren S Q and Sun J. 2016. Deep residual learning for image recognition//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 770-778 [DOI: 10.1109/CVPR.2016.90]
Hou L, Liu J H, Yu X and Du J W. 2023. Review of graph neural networks [J/OL]. Computer Science: 1-25 [2023-09-21]. https://kns.cnki.net/kcms/detail/detail.aspx?dbcode=CAPJ&dbname=CAPJ&filename=JSJA20231008002
Howard A, Sandler M, Chen B, Wang W J, Chen L C, Tan M X, Chu G, Vasudevan V, Zhu Y K, Pang R M, Adam H and Le Q. 2019. Searching for MobileNetV3//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 1314-1324 [DOI: 10.1109/ICCV.2019.00140]
Institute of Automation, Chinese Academy of Sciences. 2018. CASIA-Iris-Mobile-V1.0 database [DB/OL]. [2023-09-21]. http://biometrics.idealtest.org
Institute of Automation, Chinese Academy of Sciences. 2020. CASIA-IrisV4 database [DB/OL]. http://biometrics.idealtest.org
Jia D D and Shen W Z. 2022. IrisCodeNet: iris feature coding network. Computer Engineering and Applications, 58(10): 185-192 [DOI: 10.3778/j.issn.1002-8331.2011-0466]
Kipf T N and Welling M. 2016. Semi-supervised classification with graph convolutional networks [EB/OL]. [2023-09-21]. https://arxiv.org/pdf/1609.02907.pdf
Li G H, Müller M, Qian G C, Delgadillo I C, Abualshour A, Thabet A and Ghanem B. 2023. DeepGCNs: making GCNs go as deep as CNNs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(6): 6923-6939 [DOI: 10.1109/TPAMI.2021.3074057]
Liu Y C, Shao Z R, Teng Y Y and Hoffmann N. 2021. NAM: normalization-based attention module [EB/OL]. [2023-09-21]. https://arxiv.org/pdf/2111.12419.pdf
Liu Z, Mao H Z, Wu C Y, Feichtenhofer C, Darrell T and Xie S N. 2022. A ConvNet for the 2020s//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 11966-11976 [DOI: 10.1109/CVPR52688.2022.01167]
Miyazawa K, Ito K, Aoki T, Kobayashi K and Nakajima H. 2008. An effective approach for iris recognition using phase-based image matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(10): 1741-1756 [DOI: 10.1109/TPAMI.2007.70833]
Monro D M, Rakshit S and Zhang D X. 2007. DCT-based iris recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(4): 586-595 [DOI: 10.1109/TPAMI.2007.1002]
Ren M, Wang Y L, Sun Z N and Tan T N. 2020. Dynamic graph representation for occlusion handling in biometrics//Proceedings of the 34th AAAI Conference on Artificial Intelligence. New York, USA: AAAI: 11940-11947 [DOI: 10.1609/aaai.v34i07.6869]
Schroff F, Kalenichenko D and Philbin J. 2015. FaceNet: a unified embedding for face recognition and clustering//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 815-823 [DOI: 10.1109/CVPR.2015.7298682]
Shankar S, Garg S and Sarawagi S. 2018. Surprisingly easy hard-attention for sequence to sequence learning//Proceedings of 2018 Conference on Empirical Methods in Natural Language Processing. Brussels, Belgium: Association for Computational Linguistics: 640-645 [DOI: 10.18653/v1/D18-1065]
Sun Z N and Tan T N. 2009. Ordinal measures for iris recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(12): 2211-2226 [DOI: 10.1109/TPAMI.2008.240]
Tan M X and Le Q V. 2019. EfficientNet: rethinking model scaling for convolutional neural networks//Proceedings of the 36th International Conference on Machine Learning. Long Beach, USA: PMLR: 6105-6114
Teng T, Shen W Z and Mao Y F. 2020. Multi-task iris fast location method based on cascaded neural network. Computer Engineering and Applications, 56(12): 118-124 [DOI: 10.3778/j.issn.1002-8331.1903-0145]
Veličković P, Cucurull G, Casanova A, Romero A, Liò P and Bengio Y. 2018. Graph attention networks [EB/OL]. [2023-09-21]. https://arxiv.org/pdf/1710.10903.pdf
Wang Q L, Wu B G, Zhu P F, Li P H, Zuo W M and Hu Q H. 2020. ECA-Net: efficient channel attention for deep convolutional neural networks//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 11531-11539 [DOI: 10.1109/CVPR42600.2020.01155]
Wang Y H, Zhu Y and Tan T N. 2002. Biometrics personal identification based on iris pattern. Acta Automatica Sinica, 28(1): 1-10
Xiao L H, Sun Z N, He R and Tan T N. 2013. Coupled feature selection for cross-sensor iris recognition//Proceedings of the 6th IEEE International Conference on Biometrics: Theory, Applications and Systems (BTAS). Arlington, USA: IEEE: 1-6 [DOI: 10.1109/BTAS.2013.6712752]
Yu B, Yin H T and Zhu Z X. 2021. ST-UNet: a spatio-temporal U-network for graph-structured time series modeling [EB/OL]. [2023-09-21]. https://arxiv.org/pdf/1903.05631.pdf
Yue H L, Wang H Y and Peng R Y. 2022. The research progress of graph neural network//Proceedings of the 17th Chinese Conference on Stereology and Image Analysis. Chinese Society of Stereology: 99-102 [DOI: 10.26914/c.cnkihy.2022.051480]
Zhao Z J and Kumar A. 2017. Towards more accurate iris recognition using deeply learned spatially corresponding features//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 3829-3838 [DOI: 10.1109/ICCV.2017.411]