Region attention mechanism based dual human iris completion technology
2022, Vol. 27, No. 5, Pages: 1669-1681
Received: 2021-09-08; Revised: 2021-12-03; Accepted: 2021-12-10; Published in print: 2022-05-16
DOI: 10.11834/jig.210795
Objective
Iris recognition is a stable and reliable biometric technology, but the iris acquisition process is subject to many kinds of interference that occlude the iris in the captured image, such as light spots and the upper and lower eyelids. These occlusions cause a loss of iris information, which directly degrades recognition accuracy, and they also reduce the accuracy of pre-processing steps such as localization and segmentation, which degrades recognition indirectly. To address these problems, this paper proposes a region attention mechanism guided dual-path iris completion network. By filling in the pixels of the occluded regions, the network significantly reduces the influence of occlusion on iris pre-processing and recognition, and thus improves recognition performance.
Method
A Transformer-based encoder and a convolutional neural network (CNN) based encoder are used to extract iris features. A fusion module interactively combines the features extracted by the two encoders, and a region attention mechanism processes the low-level and high-level features separately. Finally, a decoder upsamples the processed features to restore the occluded regions and generate a complete image.
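For illustration only, a minimal PyTorch sketch of such a dual-path encoder, fusion module, and decoder is given below. All names and sizes (DualPathCompletion, patch size 16, channel width 256) are assumptions for the sketch, not the authors' actual architecture, and the region attention branch is omitted here for brevity.

```python
import torch
import torch.nn as nn

class DualPathCompletion(nn.Module):
    """Toy dual-path completion net: a Transformer path for global context,
    a CNN path for local texture, a 1x1-conv fusion, and an upsampling decoder."""
    def __init__(self, img_size=160, patch=16, dim=256):
        super().__init__()
        n_patches = (img_size // patch) ** 2
        # Transformer path: patch embedding + self-attention for global cues.
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.pos = nn.Parameter(torch.zeros(1, n_patches, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=4)
        # CNN path: strided convolutions for local features.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 64, 4, 2, 1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 4, 2, 1), nn.ReLU(inplace=True),
            nn.Conv2d(128, dim, 4, 2, 1), nn.ReLU(inplace=True))
        # Fusion: concatenate both feature maps and mix with a 1x1 conv.
        self.fuse = nn.Conv2d(2 * dim, dim, 1)
        # Decoder: upsample back to full resolution.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(dim, 128, 4, 2, 1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh())

    def forward(self, x):
        b = x.size(0)
        t = self.patch_embed(x)                        # B x dim x 10 x 10
        h, w = t.shape[2:]
        t = self.transformer(t.flatten(2).transpose(1, 2) + self.pos)
        t = t.transpose(1, 2).reshape(b, -1, h, w)     # tokens -> feature map
        c = self.cnn(x)                                # B x dim x 20 x 20
        t = nn.functional.interpolate(t, size=c.shape[2:])  # align resolutions
        return self.decoder(self.fuse(torch.cat([t, c], dim=1)))

x = torch.randn(1, 3, 160, 160)        # occluded iris image (normalized)
print(DualPathCompletion()(x).shape)   # torch.Size([1, 3, 160, 160])
```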
Result
The method is evaluated on the CASIA (Institute of Automation, Chinese Academy of Sciences) iris dataset. In terms of recognition performance with a fixed occlusion size of 64×64 pixels, the completed images reach a TAR (true accept rate) of 63% at 0.1% FAR (false accept rate), compared with only 19.2% for the occluded images, an improvement of 43.8 percentage points.
Conclusion
The proposed region attention mechanism guided dual-path iris completion network effectively combines the global modeling capability of the Transformer with the local modeling capability of the CNN and applies an occlusion-oriented region attention mechanism. It achieves completion of occluded iris regions and further improves iris recognition performance.
Objective
Iris recognition has achieved high accuracy on most public databases. However, iris images captured in the real world often contain occlusions caused by light spots and the upper and lower eyelids, which degrade both iris segmentation and recognition quality. Recent advances in deep learning have driven great progress in image completion. Nevertheless, iris completion remains challenging when the corrupted regions are large and the texture and structural patterns are complex. Convolutional neural networks (CNNs) excel at extracting local features but struggle to capture global cues. The visual Transformer, recently introduced to vision tasks, models complex spatial transforms and long-distance feature dependencies through its self-attention mechanism and multi-layer perceptron (MLP) structure, yielding strong global representations; however, its cascaded self-attention modules tend to lose local feature details, reducing the discriminability between background and foreground. We therefore propose a region attention mechanism based dual iris completion network, which uses a bilateral guided aggregation layer to fuse convolutional local features with Transformer-based global representations in an interactive manner. By completing the missing iris information, the impact of the occluded regions on iris image pre-processing and recognition is significantly reduced, improving recognition capability.
Method
The region attention mechanism based dual iris completion network contains a Transformer encoder and a CNN encoder, which extract the global and local features of the iris image, respectively. To exploit both kinds of features, a fusion network integrates the global modeling capability of the Transformer with the local modeling capability of the CNN, which improves the quality of the repaired iris images and maintains their global and local consistency. Furthermore, we propose a region attention module to complete the occluded regions efficiently.
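The paper's exact formulation of this module is not reproduced here; the following is a hedged PyTorch sketch of one way a mask-guided region attention could work, where attention toward occluded keys is suppressed so that features for the holes are aggregated from valid context. RegionAttention and its masking rule are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RegionAttention(nn.Module):
    """Illustrative mask-guided self-attention over a feature map."""
    def __init__(self, dim):
        super().__init__()
        self.q = nn.Conv2d(dim, dim, 1)
        self.k = nn.Conv2d(dim, dim, 1)
        self.v = nn.Conv2d(dim, dim, 1)

    def forward(self, feat, mask):
        # feat: B x C x H x W; mask: B x 1 x H x W float, 1 = occluded, 0 = valid
        b, c, h, w = feat.shape
        m = F.interpolate(mask, size=(h, w))           # align mask to feature size
        q = self.q(feat).flatten(2).transpose(1, 2)    # B x HW x C
        k = self.k(feat).flatten(2)                    # B x C x HW
        v = self.v(feat).flatten(2).transpose(1, 2)    # B x HW x C
        attn = torch.bmm(q, k) / c ** 0.5              # B x HW x HW
        # Forbid attending to occluded keys: holes borrow from valid regions only.
        attn = attn.masked_fill(m.flatten(2).bool(), float('-inf'))
        attn = torch.softmax(attn, dim=-1)
        out = torch.bmm(attn, v).transpose(1, 2).reshape(b, c, h, w)
        return feat + out                              # residual connection
```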
Beyond the pixel-level image reconstruction constraints, an effective identity-preserving constraint is also designed to ensure identity consistency between the input and the completed image.
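One common way to realize such a constraint, sketched below under the assumption of a frozen, pretrained iris recognition network (the recognizer argument is hypothetical), is to penalize the cosine distance between the embeddings of the completed image and the ground truth:

```python
import torch
import torch.nn.functional as F

def identity_loss(recognizer, completed, ground_truth):
    """Cosine-distance identity term; `recognizer` maps images to B x D features."""
    with torch.no_grad():
        target = recognizer(ground_truth)   # reference identity embedding
    pred = recognizer(completed)            # embedding of the repaired image
    return 1.0 - F.cosine_similarity(pred, target, dim=1).mean()
```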
Our method is implemented in the PyTorch framework and evaluated on the CASIA (Institute of Automation, Chinese Academy of Sciences) iris dataset. Peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) serve as the metrics for generation quality: PSNR is computed from the pixel-wise error against the ground truth and is an objective measure of fidelity, while SSIM estimates the holistic structural similarity between two images. Iris recognition accuracy serves as the metric for identity preservation.
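For reference, PSNR follows directly from the pixel-wise mean squared error; a minimal NumPy version for 8-bit images is shown below, and SSIM is available as, e.g., skimage.metrics.structural_similarity:

```python
import numpy as np

def psnr(img, ref, max_val=255.0):
    """Peak signal-to-noise ratio between a completed image and its ground truth."""
    mse = np.mean((img.astype(np.float64) - ref.astype(np.float64)) ** 2)
    if mse == 0:
        return float('inf')                 # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```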
Result
Extensive experiments on the CASIA iris dataset demonstrate that our method generates visually convincing iris completion results while preserving identity, both qualitatively and quantitatively. We further perform experiments on images with the same type of occlusion; for fair comparison, training and testing images are set to a resolution of 160×160 pixels. Qualitatively, our repaired results outperform the three compared methods in region retention and global consistency. Quantitatively, for the repaired results under different occlusion types, our method obtains the best PSNR and SSIM, indicating better restoration of the occluded iris regions and better consistency of the repaired results. To verify that the method improves iris segmentation accuracy, white occlusion is used to simulate light-spot occlusion; the segmentation results of the repaired images are clearly more accurate than those of the occluded images.
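As an illustration of this setup, a white square pasted at a random position simulates a light-spot occlusion; the helper below is an assumption about how such masks might be generated, not the authors' exact protocol:

```python
import numpy as np

def add_square_occlusion(img, size=64, value=255, rng=None):
    """Paste a white size x size square at a random position; return image + mask."""
    rng = rng or np.random.default_rng()
    h, w = img.shape[:2]
    y = int(rng.integers(0, h - size + 1))
    x = int(rng.integers(0, w - size + 1))
    out = img.copy()
    out[y:y + size, x:x + size] = value
    mask = np.zeros((h, w), dtype=np.uint8)
    mask[y:y + size, x:x + size] = 1        # 1 marks the occluded region
    return out, mask
```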
Specifically, with a fixed occlusion size of 64×64 pixels, our method achieves a true accept rate (TAR) of 63% at 0.1% false accept rate (FAR), 43.8 percentage points higher than the 19.2% obtained on the occluded images. Ablation studies confirm the effectiveness of each component of the network structure.
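TAR at a fixed FAR is the standard verification metric here: the decision threshold is chosen so that the impostor acceptance rate equals the target FAR, and TAR is the fraction of genuine comparisons accepted at that threshold. A small NumPy sketch on toy score distributions (not the paper's data):

```python
import numpy as np

def tar_at_far(genuine, impostor, far=1e-3):
    """TAR at the threshold where the impostor acceptance rate equals `far`."""
    thr = np.quantile(impostor, 1.0 - far)  # accept scores >= thr
    return float(np.mean(genuine >= thr))

rng = np.random.default_rng(0)
genuine = rng.normal(0.8, 0.1, 10_000)      # toy same-identity match scores
impostor = rng.normal(0.3, 0.1, 100_000)    # toy different-identity match scores
print(tar_at_far(genuine, impostor, far=0.001))
```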
Conclusion
We propose a region attention mechanism based dual iris completion network that uses a Transformer and a CNN to extract the global topology and local details of iris images, respectively. A fusion network merges the global and local features, and a region attention module and an identity-preserving loss guide the completion task. Extensive quantitative and qualitative results on the CASIA iris dataset demonstrate the effectiveness of our iris completion method.