Cross-consistent semantic segmentation algorithm based on manifold regularization

Lamei Liu; Jiaxu Zong; Zhenjiu Xiao; Hai Lan; Haicheng Qu

doi:10.11834/jig.210571

Image Understanding and Computer Vision


Journal of Image and Graphics, 2022, Vol. 27, No. 12, pp. 3542-3552
Print publication date: 2022-12-16

Accepted: 2021-11-02

Lamei Liu, Jiaxu Zong, Zhenjiu Xiao, Hai Lan, Haicheng Qu. Cross-consistent semantic segmentation algorithm based on manifold regularization[J]. Journal of Image and Graphics, 2022, 27(12): 3542-3552. DOI: 10.11834/jig.210571.

Summary

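Objective

To address the lack of contextual information in semi-supervised and weakly supervised semantic segmentation models while taking inference efficiency into account, a cross-consistency semantic segmentation algorithm based on manifold regularization is proposed.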

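Method

First, a cross-consistency training model serves as the backbone network, which produces the predicted segmentation map. Second, the input-domain and output-domain images are divided into sub-image patches to obtain data pairs with the same geometric structure. Third, using the patches of the original image and of the segmentation map, the latent geometric constraint between the input data and the predictions on their manifold surfaces is computed, and semi-supervised and weakly supervised regularization algorithms are designed according to the respective training modes. Finally, the result of the manifold constraint is used to further optimize the parameters of the segmentation network, and training is iterated until the semi-supervised or weakly supervised semantic segmentation model reaches its optimum.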
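The patch-based constraint described in the Method paragraph can be sketched in a few lines. This is a minimal illustration that assumes the classical graph-Laplacian form of manifold regularization with a Gaussian affinity between input patches; the summary does not give the exact penalty used in the paper. The names `split_into_patches`, `manifold_penalty`, `patch_size`, and `sigma` are hypothetical, and images are plain nested lists rather than tensors so that the sketch has no dependencies.

```python
import math


def split_into_patches(image, patch_size):
    """Split a 2D grid (list of rows) into non-overlapping, flattened patches."""
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            patches.append([
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ])
    return patches


def squared_distance(a, b):
    """Squared Euclidean distance between two flattened patches."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def manifold_penalty(image, prediction, patch_size=2, sigma=1.0):
    """Assumed graph-Laplacian penalty: patches that are close in the input
    image should also be close in the predicted segmentation map."""
    in_patches = split_into_patches(image, patch_size)
    out_patches = split_into_patches(prediction, patch_size)
    n = len(in_patches)
    total = 0.0
    for i in range(n):
        for j in range(n):
            # Gaussian affinity computed on the input patches (an assumption).
            weight = math.exp(
                -squared_distance(in_patches[i], in_patches[j]) / (2 * sigma ** 2)
            )
            total += weight * squared_distance(out_patches[i], out_patches[j])
    return total / (n * n)


if __name__ == "__main__":
    image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
    # A prediction that labels identical input patches differently.
    scrambled = [
        [0.0, 0.0, 1.0, 1.0],
        [0.0, 0.0, 1.0, 1.0],
        [1.0, 1.0, 0.0, 0.0],
        [1.0, 1.0, 0.0, 0.0],
    ]
    print(manifold_penalty(image, image), manifold_penalty(image, scrambled))
```

In a training loop, a term of this kind would be added to the segmentation loss with a weighting coefficient, so it changes only the gradients during training and leaves the network structure and inference cost unchanged.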

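Result

Adding the manifold regularization constraint captures contextual information in the image and reduces the loss of intrinsic structure incurred during the network's forward computation, improving accuracy without changing the network structure. To verify the effectiveness of the algorithm, comparative experiments were conducted on both semi-supervised and weakly supervised semantic segmentation. On the PASCAL VOC 2012 (pattern analysis, statistical modeling and computational learning visual object classes 2012) dataset, the proposed algorithm improves on the original network by 3.7% for the semi-supervised semantic segmentation task and by 1.1% for the weakly supervised semantic segmentation task.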

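Conclusion

Without changing the original network structure, the proposed algorithm improves the accuracy of semi-supervised and weakly supervised image semantic segmentation models; the gain is especially pronounced for objects and regions with distinct geometric features.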

Abstract

Objective

Image semantic segmentation is a pixel-level classification problem that assigns every pixel in an image to a category; it can be seen as an extension of image classification. Its applications include scene understanding, autonomous driving, and clinical diagnosis. However, training deep learning models requires a large amount of labeled data, and obtaining such labels for semantic segmentation is time-consuming and labor-intensive. Deep semi-supervised learning has therefore attracted attention as a way to exploit large amounts of unlabeled data and reduce the demand for labeled data. However, current methods struggle to capture and constrain contextual information, and existing methods for adding contextual information often slow down the network's inference to varying degrees. We therefore develop a semi-supervised semantic segmentation method with manifold regularization on the basis of cross-consistency training.

Method

We assume that the input data and the corresponding prediction results share the same geometric structure on a low-dimensional manifold embedded in the high-dimensional original data space, and we use this geometric structure to construct regularization constraints. First, we design a manifold regularization penalty term that combines single-pixel information with neighborhood context information. The underlying geometric intuition is that the data in the original image have the same local geometric shape as the segmentation result. Next, the manifold regularization constraint is combined with current mainstream semi-supervised and weakly supervised image segmentation algorithms, which illustrates that the manifold regularization algorithm adapts well to different segmentation tasks. In both the semi-supervised and the weakly supervised manifold regularization algorithms, a state-of-the-art cross-consistency training model is used as the backbone network; cross-consistency semi-supervised training applies different forms of perturbation to the encoder output to strengthen the predictive invariance of the model. We build the model with the open-source toolbox PyTorch and optimize it with stochastic gradient descent (SGD). The experimental platform runs CentOS 7, with an NVIDIA RTX 2080Ti graphics processing unit (GPU) and an Intel(R) Core(TM) i7-6850 CPU.
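The geometric assumption in the Method paragraph can be written in the classical graph-Laplacian form of manifold regularization. This is a sketch under that assumption and not necessarily the exact penalty used in the paper. The symbols are introduced here for illustration: $x_i$ is the $i$-th patch of the input image, $f(x_i)$ the corresponding patch of the predicted segmentation, $w_{ij}$ a Gaussian affinity with bandwidth $\sigma$, $\mathcal{L}_{\text{seg}}$ the loss of the underlying semi-supervised or weakly supervised model, and $\lambda$ a weighting coefficient.

```latex
\mathcal{R}(f) = \frac{1}{n^{2}} \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \, \lVert f(x_i) - f(x_j) \rVert_2^{2},
\qquad
w_{ij} = \exp\!\left( -\frac{\lVert x_i - x_j \rVert_2^{2}}{2\sigma^{2}} \right),
\qquad
\mathcal{L} = \mathcal{L}_{\text{seg}} + \lambda \, \mathcal{R}(f)
```

Under this form, two patches that are close in the input ($w_{ij}$ near 1) are penalized if their predictions differ, while dissimilar patches ($w_{ij}$ near 0) are left almost unconstrained. Because $\mathcal{R}(f)$ enters only the training loss, it adds no cost at inference time.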

Result

Adding manifold regularization constraints captures contextual information in the image, reduces the loss of intrinsic structure caused by the network's forward computation, and improves the accuracy of the algorithm. To verify the effectiveness of the algorithm, experiments are conducted on two types of task: semi-supervised and weakly supervised semantic segmentation. On the pattern analysis, statistical modeling and computational learning visual object classes 2012 (PASCAL VOC 2012) dataset, the semi-supervised semantic segmentation algorithm improves on the original network by 3.7%, and the weakly supervised semantic segmentation algorithm improves on the original network by 1.1%. Furthermore, we visualize the segmentation results of the different models. The segmentation results produced with manifold regularization constraints have more refined edges and a lower error rate.

Conclusion

Our algorithm captures contextual information through manifold regularization constraints and improves both semi-supervised and weakly supervised tasks without changing the original network structure. The experimental results indicate that the algorithm has good generalization and optimization ability.


Keywords

deep learning; semantic segmentation; semi-supervised semantic segmentation; weakly-supervised semantic segmentation; cross-consistency training; manifold regularization

