Cross-consistency semantic segmentation algorithm based on manifold regularization

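Liu Lamei1, Zong Jiaxu1, Xiao Zhenjiu1, Lan Hai2, Qu Haicheng1 (1. School of Software, Liaoning Technical University, Huludao 125105; 2. Quanzhou Institute of Equipment Manufacturing, Quanzhou 362000)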

Abstract
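Objective To address the loss of contextual information in semi-supervised and weakly supervised semantic segmentation models while taking full account of inference efficiency, a cross-consistency semantic segmentation algorithm based on manifold regularization is proposed. Method First, a cross-consistency training model is used as the backbone network, which produces the predicted segmentation image. Second, the input-domain image and the output-domain image are divided into sub-image patches to obtain data pairs with the same geometric structure. Third, the sub-image patches of the original image and of the segmented image are used to compute the latent geometric constraint relationship between the input data and the predictions on the manifold surface on which they lie, and separate semi-supervised and weakly supervised regularization algorithms are designed for the respective training modes. Finally, the result of the manifold constraint is used to further optimize the parameters of the segmentation network, and repeated iteration drives the semi-supervised or weakly supervised semantic segmentation model to its optimum. Result Adding the manifold regularization constraint captures contextual information in the image, reduces the loss of intrinsic structure incurred during the network's forward computation, and improves accuracy without changing the network structure. To verify the effectiveness of the algorithm, comparative experiments were conducted on both semi-supervised and weakly supervised semantic segmentation. On the PASCAL VOC 2012 (pattern analysis, statistical modeling and computational learning visual object classes 2012) dataset, the proposed algorithm improves on the original network by 3.7% for the semi-supervised task and by 1.1% for the weakly supervised task. Conclusion Without changing the original network structure, the proposed algorithm improves the accuracy of semi-supervised and weakly supervised image semantic segmentation models, and the gain is most pronounced for targets and regions with distinct geometric features.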
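The patch division and manifold constraint steps described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the abstract does not give the exact form of the constraint, so the sketch assumes a classical Laplacian-style penalty with a Gaussian affinity. The function names `split_into_patches` and `manifold_regularizer`, and the parameters `patch_size` and `sigma`, are hypothetical.

```python
import numpy as np


def split_into_patches(image, patch_size):
    """Split an (H, W, C) array into non-overlapping patches, one flattened row each."""
    h, w, c = image.shape
    rows, cols = h // patch_size, w // patch_size
    # Crop so that height and width are exact multiples of patch_size.
    cropped = image[: rows * patch_size, : cols * patch_size]
    patches = cropped.reshape(rows, patch_size, cols, patch_size, c)
    # Bring the two grid axes to the front: (rows, cols, patch_size, patch_size, c).
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(rows * cols, patch_size * patch_size * c)


def manifold_regularizer(image, prediction, patch_size=4, sigma=1.0):
    """Assumed Laplacian-style penalty: similar input patches should get similar predictions."""
    x = split_into_patches(image, patch_size)       # input-domain patches
    y = split_into_patches(prediction, patch_size)  # output-domain patches on the same grid
    # Pairwise squared distances between input patches and between output patches.
    dx = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)
    dy = ((y[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    # Gaussian affinity (an assumed choice) computed from the input patches only.
    weights = np.exp(-dx / (2.0 * sigma ** 2))
    n = x.shape[0]
    return float((weights * dy).sum() / (n * n))
```

In a training loop, a term like this would be added to the segmentation loss with a weighting coefficient, so that the network structure itself stays unchanged. The pairwise distance matrices grow quadratically with the number of patches, which is one reason to work on patches rather than on individual pixels.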
Keywords
Cross-consistent semantic segmentation algorithm based on manifold regularization

Liu Lamei1, Zong Jiaxu1, Xiao Zhenjiu1, Lan Hai2, Qu Haicheng1 (1. College of Software, Liaoning Technical University, Huludao 125105, China; 2. Quanzhou Institute of Equipment Manufacturing, Quanzhou 362000, China)

Abstract
Objective Image semantic segmentation is a pixel-level classification task that assigns every pixel in an image to a category, and it can be viewed as an extension of image classification. Its applications include scene understanding, autonomous driving, and clinical diagnosis. However, training deep learning models requires a large amount of labeled data, and obtaining such labels for semantic segmentation is time-consuming and labor-intensive. Deep semi-supervised learning has therefore attracted attention because it exploits large amounts of unlabeled data and reduces the demand for labeled data. Current methods nevertheless struggle to capture and constrain contextual information, and existing techniques for adding contextual information often slow down the network's inference to varying degrees. We therefore develop a semi-supervised semantic segmentation method with manifold regularization on the basis of cross-consistency training. Method We assume that the input data and the corresponding predictions share the same geometric structure on a low-dimensional manifold embedded in the high-dimensional original data space, and we use this geometric structure to construct regularization constraints. First, we design a manifold regularization penalty that integrates single-pixel information with neighborhood context information. The underlying geometric intuition is that data in the original image have the same local geometric shape as the corresponding data in the segmentation result. Next, the manifold regularization constraint is combined with current mainstream semi-supervised and weakly supervised image segmentation algorithms, which shows that it can adapt to different segmentation tasks.
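The shared-geometry assumption is commonly written as a graph-Laplacian style penalty. The form below is a generic sketch, not the paper's exact term. The symbols $x_i$ (the $i$-th input sub-patch), $f$ (the segmentation network), $N$ (the number of sub-patches), $W_{ij}$ (a Gaussian affinity) and $\sigma$ (its bandwidth) are assumptions introduced here for illustration.

```latex
\mathcal{R}(f) = \frac{1}{N^{2}} \sum_{i=1}^{N} \sum_{j=1}^{N}
    W_{ij} \, \lVert f(x_i) - f(x_j) \rVert_2^{2},
\qquad
W_{ij} = \exp\!\left( -\frac{\lVert x_i - x_j \rVert_2^{2}}{2\sigma^{2}} \right)
```

When two input patches are close, $W_{ij}$ is near 1 and any difference between their predictions is penalized. When they are far apart, $W_{ij}$ is near 0 and the predictions are left unconstrained.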
In both the semi-supervised and the weakly supervised manifold regularization algorithms, a cutting-edge cross-consistency training model is selected as the backbone network. Cross-consistency semi-supervised training applies different forms of perturbation to the encoder output to strengthen the predictive invariance of the model. We build the model with the open-source toolbox PyTorch and adopt stochastic gradient descent (SGD) as the optimizer. The experimental platform runs CentOS 7 with an NVIDIA RTX 2080Ti graphics processing unit (GPU) and an Intel(R) Core(TM) i7-6850 CPU. Result Adding manifold regularization constraints captures contextual information in the image, reduces the loss of intrinsic structure caused by the network's forward computation, and improves the accuracy of the algorithm. To verify the effectiveness of the algorithm, experiments are conducted on two types of task, semi-supervised and weakly supervised semantic segmentation. On the pattern analysis, statistical modeling and computational learning visual object classes 2012 (PASCAL VOC 2012) dataset, our algorithm improves on the original network by 3.7% for semi-supervised semantic segmentation and by 1.1% for weakly supervised semantic segmentation. Furthermore, we visualize the segmentation results of different models. The segmentation results generated under manifold regularization constraints have more refined edges and fewer errors. Conclusion Our algorithm captures contextual information through manifold regularization constraints and improves both semi-supervised and weakly supervised tasks without changing the original network structure. The experimental results indicate that the algorithm has good generalization and optimization potential.
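The cross-consistency idea mentioned in the Method part, perturbing the encoder output and asking the predictions to stay the same, can be illustrated with a small NumPy sketch. Everything here is a simplifying assumption and not the authors' PyTorch code: a single linear map stands in for each decoder, one auxiliary decoder is assumed per perturbation, the two perturbations (`feature_noise`, `feature_dropout`) are illustrative choices, and mean squared error is used as the consistency measure.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax over the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def feature_noise(features, rng, scale=0.3):
    """Assumed perturbation: multiplicative uniform noise on the encoder output."""
    noise = rng.uniform(-scale, scale, size=features.shape)
    return features + features * noise


def feature_dropout(features, rng, drop_prob=0.5):
    """Assumed perturbation: randomly zero whole feature channels."""
    keep = rng.random(features.shape[-1]) >= drop_prob
    return features * keep


def cross_consistency_loss(features, main_weights, aux_branches, rng):
    """Average MSE between the main prediction and each perturbed auxiliary prediction.

    features:     (num_pixels, channels) encoder output
    main_weights: (channels, num_classes) stand-in for the main decoder
    aux_branches: list of (perturbation_fn, aux_weights) pairs
    """
    main_pred = softmax(features @ main_weights)
    losses = []
    for perturb, aux_weights in aux_branches:
        aux_pred = softmax(perturb(features, rng) @ aux_weights)
        losses.append(np.mean((aux_pred - main_pred) ** 2))
    return float(np.mean(losses))
```

On unlabeled images only this consistency term is available, which is what lets the unlabeled data contribute to training. In an actual PyTorch setup the main prediction would typically be treated as a fixed target when computing this loss.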
Keywords
