Stereo matching algorithm based on edge preservation and improved cost aggregation

Cheng Deqiang1,2, Li Haixiang2, Kou Qiqi3, Yu Zekuan4, Zhuang Huandong2, Lyu Chen2 (1. Engineering Research Center of Intelligent Control for Underground Space, Ministry of Education, China University of Mining and Technology, Xuzhou 221000, China; 2. School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221000, China; 3. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221000, China; 4. Academy for Engineering and Technology, Fudan University, Shanghai 200433, China)

Abstract
Objective Stereo matching is an important research direction in binocular computer vision and is generally divided into two categories: global and local matching algorithms. Traditional local stereo matching algorithms have low computational complexity and can satisfy real-time requirements, but they fail to make full use of the edge texture information of an image, so their matching accuracy in non-occluded regions and regions with disparity discontinuity is unsatisfactory. To address this, a stereo matching algorithm that fuses edge preservation with improved cost aggregation is proposed. Method First, a weight matrix is constructed from the edge spatial information of the image and fused, by weighting, with the absolute intensity difference and the gradient cost to form a new cost computation scheme. At the same time, the weight information of pixels in edge regions is combined with the regularization term of the guided filter, and cost aggregation is performed within a multi-resolution (cross-scale) framework. The aggregated result is passed through disparity computation to obtain an initial disparity map, which is then refined by left-right consistency checking, weighted median filtering, and other disparity refinement steps to obtain the final disparity map. Result Experiments on the Middlebury stereo matching benchmark show that fusing the edge weight information distinguishes the cost volume of edge pixels more effectively and improves the matching accuracy of the algorithm in all regions. Without any disparity refinement step, the average mismatch rate on the 21 extended image pairs is reduced by 3.48% compared with the original algorithm, and the peak signal-to-noise ratio is improved by 3.57 dB. On Venus, one of the four standard image pairs, the mismatch rate in non-occluded regions after disparity refinement is only 0.18%. Conclusion The proposed multi-scale stereo matching algorithm with edge preservation effectively improves matching accuracy at edge textures and further reduces the mismatch rate in non-occluded regions and regions with disparity discontinuity.
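The abstract describes constructing a weight matrix from the edge spatial information of the image. As a hypothetical illustration only: the paper does not name the edge detector or the reassignment rule, so the sketch below assumes Canny edges smoothed by a Gaussian and normalized to [0, 1]; the function name and parameters are likewise illustrative.

```python
# Hypothetical sketch of an edge-weight matrix: Canny edges, Gaussian
# smoothing, and min-max normalization are assumptions for illustration.
import cv2
import numpy as np

def edge_weight_matrix(gray_uint8, low=50, high=150, sigma=2.0):
    """Return an HxW weight map that is close to 1 near image edges."""
    edges = cv2.Canny(gray_uint8, low, high)                          # binary edge map (0 / 255)
    soft = cv2.GaussianBlur(edges.astype(np.float32), (0, 0), sigma)  # spread edge influence spatially
    return soft / soft.max() if soft.max() > 0 else soft              # normalize to [0, 1]

# Example: W = edge_weight_matrix(cv2.imread("left.png", cv2.IMREAD_GRAYSCALE))
```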
Keywords
Stereo matching algorithm based on edge preservation and improved cost aggregation

Cheng Deqiang1,2, Li Haixiang2, Kou Qiqi3, Yu Zekuan4, Zhuang Huandong2, Lyu Chen2(1.Engineering Research Center of Intelligent Control for Underground Space, Ministry of Education, China University of Mining and Technology, Xuzhou 221000, China;2.School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221000, China;3.School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221000, China;4.Academy for Engineering and Technology, Fudan University, Shanghai 200433, China)

Abstract
Objective Stereo matching is an important part of the field of binocular stereo vision. It reconstructs 3D objects or scenes from a pair of 2D images by simulating the human visual system. Stereo matching is widely used in various fields, such as unmanned vehicles, 3D non-contact measurement, and robot navigation. Most stereo matching algorithms can be divided into two types: global and local stereo matching algorithms. A global algorithm obtains a disparity map by minimizing an energy function and has the advantage of high matching accuracy. However, a global stereo matching algorithm has high computational complexity, which makes it difficult to apply in fields that require fast processing. Local matching algorithms use only the neighborhood information of pixels within a window to perform pixel-by-pixel matching, and thus their matching accuracy is lower than that of global algorithms. Local algorithms, however, have lower computational complexity, which expands the application range of stereo matching. Local stereo matching algorithms generally have four steps: cost computation, cost aggregation, disparity computation, and disparity refinement. In cost computation, the cost value of each pixel in the left and right images is computed by the designed algorithm at all disparity levels. The correlation between the pixel to be matched and the candidate pixel is measured using the cost value; a smaller cost value corresponds to higher relevance. In cost aggregation, a local matching algorithm aggregates the cost values within a matching window by summing, averaging, or other methods to obtain a cumulative cost value and thereby reduce the impact of outliers. In the last two steps, the disparity of each pixel is calculated using local optimization methods and refined using different post-processing methods. However, traditional local stereo matching algorithms cannot fully utilize the edge texture information of images, so such algorithms still perform poorly in matching accuracy in non-occluded regions and regions with disparity discontinuity. A multi-scale stereo matching algorithm based on edge preservation is proposed to meet the real-time requirements of realistic scenes and to improve matching accuracy in non-occluded regions and regions with disparity discontinuity. Method We use edge detection to obtain the edge matrix of an image. The values in the obtained edge image are filtered, reassigned, and normalized to obtain an edge weight matrix. In traditional cost computation, the method of combining the absolute difference with the gradient fully utilizes the pixel relationship among the three channels (R, G, and B) of an image; it offers some improvement in regions with disparity discontinuity while keeping the algorithm fast, but high matching accuracy cannot be guaranteed in edge regions. We fuse the obtained weight matrix with the absolute difference and gradient terms and set truncation thresholds for both terms to reduce the influence of outliers on the cost volume, finally forming a new cost computation function. The new cost function assigns a lower cost to corresponding pixels in the textured areas of the left and right images and thus achieves better discrimination in edge regions.
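The sketch below illustrates a cost computation of this kind under stated assumptions: it uses the common truncated absolute-difference (AD) plus gradient formulation, and the weighting constant alpha, the truncation thresholds t_ad and t_grad, and the (1 + W) scaling by the edge weight are illustrative choices rather than the paper's exact fusion rule or tuned parameters.

```python
# Minimal sketch of a fused cost volume: truncated color AD + truncated
# gradient difference, scaled by an edge weight map W. Border wrap-around
# from np.roll is a simplification kept for brevity.
import numpy as np

def fused_cost_volume(left, right, W, max_disp, alpha=0.1, t_ad=7.0, t_grad=2.0):
    """left, right: HxWx3 float images; W: HxW edge weights in [0, 1]."""
    gray_l, gray_r = left.mean(axis=2), right.mean(axis=2)
    grad_l = np.gradient(gray_l, axis=1)            # horizontal intensity gradient
    grad_r = np.gradient(gray_r, axis=1)
    h, w = gray_l.shape
    cost = np.empty((max_disp, h, w), dtype=np.float32)
    for d in range(max_disp):
        r_shift = np.roll(right, d, axis=1)         # candidate right-view pixels at disparity d
        g_shift = np.roll(grad_r, d, axis=1)
        ad = np.minimum(np.abs(left - r_shift).mean(axis=2), t_ad)   # truncated color AD
        gd = np.minimum(np.abs(grad_l - g_shift), t_grad)            # truncated gradient difference
        cost[d] = (1.0 + W) * (alpha * ad + (1.0 - alpha) * gd)      # edge weight sharpens discrimination at edges
    return cost
```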
In cost aggregation, the edge weight information is combined with the regularization term of a guided image filter, and aggregation is performed in a cross-scale framework. By changing the fixed regularization term of the guided filter, a larger smoothing factor is applied to pixels closer to an edge in the edge texture regions of an image, whereas a smaller smoothing factor is applied to pixels farther from the edge. Therefore, points closer to the edge acquire a lower cost value. In disparity computation, we select the point with the smallest cumulative cost value as the corresponding point to obtain the initial disparity map. This map is processed using disparity refinement methods, such as weighted median filtering, hole filling, and a left-right consistency check, to obtain the final disparity map. Result We test the algorithm on the Middlebury stereo matching benchmark. Experimental results show that fusing the edge weight information distinguishes the cost volume of pixels in edge regions more effectively, and the number of mismatched pixels at image edges is considerably reduced. Moreover, after fusing image information at different scales, the matching accuracy in smooth areas is improved. Without any disparity refinement step, the average mismatch rate of the proposed algorithm on the 21 extended image pairs is reduced by 3.48% compared with the original algorithm. The average mismatch rate of the proposed algorithm on the four standard image pairs of the Middlebury benchmark is 5.77%, which is better than those of the listed comparison algorithms. Moreover, the mismatch rate of the proposed algorithm on the Venus image pair is 0.18% in non-occluded regions and 0.39% over all regions. The average peak signal-to-noise ratio of the proposed algorithm on the 21 extended image pairs is 20.48 dB, and the deviation of the obtained initial disparity map from the ground-truth disparity map is the smallest among the listed algorithms. The average running time of the proposed algorithm on the 21 extended image pairs is 17.74 s, an increase of only 0.73 s over the original algorithm, so it still maintains good real-time performance. Conclusion In this study, we propose a stereo matching algorithm based on edge preservation and an improved guided filter. The proposed algorithm effectively improves matching accuracy in texture regions and further reduces the mismatch rate in non-occluded regions and regions with disparity discontinuity.
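The following is a minimal sketch, under stated assumptions, of cost aggregation with a guided filter whose regularization term is modulated by the edge weight: the modulation rule eps = eps0 * (1 + k * W), the box-filter radius, and the constants are illustrative, and the cross-scale fusion across resolutions described in the paper is omitted.

```python
# Sketch: guided-filter aggregation with a spatially varying regularization
# term; pixels near edges (large W) receive a larger regularization value.
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter_varying_eps(I, p, W, radius=9, eps0=1e-3, k=10.0):
    """Filter one cost slice p (HxW) with guidance image I (HxW, float in [0, 1])."""
    box = lambda x: uniform_filter(x, size=2 * radius + 1, mode="nearest")
    eps = eps0 * (1.0 + k * W)                 # spatially varying regularization (assumed rule)
    mean_I, mean_p = box(I), box(p)
    cov_Ip = box(I * p) - mean_I * mean_p
    var_I = box(I * I) - mean_I * mean_I
    a = cov_Ip / (var_I + eps)                 # local linear coefficients of the guided filter
    b = mean_p - a * mean_I
    return box(a) * I + box(b)                 # aggregated cost slice

def aggregate_cost_volume(cost, guide, W):
    """Apply the filter to every disparity slice of a (D, H, W) cost volume."""
    return np.stack([guided_filter_varying_eps(guide, c, W) for c in cost])
```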
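The last two stages can be sketched as follows, assuming standard winner-take-all (WTA) disparity selection and a left-right consistency check with a 1-pixel tolerance; the weighted median filtering and hole filling that complete the refinement are omitted for brevity.

```python
# Sketch: WTA disparity selection and left-right consistency check.
import numpy as np

def wta_disparity(cost):
    """cost: (D, H, W) aggregated cost volume -> HxW integer disparity map."""
    return np.argmin(cost, axis=0).astype(np.int32)

def lr_consistency_mask(disp_left, disp_right, tol=1):
    """True where the left disparity agrees with the re-projected right disparity."""
    h, w = disp_left.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xr = np.clip(xs - disp_left, 0, w - 1)      # matching column in the right view
    return np.abs(disp_left - disp_right[ys, xr]) <= tol

# Pixels where the mask is False are treated as occlusions or mismatches,
# filled from reliable neighbors, and smoothed with a weighted median filter.
```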
Keywords
