融合边缘保持与改进代价聚合的立体匹配算法
Stereo matching algorithm based on edge preservation and improved cost aggregation
2021年26卷第2期 页码: 438-451
纸质出版日期: 2021-02-16
录用日期: 2020-06-05
DOI: 10.11834/jig.200041
程德强, 李海翔, 寇旗旗, 于泽宽, 庄焕东, 吕晨. 融合边缘保持与改进代价聚合的立体匹配算法[J]. 中国图象图形学报, 2021,26(2):438-451.
Deqiang Cheng, Haixiang Li, Qiqi Kou, Zekuan Yu, Huandong Zhuang, Chen Lyu. Stereo matching algorithm based on edge preservation and improved cost aggregation[J]. Journal of Image and Graphics, 2021,26(2):438-451.
目的
立体匹配是计算机双目视觉的重要研究方向,主要分为全局匹配算法与局部匹配算法两类。传统的局部立体匹配算法计算复杂度低,可以满足实时性的需要,但是未能充分利用图像的边缘纹理信息,因此在非遮挡、视差不连续区域的匹配精度欠佳。为此,提出了融合边缘保持与改进代价聚合的立体匹配算法。
方法
首先利用图像的边缘空间信息构建权重矩阵,与灰度差绝对值和梯度代价进行加权融合,形成新的代价计算方式,同时将边缘区域像素点的权重信息与引导滤波的正则化项相结合,并在多分辨率尺度的框架下进行代价聚合。所得结果经过视差计算,得到初始视差图,再通过左右一致性检测、加权中值滤波等视差优化步骤获得最终的视差图。
结果
在Middlebury立体匹配平台上进行实验,结果表明,融合边缘权重信息对边缘处像素点的代价量进行了更加有效的区分,能够提升算法在各区域的匹配精度。其中,未加入视差优化步骤的21组扩展图像对的平均误匹配率较改进前减少3.48%,峰值信噪比提升3.57 dB,在标准4幅图中venus上经过视差优化后非遮挡区域的误匹配率仅为0.18%。
结论
融合边缘保持的多尺度立体匹配算法有效提升了图像在边缘纹理处的匹配精度,进一步降低了非遮挡区域与视差不连续区域的误匹配率。
Objective
Stereo matching is an important part of the field of binocular stereo vision. It reconstructs 3D objects or scenes from a pair of 2D images by simulating the human visual system, and it is widely used in fields such as unmanned vehicles, 3D noncontact measurement, and robot navigation. Most stereo matching algorithms can be divided into two types: global and local. A global algorithm obtains a disparity map by minimizing an energy function and offers high matching accuracy; however, its high computational complexity makes it difficult to apply in fields that require fast execution. Local matching algorithms use only the neighborhood information of pixels within a window to perform pixel-by-pixel matching, so their matching accuracy is lower than that of global algorithms, but their lower computational complexity expands the application range of stereo matching. Local stereo matching algorithms generally comprise four steps: cost computation, cost aggregation, disparity computation, and disparity refinement. In cost computation, the cost value of each pixel in the left and right images is computed at all disparity levels; the cost value measures the correlation between the pixel to be matched and a candidate pixel, and a smaller cost value corresponds to higher relevance. In cost aggregation, a local matching algorithm aggregates the cost values within a matching window by summing, averaging, or other methods to obtain a cumulative cost value that reduces the impact of outliers. In the last two steps, the disparity of each pixel is calculated using local optimization methods and refined using post-processing methods. However, traditional local stereo matching algorithms cannot fully utilize the edge texture information of images, so they still exhibit poor matching accuracy in non-occluded regions and regions with disparity discontinuity. A multi-scale stereo matching algorithm based on edge preservation is proposed to meet the real-time requirements of realistic scenes and to improve matching accuracy in non-occluded regions and regions with disparity discontinuity.
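The four canonical steps above can be sketched as a minimal local pipeline. This is a simplified illustration on grayscale images with hypothetical parameter names, not the proposed algorithm: the cost is a plain absolute difference, aggregation is a box window, and step four (refinement) is omitted.

```python
import numpy as np

def local_stereo(left, right, max_disp, win=3):
    """Minimal local stereo pipeline on grayscale images:
    (1) cost computation, (2) box-window cost aggregation,
    (3) winner-take-all disparity selection.
    Step (4), disparity refinement, is omitted; all parameters are illustrative."""
    h, w = left.shape
    cost = np.full((max_disp, h, w), 1e9)  # cost volume, one slice per disparity level
    for d in range(max_disp):
        # Step 1: absolute-difference cost at disparity d
        # (left pixel x is compared against right pixel x - d)
        cost[d, :, d:] = np.abs(left[:, d:] - right[:, :w - d])
    # Step 2: aggregate by summing costs over a win x win window
    r = win // 2
    agg = np.empty_like(cost)
    for d in range(max_disp):
        p = np.pad(cost[d], r, mode="edge")
        s = np.zeros((h, w))
        for dy in range(win):
            for dx in range(win):
                s += p[dy:dy + h, dx:dx + w]
        agg[d] = s
    # Step 3: winner-take-all -- pick the disparity with the smallest cumulative cost
    return np.argmin(agg, axis=0)
```

On a rectified pair where the right image is the left shifted by a constant disparity, the winner-take-all step recovers that disparity in the interior of the image.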
Method
We use edge detection to obtain the edge matrix of an image. The values in the obtained edge image are filtered, reassigned, and normalized to form an edge weight matrix. In traditional cost computation, combining the absolute difference with the gradient fully utilizes the pixel relationships among the three channels (R, G, and B) of an image; it keeps the algorithm fast but yields limited improvement in regions with disparity discontinuity, and higher matching accuracy cannot be guaranteed in edge regions. We therefore fuse the obtained weight matrix with the absolute difference and gradient transforms and set a truncation threshold for both terms to reduce the influence of outliers on the cost volume, forming a new cost computation function. The new function assigns a smaller cost volume to pixels in the texture areas of the left and right images, and thus achieves better discrimination in edge regions. In cost aggregation, the edge weight information is combined with the regularization term of a guided image filter to perform aggregation in a cross-scale framework. By replacing the fixed regularization term of the guided filter, a larger smoothing factor is applied to pixels closer to an edge in the edge texture regions of an image, whereas a smaller smoothing factor is applied to points farther from an edge; therefore, points closer to an edge acquire a lower cost value. In disparity computation, we select the point with the smallest cumulative cost value as the corresponding point to obtain the initial disparity map. This map is processed with disparity refinement methods, such as weighted median filtering, hole filling, and a left-right consistency check, to obtain the final disparity map.
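The fused cost term can be illustrated roughly as follows. This is a sketch, not the paper's equations: the exact weighting scheme, the thresholds `tau_ad` and `tau_grad`, the blend factor `alpha`, and the form of `edge_weight` are all assumptions made for the example.

```python
import numpy as np

def edge_weight(img):
    """Illustrative stand-in for the paper's edge weight matrix: an edge
    response (here, gradient magnitude) normalized and mapped to [1, 2]."""
    gy, gx = np.gradient(img)
    mag = np.hypot(gx, gy)
    return 1.0 + mag / (mag.max() + 1e-12)

def fused_cost_slice(left, right, d, w_edge, alpha=0.5, tau_ad=0.03, tau_grad=0.01):
    """One disparity slice of an edge-weighted cost: a truncated absolute
    difference blended with a truncated x-gradient difference, then
    modulated by the edge weight so edge pixels are better discriminated."""
    h, w = left.shape
    # truncated absolute-difference (AD) term
    ad = np.minimum(np.abs(left[:, d:] - right[:, :w - d]), tau_ad)
    # truncated gradient term (horizontal image gradients)
    gl = np.gradient(left, axis=1)
    gr = np.gradient(right, axis=1)
    gd = np.minimum(np.abs(gl[:, d:] - gr[:, :w - d]), tau_grad)
    # blend the two terms; columns without a match keep the truncation value
    c = np.full((h, w), alpha * tau_ad + (1.0 - alpha) * tau_grad)
    c[:, d:] = alpha * ad + (1.0 - alpha) * gd
    # modulate by the edge weight matrix
    return w_edge * c
```

The truncation thresholds cap each term, so outliers cannot dominate the cost volume; the edge weight then amplifies the contrast of cost values around edges.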
Result
We test the algorithm on the Middlebury stereo matching benchmark. Experimental results show that fusing the edge weight information distinguishes the cost volumes of pixels in edge regions more effectively, considerably reducing the number of mismatched pixels in the edge regions of an image. Moreover, after fusing image information at different scales, the matching accuracy in smooth areas is improved. Without any disparity refinement steps, the average error matching rate of the proposed algorithm on 21 extended image pairs is reduced by 3.48% compared with the original algorithm. The average error matching rate of the proposed algorithm on the four standard image pairs of the Middlebury benchmark is 5.77%, which is better than those of the listed comparison algorithms. For the venus image pair, the error matching rate of the proposed algorithm is 0.18% in non-occluded regions and 0.39% over all regions. The average peak signal-to-noise ratio of the proposed algorithm on the 21 extended image pairs is 20.48 dB; among the listed algorithms, the pixel disparities of its initial disparity map deviate least from the real disparity map. The average running time of the proposed algorithm on the 21 extended image pairs is 17.74 s, an increase of 0.73 s over the original algorithm, and it still maintains good real-time performance.
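The two reported metrics can be computed as follows. These are the standard definitions; the 1-pixel bad-pixel threshold and the peak value are assumptions about the evaluation setup, not values stated in the paper.

```python
import numpy as np

def error_matching_rate(disp, gt, thresh=1.0, mask=None):
    """Percentage of pixels whose disparity deviates from ground truth by
    more than `thresh` pixels (the Middlebury bad-pixel criterion);
    `mask` restricts evaluation to a region such as the non-occluded pixels."""
    bad = np.abs(disp.astype(float) - gt) > thresh
    if mask is not None:
        bad = bad[mask]
    return 100.0 * bad.mean()

def psnr(disp, gt, peak=255.0):
    """Peak signal-to-noise ratio of a disparity map against ground truth."""
    mse = np.mean((disp.astype(float) - gt) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```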
Conclusion
In this study, we propose a stereo matching algorithm based on edge preservation and an improved guided filter. The proposed algorithm effectively improves the matching accuracy in texture regions, further reducing the error matching rate in non-occluded regions and regions with disparity discontinuity.
计算机视觉; 局部立体匹配; 代价计算; 边缘保持; 引导滤波
computer vision; local stereo matching; cost computation; edge preservation; guided image filter