Published: 2021-02-16
DOI: 10.11834/jig.200041
2021 | Volume 26 | Number 2

Image Understanding and Computer Vision

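Stereo matching algorithm based on edge preservation and improved cost aggregation

Cheng Deqiang¹,², Li Haixiang², Kou Qiqi³, Yu Zekuan⁴, Zhuang Huandong², Lyu Chen²

1. Engineering Research Center of Intelligent Control for Underground Space, Ministry of Education, China University of Mining and Technology, Xuzhou 221000, China;

2. School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221000, China;

3. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221000, China;

4. Academy for Engineering and Technology, Fudan University, Shanghai 200433, China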

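Received: 2020-02-08; Revised: 2020-05-29; Preprint: 2020-06-05

Funding: National Natural Science Foundation of China (51774281)

About the authors: Cheng Deqiang, born in 1979, male, professor. His main research interests are machine vision and pattern recognition, and intelligent image detection and information processing. E-mail: chengdq@cumt.edu.cn;
Li Haixiang, corresponding author, male, master's student. His main research interests are pattern recognition and 3D reconstruction. E-mail: hjcs_57@163.com;
Kou Qiqi, male, lecturer. His main research interests are image processing and pattern recognition. E-mail: kouqiqi@cumt.edu.cn;
Yu Zekuan, male, postdoctoral researcher. His main research interest is medical image processing. E-mail: yzk@fudan.edu.cn;
Zhuang Huandong, male, master's student. His main research interests are stereo matching and image processing. E-mail: hdzhuang@cumt.edu.cn;
Lyu Chen, male, master's student. His main research interest is pattern recognition. E-mail: 286562685@qq.com

CLC number: TN911.73

Document code: A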

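Abstract

Objective Stereo matching is an important research direction in binocular computer vision, and its algorithms fall into two classes: global and local. Traditional local stereo matching algorithms have low computational complexity and can meet real-time requirements, but they do not make full use of the edge and texture information of the image, so their matching accuracy in non-occluded regions and regions with disparity discontinuity is unsatisfactory. To address this, a stereo matching algorithm that combines edge preservation with improved cost aggregation is proposed. Method A weight matrix is first built from the edge spatial information of the image and fused, by weighting, with the absolute intensity difference and gradient costs to form a new cost computation. The weights of pixels in edge regions are also combined with the regularization term of the guided filter, and cost aggregation is carried out in a multi-resolution framework. Disparity computation on the aggregated cost gives the initial disparity map, and disparity refinement steps such as the left-right consistency check and weighted median filtering give the final disparity map. Result Experiments on the Middlebury stereo matching benchmark show that fusing edge weight information separates the costs of pixels at edges more effectively and improves matching accuracy in all regions. Without disparity refinement, the average error matching rate over 21 extended image pairs is 3.48 percentage points lower than before the improvement and the peak signal-to-noise ratio is 3.57 dB higher. With disparity refinement, the error matching rate in the non-occluded region of venus, one of the four standard images, is only 0.18%. Conclusion The multi-scale stereo matching algorithm with edge preservation improves matching accuracy at image edges and textures and further lowers the error matching rate in non-occluded regions and regions with disparity discontinuity.

Keywords

computer vision; local stereo matching; cost computation; edge preservation; guided filter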

Stereo matching algorithm based on edge preservation and improved cost aggregation

Cheng Deqiang¹,², Li Haixiang², Kou Qiqi³, Yu Zekuan⁴, Zhuang Huandong², Lyu Chen²

1. Engineering Research Center of Intelligent Control for Underground Space, Ministry of Education, China University of Mining and Technology, Xuzhou 221000, China;

2. School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221000, China;

3. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221000, China;

4. Academy for Engineering and Technology, Fudan University, Shanghai 200433, China

Supported by: National Natural Science Foundation of China (51774281)

Abstract

Objective Stereo matching is an important part of binocular stereo vision. It reconstructs 3D objects or scenes from a pair of 2D images by simulating the human visual system, and it is widely used in fields such as unmanned vehicles, 3D non-contact measurement, and robot navigation. Most stereo matching algorithms are either global or local. A global algorithm obtains a disparity map by minimizing an energy function and achieves high matching accuracy, but its high computational complexity makes it difficult to apply where fast response is required. A local algorithm uses only the neighborhood information of the pixels in a window to match pixel by pixel, so its accuracy is lower than that of global algorithms, but its lower computational complexity widens the application range of stereo matching. Local stereo matching algorithms generally have four steps: cost computation, cost aggregation, disparity computation, and disparity refinement. In cost computation, a cost value is computed for each pixel of the left and right images at every disparity level. The cost value measures the correlation between the pixel to be matched and a candidate pixel; a smaller cost means a higher correlation. In cost aggregation, the cost values within a matching window are combined by summing, averaging, or other methods into a cumulative cost that reduces the impact of outliers. In the last two steps, the disparity of each pixel is computed by local optimization and refined by post-processing. However, traditional local stereo matching algorithms cannot fully use the edge and texture information of images, so they still match poorly in non-occluded regions and regions with disparity discontinuity. A multi-scale stereo matching algorithm based on edge preservation is proposed to meet the real-time requirements of realistic scenes and to improve matching accuracy in these regions.

Method We use edge detection to obtain the edge matrix of an image. The values in the edge image are filtered, reassigned, and normalized to obtain an edge weight matrix. In traditional cost computation, combining the absolute difference with the gradient makes full use of the relationship among the three channels (R, G, and B) of an image. This combination keeps the algorithm fast and gives a limited improvement in regions with disparity discontinuity, but it cannot guarantee high matching accuracy in edge regions. We fuse the edge weight matrix with the absolute difference and the gradient, and set truncation thresholds for the absolute difference and gradient terms to reduce the influence of outliers on the cost volume, which gives a new cost computation function. The new function assigns a smaller cost to corresponding pixels in textured areas of the left and right images and therefore discriminates better in edge regions. In cost aggregation, the edge weight information is combined with the regularization term of a guided image filter, and aggregation is performed in a cross-scale framework. Instead of a fixed regularization term, a larger smoothing factor is applied to pixels closer to an edge in the edge and texture regions of an image and a smaller smoothing factor to pixels farther from the edge, so points closer to the edge acquire a lower cost value. In disparity computation, we select the point with the smallest cumulative cost as the corresponding point to obtain the initial disparity map. This map is then processed by disparity refinement methods, such as the left-right consistency check, hole filling, and weighted median filtering, to obtain the final disparity map.

Result We test the algorithm on the Middlebury stereo matching benchmark. The results show that fusing texture weight information distinguishes the costs of pixels in edge regions more effectively and considerably reduces the number of mismatched pixels there. Moreover, fusing image information from different scales improves the matching accuracy in smooth areas. For 21 extended image pairs without any disparity refinement, the average error matching rate of the proposed algorithm is 3.48 percentage points lower than that of the original algorithm. For the four standard image pairs of the Middlebury benchmark, the average error matching rate of the proposed algorithm is 5.77%, lower than those of the listed comparison algorithms. For the venus image pair, the error matching rate is 0.18% in non-occluded regions and 0.39% in all regions. The average peak signal-to-noise ratio of the proposed algorithm on the 21 extended image pairs is 20.48 dB, and the pixel disparities of its initial disparity maps deviate least from the ground-truth disparity maps among the listed algorithms. The average running time of the proposed algorithm for the 21 extended image pairs is 17.74 s, which is 0.73 s more than that of the original algorithm and still maintains good real-time performance.

Conclusion In this study, we propose a stereo matching algorithm based on edge preservation and an improved guided filter. The proposed algorithm effectively improves matching accuracy in texture regions and further reduces the error matching rate in non-occluded regions and regions with disparity discontinuity.

Key words

computer vision; local stereo matching; cost computation; edge preservation; guided image filter

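0 Introduction

Binocular stereo vision is an important branch of computer vision that perceives the objective world by imitating the biological visual system (Luo, 2012). Stereo matching is a key component of binocular stereo vision: it computes the pixel disparity between two images to obtain matching points and from them recovers the geometry and depth of 3D objects or scenes. It is widely used in autonomous driving (Howard, 2008), 3D non-contact measurement, virtual reality, and other fields. Scharstein and Szeliski (2002) gave a comprehensive account of typical stereo matching algorithms and divided the main methods into global and local algorithms. Classic global algorithms include dynamic programming (DP) (Van Meerbergen et al., 2002), belief propagation (BP) (Felzenszwalb and Huttenlocher, 2004), and minimum spanning trees (MST) (Yang, 2012). Global algorithms obtain the disparity map by minimizing an energy function over multiple iterations and achieve high matching accuracy, but their computational complexity and running time are high, which makes them hard to use in real-time applications. Local stereo matching algorithms match pixel by pixel using the neighborhood information of the pixels in a support window. Their matching accuracy is lower than that of global algorithms, but their lower computational complexity gives faster operation and better real-time behaviour, and widens the range of applications of stereo matching.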

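Local stereo matching algorithms can be broadly divided into four steps: cost computation, cost aggregation, disparity computation, and disparity refinement (Scharstein and Szeliski, 2002). Mainstream cost computation methods include absolute intensity differences (AD), sum of absolute differences (SAD), and adaptive weight (AW) (Yoon and Kweon, 2006). Zabih and Woodfill (1994) proposed the non-parametric transforms Census and Rank, which describe the difference between points with the Hamming distance but perform poorly in repetitive image regions. The AD-based method makes full use of the relationship among the colour channels, is more accurate than the Census transform in repetitive regions, and has low computational complexity. Hosni et al. (2013a) fused the absolute intensity difference with the gradient into a new cost function that is robust to amplitude distortion, but the matching accuracy at edges and textures remains poor and mismatches occur easily in non-occluded regions and regions with disparity discontinuity.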

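The raw cost volume produced by cost computation is easily affected by noise and other outliers, so the true corresponding pixel often does not have the lowest cost. Aggregating the initial costs of the pixels in a support window into a cumulative cost for each point increases the piecewise coherence of the disparity map within the window. To this end, Yoon and Kweon (2006) introduced the bilateral filter into cost aggregation and aggregated directly over a large window, which improves matching accuracy to some extent but is computationally expensive. Yang (2012) extended the kernel size to the whole image and proposed the non-local method. Hosni et al. (2013b) proposed using the guided image filter, whose computational complexity is independent of the support window size. It has been widely adopted, but it still cannot separate the costs of pixels at edges and textures effectively. Zhang et al. (2014) proposed a cross-scale cost aggregation framework that, unlike aggregation at a single resolution, fuses the cumulative costs of different scales and achieves good results. This paper proposes an edge-preserving cost computation algorithm, uses edge weights to improve the regularization term of the guided filter, and integrates the result into the cross-scale framework. The aim is to solve the problem that traditional methods neither make full use of image edge information nor separate the costs of pixels in edge regions effectively.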

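1 Algorithm description

To make better use of the edge spatial information of the image and improve matching accuracy, the cost computation stage fuses an edge weight with the traditional cost computation, combining it with the absolute intensity difference and gradient costs to compute the cost volume of the two input images pixel by pixel. In the cost aggregation stage, the weights of pixels in edge regions modify the regularization term of the traditional guided filter so that it produces different degrees of smoothing. Disparity is then computed with the winner-take-all (WTA) strategy: the point with the smallest cumulative cost within the disparity range is taken as the matching point and its disparity as the result, which gives the initial disparity map. Disparity refinement runs a left-right consistency check on the disparity maps generated with the left and the right image as reference to find occluded, noisy, and mismatched points, and then applies interpolation, weighted median filtering, and sub-pixel enhancement to obtain the refined disparity map.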

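1.1 Cost computation

Cost computation gives the cost of every point of the reference image ${\boldsymbol{I}}$ against its candidate corresponding points in the target image ${{\boldsymbol{I}}^\prime }$ at every possible disparity. It can be written as a function $f:{\mathbb{R}^{W \times H \times 3}} \times {\mathbb{R}^{W \times H \times 3}} \to {\mathbb{R}^{W \times H \times D}}$ whose output, the cost volume, is stored in a 3D array, where $W$, $H$, and 3 are the width, height, and number of colour channels of the input images and $D$ is the disparity search range. The matching cost computation for images ${\boldsymbol{I}}$ and ${{\boldsymbol{I}}^\prime }$ is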

$ \mathit{\boldsymbol{C}} = f(\mathit{\boldsymbol{I}}, {\mathit{\boldsymbol{I}}^\prime }) $

(1)

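where $\mathit{\boldsymbol{C}} \in {\mathbb{R}^{W \times H \times D}}$ is the cost volume of the pixels at all possible disparities.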

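Edge spatial information is one of the important features of an image, yet traditional cost computation methods make no attempt to give pixels in edge regions better discrimination. Cost computation that combines the absolute intensity difference with the gradient uses the colour information of the R, G, and B channels, is fast, and improves matching accuracy in regions with disparity discontinuity to some extent, but it still does not bring out the cost differences between pixels at edges and textures. Based on this analysis, this paper proposes an edge-preserving cost computation method and combines it with the absolute-difference-plus-gradient method to build the cost function.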

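To obtain the edge spatial information, the left and right images are first mean filtered to weaken the influence of granular noise and fine texture on the computed cost. Canny edge detection is then applied to both views, giving the initial edge texture image ${{\boldsymbol{I}}^{{\rm{grad}}}}$. Pixels in edge regions of this image are assigned the value ${n_e}$ and pixels in non-edge (smooth) regions the value 1, so the resulting binary matrix ${\boldsymbol{V}}$ keeps the spatial positions of the main edge regions. The value of ${\boldsymbol{V}}$ at $(x, y)$ is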

$ {V_{x, y}} = \left\{ {\begin{array}{*{20}{l}} {{n_e}}&{I_{x, y}^{{\rm{grad}}} > 0}\\ 1&{{\rm{otherwise}}} \end{array}} \right. $

(2)

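where ${V_{x, y}}$ and $I_{x, y}^{{\rm{grad}}}$ are the values of the matrix ${\boldsymbol{V}}$ and the edge texture image ${{\boldsymbol{I}}^{{\rm{grad}}}}$ at $(x, y)$.

Pixels in different edge and texture structures generally have different numbers of edge and non-edge points in their neighbourhoods. To quantify this spatial position information, the binary matrix ${\boldsymbol{V}}$ is mean filtered with a window of side length $n$ to give the intermediate weight matrix ${\boldsymbol{F}}$. Fig. 1 visualizes ${\boldsymbol{V}}$ and ${\boldsymbol{F}}$ for the teddy image of the Middlebury dataset (Scharstein and Szeliski, 2002). In Fig. 1(b) the filter window length is 7. Points of ${\boldsymbol{F}}$ with large values appear orange-red, which indicates that their filter windows contain many edge pixels, and points with small values appear dark blue.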

Fig. 1 Diagrams of binary matrix V and intermediate weight matrix F ((a) binary matrix V; (b) intermediate weight matrix F)

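To widen the differences between the quantized spatial position values in the intermediate weight matrix ${\boldsymbol{F}}$ and make the weights more distinct, the initial filtering result is normalized with two parameters to give the weight matrix ${\boldsymbol{G}}$ that enters the cost: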

$ {G_{x, y}} = \frac{{\frac{1}{{{n^2}}}\sum\limits_{k = - \frac{{n - 1}}{2}}^{\frac{{n - 1}}{2}} {\sum\limits_{h = - \frac{{n - 1}}{2}}^{\frac{{n - 1}}{2}} {\left( {{V_{x + k, y + h}} - {F_{{\rm{min}}}}} \right)} } }}{{{F_{{\rm{max}}}} - {F_{{\rm{min}}}}}} \times ({s_{{\rm{max}}}} - {s_{{\rm{min}}}}) + {s_{{\rm{min}}}} $

(3)

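where ${F_{{\rm{max}}}}$ and ${F_{{\rm{min}}}}$ are the maximum and minimum values of ${\boldsymbol{F}}$, ${G_{x, y}}$ is the value of the weight matrix ${\boldsymbol{G}}$ at $(x, y)$, and ${s_{{\rm{max}}}}$ and ${s_{{\rm{min}}}}$ are the normalization parameters.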
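The construction in Eqs. (2) and (3) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: it assumes a Canny edge map is already available as an array, it replicates border pixels when filtering (the paper does not state a border rule), and the default `s_max=1.0` is a placeholder because Table 1 lists only `s_min`. The function names are hypothetical.

```python
import numpy as np

def mean_filter(a, n):
    """n x n mean filter (n odd) with edge-replicated borders (an assumption)."""
    r = (n - 1) // 2
    p = np.pad(a.astype(np.float64), r, mode="edge")
    out = np.zeros(a.shape, dtype=np.float64)
    for dy in range(n):
        for dx in range(n):
            out += p[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out / (n * n)

def edge_weight(edge_map, n_e=7.0, n=7, s_min=0.0005, s_max=1.0):
    """Edge map -> V (Eq. 2) -> mean-filtered F -> rescaled G (Eq. 3)."""
    V = np.where(edge_map > 0, n_e, 1.0)     # n_e on edges, 1 elsewhere
    F = mean_filter(V, n)                    # intermediate weight matrix
    span = F.max() - F.min()
    if span == 0:                            # image without edges: flat weights
        return np.full(F.shape, s_min)
    return (F - F.min()) / span * (s_max - s_min) + s_min
```

Pixels whose window contains many edge points end up near `s_max`, and pixels far from any edge end up at `s_min`.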

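The traditional combination of absolute intensity difference and gradient builds the cost function from the absolute differences of the three colour channels of the left and right images and the gradient in the $x$ direction: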

$ {C_1}(i, d) = {\partial _1} \cdot {\rm{min}} [\left\| {I(i) - {I^\prime }({i_d})} \right\|, {\tau _c}] $

(4)

$ {C_2}(i, d) = {\partial _2} \cdot {\rm{min}}[\left\| {{\nabla _x}I(i) - {\nabla _x}{I^\prime }({i_d})} \right\|, {\tau _g}] $

(5)

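where ${C_1}(i, d)$ and ${C_2}(i, d)$ are the matching costs of pixel $i$ at disparity $d$ for the absolute intensity difference and the gradient, respectively; $I(i)$ is the colour vector of pixel $i$ in the reference view; ${{\nabla _x}I(i)}$ is the gradient of pixel $i$ along the $x$ direction in the reference view; and ${{i_d}}$ is the point corresponding to $i$ at disparity $d$. ${\partial _1}$ and ${\partial _2}$ balance the influence of the absolute intensity difference and the gradient on the total cost, and ${\tau _c}$ and ${\tau _g}$ are truncation thresholds that weaken the influence of outlier costs on the result.

Compared with the other candidate pixels in the disparity range, the true corresponding point in the target view has an edge weight closer to that of the pixel to be matched in the reference view, and it therefore obtains a smaller cost. An edge weight cost is built from the edge weight matrix and fused with the absolute intensity difference and the gradient to form the new cost function: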

$ {C_3}(i, d) = {\partial _3} \cdot \left\| {G(i) - {G^\prime }({i_d})} \right\| $

(6)

$ C(i, d) = {C_1}(i, d) + {C_2}(i, d) + {C_3}(i, d) $

(7)

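where ${C_3}(i, d)$ is the edge weight matching cost, $C(i, d)$ is the total cost of pixel $i$ at disparity $d$, and ${\partial _3}$ is the weight coefficient that adjusts the influence of the edge weight cost on the total cost.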
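As a rough illustration of Eqs. (4) to (7), the sketch below builds a cost volume for two colour images scaled to [0, 1]. Several details are assumptions rather than statements from the paper: the norm is taken as the mean absolute difference over channels, the gradient is a central difference on the channel mean, and columns that would fall outside the target view are clamped to the border. The coefficient defaults follow Table 1, and the function names are hypothetical.

```python
import numpy as np

def shift_right_view(a, d):
    """Sample the target view at column x - d, clamped at the left border (assumption)."""
    w = a.shape[1]
    cols = np.clip(np.arange(w) - d, 0, w - 1)
    return a[:, cols]

def cost_volume(I, I2, G, G2, D, a1=0.11, a2=0.89, a3=0.0015,
                tau_c=0.02745, tau_g=0.00784):
    """I, I2: (H, W, 3) reference and target images; G, G2: (H, W) edge weights."""
    gray, gray2 = I.mean(axis=2), I2.mean(axis=2)
    gx, gx2 = np.gradient(gray, axis=1), np.gradient(gray2, axis=1)
    C = np.empty(I.shape[:2] + (D,), dtype=np.float64)
    for d in range(D):
        c1 = np.abs(I - shift_right_view(I2, d)).mean(axis=2)   # colour term, Eq. (4)
        c2 = np.abs(gx - shift_right_view(gx2, d))              # gradient term, Eq. (5)
        c3 = np.abs(G - shift_right_view(G2, d))                # edge weight term, Eq. (6)
        C[:, :, d] = (a1 * np.minimum(c1, tau_c)
                      + a2 * np.minimum(c2, tau_g)
                      + a3 * c3)                                # total cost, Eq. (7)
    return C
```

Only the first two terms are truncated, as in Eqs. (4) and (5); the edge weight term enters untruncated, as in Eq. (6).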

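A point $i$ is selected in the teddy reference view, and its costs under the original cost function ${C_1}(i, d)$ + ${C_2}(i, d)$, the new cost function $C(i, d)$, and ${C_3}(i, d)$ are plotted as line charts in Fig. 2. Compared with the other points in the disparity search range, $i$ and its true corresponding point $i'$ have more similar weight values, so the improved algorithm raises the cost at the originally mismatched point and the correct corresponding point (red dashed line) is selected.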

Fig. 2 The effect of improved cost computation on cost values ((a) point i in left view; (b) point i in G; (c) point i′ in G′; (d) statistics of cost value in disparity level)

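To further verify that cost computation fused with edge spatial information discriminates well between pixels at edges, the traditional cost computation and the proposed one are compared on the teddy image. The experiments use the same cost aggregation and no disparity refinement, and Fig. 3 shows the results. In the edge and texture regions (red and blue boxes), the proposed algorithm suppresses matching errors and outperforms the traditional cost computation.

Fig. 3 Initial disparity maps obtained by different cost computation algorithms ((a) traditional cost computation algorithms; (b) cost computation based on edge weight)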

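1.2 Cost aggregation

In traditional guided-filter cost aggregation, the reference image ${\boldsymbol{I}}$ serves as the guidance image, and the aggregated cost matrix ${\boldsymbol{q}}$ is defined as a linear model of ${\boldsymbol{I}}$ (Xie et al., 2016):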

$ {q_i} = {a_k}{I_i} + {b_k}, \forall i \in {\mathit{\boldsymbol{\omega }}_k} $

(8)

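where ${q_i}$ and ${I_i}$ are the values of the cost matrix ${\boldsymbol{q}}$ and the reference image ${\boldsymbol{I}}$ at pixel $i$, and ${a_k}$ and ${b_k}$ are the linear coefficients in the local window ${\mathit{\boldsymbol{\omega }}_k}$ of radius $r$ centred at pixel $k$: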

$ {a_k} = \left({\frac{1}{{|\mathit{\boldsymbol{\omega }}|}}\sum\limits_{i \in {\mathit{\boldsymbol{\omega }}_k}} {{I_i}{p_i} - {\mu _k}\overline {{p_k}} } } \right)/(\sigma _k^2 + \varepsilon) $

(9)

$ {b_k} = \overline {{p_k}} - {a_k}{\mu _k} $

(10)

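where ${\mu _k}$ and $\sigma _k^2$ are the mean and variance of the reference image ${\boldsymbol{I}}$ in the local window ${\mathit{\boldsymbol{\omega }}_k}$, $|\mathit{\boldsymbol{\omega }}|$ is the number of pixels in the window, ${p_i}$ is the value of the initial cost matrix ${\boldsymbol{p}}$ at $i$, $\overline {{p_k}} $ is the mean of ${\boldsymbol{p}}$ in ${\mathit{\boldsymbol{\omega }}_k}$, and $\varepsilon $ is the regularization parameter that penalizes large ${a_k}$.

Differentiating both sides of Eq. (8) gives $\nabla {q_i} = {a_k}\nabla {I_i}$, which shows that the linear model lets the filtered cost keep and reflect the edges of the reference image ${\boldsymbol{I}}$, with ${a_k}$ determining how strongly gradients, that is edges, are preserved. The traditional guided filter uses a fixed regularization term $\varepsilon $, so the constraint on ${a_k}$, and hence the smoothing strength, is the same in edge regions and in flat regions. To improve matching accuracy in edge and texture regions, this paper uses a guided filter whose regularization term $\varepsilon $ is modified by the edge weight: pixels closer to an edge receive a larger smoothing multiplier and thereby a smaller cost. The linear ridge regression model $E$ of the improved guided filter (He et al., 2013) is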

$ E = \sum\limits_{i \in {\mathit{\boldsymbol{\omega }}_k}} {\left( {{{(a_k^\prime {I_i} + b_k^\prime - {p_i})}^2} + \varepsilon G(i)a_k^{\prime 2}} \right)} $

(11)
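For reference, the minimizer of Eq. (11) can be worked out in the same way as Eqs. (9) and (10). The derivation below reads Eq. (11) literally, with $G(i)$ summed over the window; the window mean $\overline{G_k}$ is a symbol introduced here for illustration and does not appear in the paper, which does not state the closed form and uses the per-pixel value $G(i^s)$ in the kernel of Eq. (14).

```latex
\begin{aligned}
\frac{\partial E}{\partial b_k^\prime} = 0
  &\;\Rightarrow\; b_k^\prime = \overline{p_k} - a_k^\prime \mu_k, \\
\frac{\partial E}{\partial a_k^\prime} = 0
  &\;\Rightarrow\; a_k^\prime =
  \frac{\frac{1}{|\boldsymbol{\omega}|}\sum_{i \in \boldsymbol{\omega}_k} I_i p_i - \mu_k \overline{p_k}}
       {\sigma_k^2 + \varepsilon\,\overline{G_k}},
  \qquad
  \overline{G_k} = \frac{1}{|\boldsymbol{\omega}|}\sum_{i \in \boldsymbol{\omega}_k} G(i).
\end{aligned}
```

Compared with Eq. (9), the only change is that $\varepsilon$ is scaled by the edge weight, so windows with larger edge weights have a smaller $a_k^\prime$ and are smoothed more strongly.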

Solving Eq. (11) gives the improved linear coefficients $a_k^\prime $ and $b_k^\prime $, and from them the cost after the improved filtering, ${C^\prime }(i, d)$:

$ {C^\prime }(i, d) = \frac{1}{{|\mathit{\boldsymbol{\omega }}|}}\sum\limits_{i \in {\mathit{\boldsymbol{\omega }}_k}} {[a_k^\prime C(i, d) + b_k^\prime ]} $

(12)
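A single-scale version of this filter can be sketched as follows for one disparity slice of the cost volume. The sketch makes several assumptions that the paper does not spell out: a grey-level guidance image, the window mean of the edge weight as the scale of the regularizer, edge-replicated borders, and the usual guided-filter output step that averages the coefficients of all windows covering a pixel. The names `box_mean` and `edge_weighted_guided_filter` are hypothetical.

```python
import numpy as np

def box_mean(a, r):
    """Mean over a (2r+1) x (2r+1) window with edge-replicated borders."""
    p = np.pad(a, r, mode="edge")
    out = np.zeros(a.shape, dtype=np.float64)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += p[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out / (2 * r + 1) ** 2

def edge_weighted_guided_filter(I, p, G, r=2, eps=1e-4):
    """Filter cost slice p with float guide I; edge weight G > 0 scales eps."""
    mu, p_bar = box_mean(I, r), box_mean(p, r)
    var = box_mean(I * I, r) - mu * mu        # window variance of the guide
    cov = box_mean(I * p, r) - mu * p_bar     # window covariance of guide and cost
    a = cov / (var + eps * box_mean(G, r))    # larger G -> smaller a -> more smoothing
    b = p_bar - a * mu
    return box_mean(a, r) * I + box_mean(b, r)
```

Applying the function to each slice `C[:, :, d]` of the cost volume gives an aggregated volume of the same shape.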

The improved guided filter is applied at each scale and integrated into the cross-scale aggregation model of Zhang et al. (2014):

$ \mathit{\boldsymbol{\hat v}} = \arg \mathop {\min }\limits_{\left\{ {{z^s}} \right\}_{s = 0}^S} \left( {\sum\limits_{s = 0}^S {\frac{1}{{Z_{{i^s}}^s}}} \sum\limits_{{j^s} \in {\mathit{\boldsymbol{N}}_{{i^s}}}} K \left( {{i^s}, {j^s}} \right){{\left\| {{z^s} - {\mathit{\boldsymbol{C}}^s}\left( {{j^s}, {d^s}} \right)} \right\|}^2} + \lambda \sum\limits_{s = 1}^S {{{\left\| {{z^s} - {z^{s - 1}}} \right\|}^2}} } \right) $

(13)

where $s \in \{ 0, 1, \ldots, S\} $ indexes the scales and $S = 4$ in this paper; ${{j^s}}$ is a pixel in the neighbourhood ${{{\boldsymbol{N}}_{{i^s}}}}$ of the pixel ${{i^s}}$ being filtered at scale $s$; $Z_{{i^s}}^s = \sum\limits_{{j^s} \in {{\boldsymbol{N}}_{{i^s}}}} K \left( {{i^s}, {j^s}} \right)$ is a normalization constant; ${{{\boldsymbol{C}}^s}\left( {{j^s}, {d^s}} \right)}$ is the total cost of the neighbouring pixel ${{j^s}}$ at disparity ${{d^s}}$ at scale $s$; $\lambda $ is the regularization constant of the inter-scale constraint, set to 0.3 following Zhang et al. (2014); and $\mathit{\boldsymbol{\hat v}}$ is the vector of the ${{z^s}}$, that is, the filtered results of ${{i^s}}$ at disparity ${{d^s}}$ at each scale after the inter-scale constraint.

$K\left( {{i^s}, {j^s}} \right)$ is the similarity kernel of the improved guided filter at scale $s$:

$ K\left( {{i^s}, {j^s}} \right) = \sum\limits_{{k^s}:\left( {{i^s}, {j^s}} \right) \in {\mathit{\boldsymbol{\omega }}_{{k^s}}}} {\left( {1 + \frac{{\left( {{I_{{i^s}}} - {\mu _{{k^s}}}} \right)\left( {{I_{{j^s}}} - {\mu _{{k^s}}}} \right)}}{{\sigma _{{k^s}}^2 + \varepsilon G\left( {{i^s}} \right)}}} \right)} $

(14)

where ${G\left( {{i^s}} \right)}$ is the edge weight obtained at scale $s$. The final aggregated cost over the scales can be written as

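$ \mathit{\boldsymbol{\hat C}}(i, d) = \sum\limits_{s = 0}^S {{\mathit{\boldsymbol{A}}^{ - 1}}} (0, s){{\mathit{\boldsymbol{\tilde C}}}^s}({i^s}, {d^s}) $

(15)

where ${\boldsymbol{A}}$ is the tridiagonal constant coefficient matrix that arises when solving Eq. (13), and ${{\mathit{\boldsymbol{\tilde C}}}^s}({i^s}, {d^s})$ is the cost aggregated by the improved guided filter at scale $s$, with $s = 0$ denoting the finest (level 0) scale.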
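The tridiagonal matrix follows from setting the derivative of Eq. (13) with respect to each $z^s$ to zero: the data term contributes $z^s$ minus the filtered cost at that scale, and each adjacent scale contributes $\lambda$ times the difference between the two levels. A minimal NumPy sketch of this step is given below; the function names are hypothetical, and the mapping of a pixel and disparity to their coarser-scale counterparts is assumed to have been done before the call.

```python
import numpy as np

def cross_scale_weights(S=4, lam=0.3):
    """First row of A^{-1} for the (S+1) x (S+1) tridiagonal system of Eq. (13)."""
    n = S + 1
    A = np.zeros((n, n))
    for s in range(n):
        neighbours = (s > 0) + (s < S)       # number of adjacent scales
        A[s, s] = 1.0 + lam * neighbours
        if s > 0:
            A[s, s - 1] = -lam
        if s < S:
            A[s, s + 1] = -lam
    return np.linalg.inv(A)[0]

def combine_scales(costs_per_scale, lam=0.3):
    """costs_per_scale[s]: filtered cost of the matching pixel and disparity at scale s."""
    z = np.asarray(costs_per_scale, dtype=np.float64)
    w = cross_scale_weights(len(z) - 1, lam)
    return np.tensordot(w, z, axes=1)        # weighted sum over the scale axis
```

Each row of the matrix sums to 1, so the weights also sum to 1, and with `lam=0` the result reduces to the cost at the finest scale.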

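1.3 Disparity computation

For the final matching cost $\mathit{\boldsymbol{\hat C}}(i, d)$ obtained from cost aggregation, the WTA strategy searches the disparity range for the point with the smallest cumulative cost as the matching point and takes its disparity as the initial disparity: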

$ {d_i} = {\rm{arg}}\mathop {{\rm{min}}}\limits_{d \in \mathit{\boldsymbol{D}}} [\mathit{\boldsymbol{\hat C}}(i, d)] $

(16)

where $\mathit{\boldsymbol{D}}$ is the disparity range.
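With the aggregated cost stored as an array of shape (H, W, D), Eq. (16) is a per-pixel arg-min. The short sketch below assumes that layout and that disparities are the integers 0 to D-1; the function name is hypothetical.

```python
import numpy as np

def winner_take_all(cost_volume):
    """Per pixel, return the disparity index with the smallest aggregated cost."""
    return np.argmin(cost_volume, axis=2)
```

When several disparities share the minimum cost, `np.argmin` returns the first of them, that is, the smallest disparity.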

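1.4 Disparity refinement

With the left and right images each taken as the reference image, disparity computation gives the left and right disparity maps ${{\boldsymbol{d}}_{\rm{L}}}$ and ${{\boldsymbol{d}}_{\rm{R}}}$, which contain many mismatched points. This paper post-processes the initial disparity maps with the left-right consistency check, hole filling, and weighted median filtering. First, the two disparity maps are checked for left-right consistency: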

$ |{\mathit{\boldsymbol{d}}_{\rm{L}}}(i) - {\mathit{\boldsymbol{d}}_{\rm{R}}}[i - {\mathit{\boldsymbol{d}}_{\rm{L}}}(i)]| < 1 $

(17)

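where ${\mathit{\boldsymbol{d}}_{\rm{L}}}(i)$ is the disparity at point $i$ in the left disparity map and ${\mathit{\boldsymbol{d}}_{\rm{R}}}[i - {\mathit{\boldsymbol{d}}_{\rm{L}}}(i)]$ is the disparity of the corresponding point of $i$ in the right disparity map. A point $i$ that does not satisfy Eq. (17) is marked as occluded, and occluded pixels must be given a reasonable disparity. For each occluded point $i$, the first non-occluded points to its left and right along the row, ${i_{\rm{L}}}$ and ${i_{\rm{R}}}$, are found, and $i$ is given the smaller of their two disparities, ${d_i} = \min (d({i_{\rm{L}}}), d({i_{\rm{R}}}))$. The disparity map is then smoothed with a weighted median filter to remove remaining mismatches, which gives the final disparity map.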
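The consistency check of Eq. (17) and the hole filling rule can be sketched as below for integer disparity maps. Two details are assumptions made for the sketch: pixels whose match would fall outside the right image are treated as occluded, and a pixel with a valid neighbour on only one side takes that neighbour's disparity. The weighted median filter is not included, and the function names are hypothetical.

```python
import numpy as np

def lr_check(d_left, d_right, tol=1):
    """True where the left disparity agrees with the right map as in Eq. (17)."""
    h, w = d_left.shape
    xs = np.tile(np.arange(w), (h, 1))
    xr = xs - d_left                          # matching column in the right view
    inside = xr >= 0
    rows = np.arange(h)[:, None]
    d_r = d_right[rows, np.clip(xr, 0, w - 1)]
    return inside & (np.abs(d_left - d_r) < tol)

def fill_occlusions(d_left, valid):
    """Give each invalid pixel the smaller disparity of its nearest valid row neighbours."""
    out = d_left.copy()
    h, w = d_left.shape
    for y in range(h):
        for x in np.flatnonzero(~valid[y]):
            left = [d_left[y, k] for k in range(x - 1, -1, -1) if valid[y, k]][:1]
            right = [d_left[y, k] for k in range(x + 1, w) if valid[y, k]][:1]
            cands = left + right
            if cands:                         # rows without valid pixels stay unchanged
                out[y, x] = min(cands)
    return out
```

Taking the smaller of the two disparities assigns an occluded pixel to the farther surface, which is the usual choice because occluded pixels belong to the background.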

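2 Experimental results and analysis

The test sets provided by the Middlebury stereo matching evaluation platform are used to verify the proposed cost computation and cost aggregation algorithms. Table 1 lists the parameters used in the experiments.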

Table 1 Parameters involved in experiments

Parameter	Value
n_e	7
s_min	0.0005
∂₁	0.11
∂₂	0.89
∂₃	0.0015
τ_c	0.02745
τ_g	0.00784
n	7

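Experiments show that with n_e = 7, together with ∂₃, the weight cost has a suitable magnitude for influencing the original cost, and that n = 7 makes good use of the spatial information of the pixels and gives good matching accuracy.

The truncation thresholds in Eqs. (4) and (5) and the coefficients that balance the two terms are taken from Hosni et al. (2013b). They were fine-tuned for this paper but kept identical across the comparison experiments.

The experiments were run on an Intel(R) Core(TM) i5-7500 CPU with 16 GB of memory. To show the real performance of the proposed algorithm more directly, the disparity maps compared in this paper have not undergone disparity refinement unless otherwise stated. The disparity threshold for the error matching rate is 1: a pixel is counted as mismatched when the computed disparity differs from the ground-truth disparity by more than 1 pixel.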
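The error matching rate described above can be computed as in the following sketch. The optional `mask` argument, which selects the evaluated region (for example the non-occluded pixels), and the function name are assumptions for illustration and do not correspond to a specific Middlebury tool.

```python
import numpy as np

def error_rate(disp, gt, mask=None, thresh=1.0):
    """Percentage of evaluated pixels whose disparity error exceeds thresh."""
    bad = np.abs(disp.astype(np.float64) - gt) > thresh
    if mask is None:
        mask = np.ones(bad.shape, dtype=bool)
    return 100.0 * bad[mask].mean()
```

A pixel that is off by exactly 1 is not counted as mismatched, which matches the "more than 1 pixel" rule in the text.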

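2.1 Verification of matching cost computation

To verify the proposed cost computation, it is compared with three matching cost methods, GRD (AD+Gradient) (Hosni et al., 2013b), CG (AD+Census+Gradient) (Zhu and Yan, 2017), and Cen (Census transform) (Zabih and Woodfill, 1994), on the four standard image pairs tsukuba, venus, teddy, and cones of the Middlebury 2.0 dataset. All experiments use the same cost aggregation method, and the error matching rates in non-occluded regions (nonocc), all regions (all), and regions with disparity discontinuity (disc) are compared in Table 2. The average is the mean of the error matching rates of the four images over the three regions, and the results have not undergone disparity refinement. Table 2 shows that the proposed cost computation has an average error matching rate of 6.62% without disparity refinement and that its error matching rates in all three regions of tsukuba, venus, teddy, and cones are lower than those of the other cost computation algorithms. Fig. 4 shows the disparity maps produced by GRD, CG, Cen, and the proposed algorithm on venus and teddy, with mismatched points marked in red. In the disparity maps of the proposed algorithm, the mismatch marks at the edges and textures of the props in venus and teddy are fewer and more tightly confined than with the other algorithms. The results after fusing the edge weight cost reflect the true disparity of the pixels more clearly.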

Table 2 Error matching rates of different cost computation algorithms in different regions (%)

Algorithm	tsukuba			venus			teddy			cones			Average
	nonocc	all	disc	nonocc	all	disc	nonocc	all	disc	nonocc	all	disc	
GRD	2.31	2.93	9.35	1.31	2.29	12.12	6.99	14.90	17.97	3.13	11.10	8.76	7.76
CG	2.69	3.61	11.01	2.35	3.72	22.86	8.36	17.39	23.37	3.84	13.11	10.94	10.27
Cen	3.32	3.86	14.11	1.90	3.43	18.74	8.57	17.55	24.12	4.94	14.29	13.91	10.73
Ours	2.13	2.62	8.90	0.62	1.33	7.03	6.18	12.88	16.47	3.11	9.68	8.46	6.62
Note: bold indicates the best result in each column.

Fig. 4 Error maps of different cost computation algorithms ((a) ground truth; (b) GRD; (c) CG; (d) Cen; (e) ours)

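2.2 Verification of matching cost aggregation

To verify the proposed cost aggregation, it is compared on the 21 image pairs of the Middlebury 2006 stereo dataset with four classic cost aggregation algorithms: the guided filter (GF) (Hosni et al., 2013b), the bilateral filter (BF) (Yoon and Kweon, 2006), the minimum spanning tree (MST) (Yang, 2012), and the segment-tree (ST) (Mei et al., 2013). Performance is evaluated by the error matching rate in non-occluded regions (nonocc), and Table 3 gives the results. CSGF (cross-scale guided filter) is the guided filter in the cross-scale framework (Zhang et al., 2014), GF (single) is the improved filter at a single scale, the average is the mean error matching rate over the 21 image pairs, and the results have not undergone disparity refinement. Table 3 shows that the proposed algorithm has an average error matching rate of 9.18% over the 21 stereo pairs, lower than those of the listed comparison algorithms. On baby1, baby2, baby3, bowling1, bowling2, flowerpots, monopoly, and wood1, the improved GF at a single scale and across scales applies different smoothing multipliers to different regions and thus influences the pixel costs more effectively, so the error matching rates in non-occluded regions are clearly lower than before the improvement. On the weakly textured images midd1 and midd2, the effect of the edge weight on the regularization term of the guided filter is not enough to separate the overly similar costs of mismatched pixels. The non-local aggregation algorithms MST and ST perform well here and outperform the proposed algorithm.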

Table 3 Error matching rates of different cost aggregation algorithms in non-occluded regions (%)

Image	GF	BF	MST	ST	CSGF	GF (single)	Ours
aloe	5.38	7.45	5.4	5.63	5.39	5.31	5.32
baby1	4.69	5.3	9.08	5.46	4.07	3.39	3.37
baby2	4.56	4.82	14.47	16.07	3.3	3.58	2.93
baby3	4.42	4.93	6.06	4.66	4.07	3.69	3.68
bowling1	12.48	14.18	20.12	19.64	10.21	12.27	10.06
bowling2	5.61	7.35	10.64	11.39	5.32	5.23	5.06
cloth1	1.15	3.09	1.02	1.22	1.19	0.99	1.13
cloth2	3.51	6.44	4.69	4.79	3.43	3.48	3.42
cloth3	2.22	3.78	3.4	3.69	2.24	2.19	2.2
cloth4	1.55	3.03	2.47	2.54	1.59	1.47	1.59
flowerpots	9.47	10.63	17.27	12.92	8.71	9.24	8.64
lampshade1	10.3	10.61	10.4	9.63	7.8	10.19	7.73
lampshade2	17.53	18.38	13.98	11.22	14.21	17.24	14.2
midd1	37.89	38.57	24.84	24.48	34.99	37.89	35.04
midd2	34.03	34.75	21.9	21.1	29.3	33.98	29.13
monopoly	26.35	27.69	22.97	22.42	24.12	24.61	23.07
plastic	34.03	33.35	45.37	38.96	27.09	33.3	26.35
rocks1	3.55	6.2	3.66	3.54	3.25	3.47	3.18
rocks2	1.57	3.41	2.33	2.2	1.55	1.5	1.5
wood1	4.06	7.2	11.03	5.72	3.71	3.73	3.52
wood2	1.84	2.58	3.89	5.8	1.79	1.8	1.79
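Average	10.77	12.08	12.12	11.1	9.4	10.41	9.18
Note: bold indicates the best result in each row.

Fig. 5 shows the results for two images of the Middlebury 2006 dataset, baby1 and baby3. The proposed algorithm has clearly fewer mismatched points at edges and textures such as disparity discontinuities, and because it fuses image information from different scales, matching accuracy in weakly textured smooth regions also improves.

Fig. 5 Error maps of different cost aggregation algorithms ((a) GF; (b) BF; (c) MST; (d) ST; (e) CSGF; (f) GF (single); (g) ours)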

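2.3 Algorithm comparison

To test overall performance, the proposed algorithm is evaluated on the 21 image pairs of the Middlebury 2006 dataset, using the error matching rates in non-occluded regions and in all regions and the peak signal-to-noise ratio (PSNR). For a fair picture of its effect, it is compared with four algorithms in the same cross-scale cost aggregation framework: GRD-CSGF (cross-scale GF), Cen-CSST (cross-scale ST), CG-CSMST (cross-scale MST), and GRD-CSBF (cross-scale BF). Tables 4 to 6 give the results. The average is the mean over the 21 image pairs, the results have not undergone disparity refinement, and the disparity threshold is 1.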

Table 4 Error matching rates of different stereo matching algorithms in non-occluded regions (%)

Image	GRD-CSGF	Cen-CSST	CG-CSMST	GRD-CSBF	Ours
aloe	5.39	6.59	4.77	7.15	5.22
baby1	4.07	4.80	7.72	4.61	2.96
baby2	3.30	14.41	14.58	3.77	2.40
baby3	4.07	5.70	6.83	4.51	3.03
bowling1	10.21	17.97	17.68	12.37	7.53
bowling2	5.32	14.06	11.33	6.95	4.56
cloth1	1.19	1.16	0.75	3.13	1.06
cloth2	3.43	5.34	4.18	6.47	3.17
cloth3	2.24	3.06	2.16	3.67	2.25
cloth4	1.59	2.18	1.70	2.85	1.85
flowerpots	8.71	14.21	17.51	9.84	8.15
lampshade1	7.80	10.10	11.9	8.8	7.28
lampshade2	14.21	11.50	12.58	14.99	13.89
midd1	34.99	19.39	25.42	34.3	34.51
midd2	29.3	15.96	21.64	29.15	28.88
monopoly	24.12	16.30	21.13	24.78	20.94
plastic	27.09	19.95	35.0	26.22	23.86
rocks1	3.25	3.93	3.63	6.02	2.56
rocks2	1.55	3.14	2.87	3.27	1.51
wood1	3.71	6.33	11.49	6.67	3.20
wood2	1.79	6.33	4.69	2.38	0.84
Average	9.40	9.64	11.41	10.57	8.55
Note: bold indicates the best value in each row.

Table 5 Error matching rates of different stereo matching algorithms in all regions (%)

Image	GRD-CSGF	Cen-CSST	CG-CSMST	GRD-CSBF	Ours
aloe	12.15	17.97	15.56	13.56	11.25
baby1	11.19	12.49	15.29	10.97	7.14
baby2	10.25	23.31	22.86	10.31	7.15
baby3	16.5	19.08	19.53	16.85	11.51
bowling1	23.04	32.03	30.52	24.59	18.2
bowling2	17.75	25.91	24.54	19.24	15.24
cloth1	10.22	11	10.39	12.4	5.45
cloth2	16.26	18.89	17.86	19.61	14.09
cloth3	10.87	13.1	11.55	12.5	6.9
cloth4	14.97	16.96	16.53	16.25	13.41
flowerpots	20.45	26.94	29.54	21.28	16.47
lampshade1	24.23	20.39	23.47	20.85	20.59
lampshade2	26.31	23	25.76	26.7	22.89
midd1	41.79	28.16	33.15	41.04	40.73
midd2	36.33	24.43	29.52	36.27	35.55
monopoly	31.7	24.46	29.06	32.27	28.35
plastic	38.69	33.12	46.19	37.37	33.17
rocks1	11.98	13.57	13.13	14.51	8.01
rocks2	11.63	13.95	13.43	13.3	6.01
wood1	16.62	17.28	24.28	19.92	13.16
wood2	14.27	19.33	16.2	14.78	8.9
Average	19.87	20.73	22.3	20.69	16.39
Note: bold indicates the best value in each row.

Table 6 PSNR of different stereo matching algorithms (dB)

Image	GRD-CSGF	Cen-CSST	CG-CSMST	GRD-CSBF	Ours
aloe	20.66	18.91	19.33	20.71	20.37
baby1	23.9	23.96	23.74	24.11	26.48
baby2	24.44	24.22	25.1	24.54	24.43
baby3	15.19	13.95	14.37	15.06	18.82
bowling1	14.09	14.23	13.94	14.16	15.32
bowling2	14.72	13.92	14.98	14.6	17.56
cloth1	18.35	17.46	17.82	18.84	37.79
cloth2	16.33	16	16.44	16.37	20.89
cloth3	20.27	18.58	19.83	20.29	25.61
cloth4	15.55	14.66	14.85	15.33	22.66
flowerpots	13.61	12.89	13.58	13.59	15.38
lampshade1	17.31	17	16.85	17.17	18.8
lampshade2	15.29	16.12	16.47	15.14	17.29
midd1	16.27	16.37	16.25	16.42	17.23
midd2	16.99	18.88	18.73	17.08	17.4
monopoly	15.93	19.07	17.96	15.89	16.09
plastic	13.44	16.16	14.55	13.47	15.57
rocks1	17.17	16.44	16.57	17.08	21.56
rocks2	16.37	15.39	16.12	16.42	20.69
wood1	15.56	16.04	14.88	15.56	17.15
wood2	13.7	13.48	12.76	13.46	22.96
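Average	16.91	16.85	16.91	16.92	20.48
Note: bold indicates the best value in each row.

Over the 21 stereo image pairs, the proposed algorithm has average error matching rates of 8.55% in non-occluded regions and 16.39% in all regions, the lowest among the compared algorithms. Its disparities deviate least from the ground-truth disparity maps and its matching accuracy is the best. Compared with the other algorithms in the same cross-scale framework, the average PSNR is 3.56 to 3.63 dB higher. On baby1, cloth1, cloth3, and rocks2 the all-region error matching rates of the proposed algorithm are 7.14%, 5.45%, 6.90%, and 6.01%, clearly lower than those of the other algorithms. On baby3, bowling1, and wood2 the non-occluded error matching rates fall by 1.04, 2.68, and 0.95 percentage points from the 4.07%, 10.21%, and 1.79% of the algorithm before improvement, which demonstrates the effectiveness of the proposed algorithm.

Fig. 6 shows the error maps obtained by the different algorithms for five image pairs: baby2, baby3, bowling1, rocks2, and wood2. To show the matching result intuitively, Fig. 6(g) renders the disparity maps of the proposed algorithm by pixel intensity with the cvkit tool provided by the Middlebury Stereo platform. In the same cross-scale framework, the proposed algorithm has the fewest red marks at edges and textures and the lowest error matching rate. In Fig. 6(g) the gradient from cool to warm colours represents positions in the scene from far to near, and the disparity maps of the proposed algorithm reflect the depth of the objects fairly faithfully.

Fig. 6 Error maps of different stereo matching algorithms under cross-scale framework ((a) left images; (b) GRD-CSGF; (c) Cen-CSST; (d) CG-CSMST; (e) GRD-CSBF; (f) ours; (g) color renderings for depth information of the proposed algorithm)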

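To further verify overall performance, the proposed algorithm is compared with seven algorithms on the tsukuba, venus, teddy, and cones images of the Middlebury dataset: SMPF (stereo matching based on particle filters) (Ploumpis et al., 2015), Ada_SGM (adaptive semi-global stereo matching) (Huang and Zhao, 2019), AdaptAggrDP (adaptive aggregation dynamic programming) (Wang et al., 2014a), LEBP (local edge and belief propagation) (He and Da, 2011), LCVB-DEM (luminance colour viewpoint and border enhanced disparity energy model) (Martins et al., 2015), RINCensus (related information of neighborhood census transform) (Ma et al., 2014), and TwoStep (Wang et al., 2014b). The algorithms are compared by their error rates in non-occluded regions (nonocc), all regions (all), and regions with disparity discontinuity (disc), and all disparity maps have undergone disparity refinement. Table 7 gives the results, where the average is the mean of the error matching rates of the four images over the three regions. The proposed algorithm has an error matching rate of only 0.18% in the non-occluded region of venus and 0.39% in all regions, and its error matching rates in all three regions of teddy and cones are the lowest among the compared algorithms. Over the different regions of the four images, its average error matching rate is 5.77%, a good result. Fig. 7 shows the results of the proposed algorithm on tsukuba, venus, teddy, and cones.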

Table 7 Error matching rates of different stereo matching algorithms in different regions (%)

Algorithm	tsukuba			venus			teddy			cones			Average
	nonocc	all	disc	nonocc	all	disc	nonocc	all	disc	nonocc	all	disc	
SMPF	0.98	1.53	5.31	0.25	0.69	2.60	9.93	14.5	22.6	6.51	13.1	14.8	7.73
Ada_SGM	10.30	10.54	14.24	4.09	4.23	23.46	15.68	19.88	16.62	14.13	18.31	15.24	13.89
AdaptAggrDP	1.57	3.50	8.27	1.53	2.69	12.4	6.79	14.30	16.2	5.53	13.2	14.8	8.40
LEBP	1.85	3.60	8.11	1.73	2.53	10.9	12.0	18.4	22.0	5.79	13.4	13.8	9.51
LCVB-DEM	4.49	5.23	21.3	1.32	1.67	11.5	9.99	16.3	26.1	6.56	13.6	18.2	11.40
RINCensus	4.78	6.00	14.4	1.11	1.76	7.91	9.76	17.3	26.1	8.09	16.2	17.6	10.90
TwoStep	2.91	3.68	13.3	0.27	0.45	2.63	7.42	12.6	18.0	4.09	10.1	10.3	7.14
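Ours	2.03	2.40	9.28	0.18	0.39	2.56	5.85	11.31	15.71	2.94	8.56	7.99	5.77
Note: bold indicates the best result in each column.

Fig. 7 Experimental results of the proposed algorithm for the 4 standard images of the Middlebury dataset ((a) left images; (b) ground truth; (c) results of the proposed algorithm; (d) error maps of the proposed algorithm; (e) color renderings for depth information of the proposed algorithm)

For running efficiency, Table 8 lists the average running time of the five algorithms of Table 4 on the 21 image pairs of the Middlebury dataset. ST works on a principle similar to MST: both organize all image points hierarchically during cost aggregation, which greatly reduces computation and makes them fast. The computational complexity of the guided filter is independent of kernel size, so it is far more efficient than the bilateral filter. Compared with the algorithm before improvement, the proposed algorithm increases the average running time by only 0.73 s (17.74 s against 17.01 s) and still maintains good real-time performance. Fig. 8 shows the effect of some of the parameters on the results.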

Table 8 Average running time of different algorithms on the 21 image pairs of the Middlebury dataset

Algorithm	Average running time/s
GRD-CSGF	17.01
Cen-CSST	1.16
CG-CSMST	2.12
GRD-CSBF	243.44
Ours	17.74
Note: bold indicates the best result.

Fig. 8 Experimental results on different parameter settings ((a) impact of ε on error matching rate; (b) impact of s_max on error matching rate)

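3 Conclusion

Traditional stereo matching algorithms do not make full use of the edge and texture information of images and cannot separate the costs of pixels in edge regions effectively. To address this, this paper builds a weight matrix from the edge spatial information of the image and combines it with the absolute intensity difference and the gradient in a new cost function. In the cost aggregation stage, the weights at edges modify the regularization term of the guided filter to give different smoothing multipliers. Together these form the proposed stereo matching algorithm that combines edge preservation with improved cost aggregation. Tests on the Middlebury dataset and comparison with other mainstream algorithms show that the proposed algorithm widens the separation between the costs of pixels at edges, clearly lowers the error matching rate in non-occluded regions and regions with disparity discontinuity, and improves matching accuracy, which demonstrates its good edge-preserving behaviour. The proposed algorithm makes full use of edge spatial information. Future work will study how to extract more discriminative pixel features for the different edge and texture characteristics of images and how to introduce adaptive strategies.

References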

Felzenszwalb P F and Huttenlocher D P. 2004. Efficient belief propagation for early vision//Proceedings of 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE: 261-268[DOI:10.1109/CVPR.2004.1315041]

He F, Da F P. 2011. Stereo matching using belief propagation and local edge construction-based cost aggregation. Journal of Image and Graphics, 16(11): 2060-2066 (何栿, 达飞鹏. 2011. 置信度传播和区域边缘构建的立体匹配算法. 中国图象图形学报, 16(11): 2060-2066) [DOI:10.11834/jig.20111116]

He K M, Sun J, Tang X O. 2013. Guided image filtering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(6): 1397-1409 [DOI:10.1109/TPAMI.2012.213]

Hosni A, Bleyer M, Gelautz M. 2013a. Secrets of adaptive support weight techniques for local stereo matching. Computer Vision and Image Understanding, 117(6): 620-632 [DOI:10.1016/j.cviu.2013.01.007]

Hosni A, Rhemann C, Bleyer M, Rother C, Gelautz M. 2013b. Fast cost-volume filtering for visual correspondence and beyond. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(2): 504-511 [DOI:10.1109/TPAMI.2012.156]

Howard A. 2008. Real-time stereo visual odometry for autonomous ground vehicles//Proceedings of 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems. Nice, France: IEEE: 3946-3952[DOI:10.1109/IROS.2008.4651147]

Huang C, Zhao H Z. 2019. Semi-global stereo matching with adaptive window based on grayscale value. Journal of Image and Graphics, 24(8): 1381-1390 (黄超, 赵华治. 2019. 根据灰度值信息自适应窗口的半全局匹配. 中国图象图形学报, 24(8): 1381-1390) [DOI:10.11834/jig.180574]

Luo G E. 2012. Some Issues of Depth Perception and Three Dimension Reconstruction from Binocular Stereo Vision. Changsha: Central South University (罗桂娥. 2012. 双目立体视觉深度感知与3维重建若干问题研究. 长沙: 中南大学)

Ma L, Li J J and Ma J. 2014. Modified Census transform with related information of neighborhood for stereo matching algorithm. Computer Engineering and Applications, 50(24): 16-20, 46 (马利, 李晶皎, 马技. 2014. 邻域相关信息的改进Census变换立体匹配算法. 计算机工程与应用, 50(24): 16-20, 46) [DOI:10.3778/j.issn.1002-8331.1405-0081]

Martins J A, Rodrigues J M F, Du B H. 2015. Luminance, colour, viewpoint and border enhanced disparity energy model. PLoS One, 10(6): #e0129908 [DOI:10.1371/journal.pone.0129908]

Mei X, Sun X, Dong W M, Wang H T and Zhang X P. 2013. Segment-tree based cost aggregation for stereo matching//Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition. Portland, USA: IEEE: 313-320[DOI:10.1109/CVPR.2013.47]

Ploumpis S, Amanatiadis A, Gasteratos A. 2015. A stereo matching approach based on particle filters and scattered control landmarks. Image and Vision Computing, 38: 13-23 [DOI:10.1016/j.imavis.2015.04.001]

Scharstein D, Szeliski R. 2002. A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. International Journal of Computer Vision, 47(1/3): 7-42 [DOI:10.1023/A:1014573219977]

Van Meerbergen G, Vergauwen M, Pollefeys M, Van Gool L. 2002. A hierarchical symmetric stereo algorithm using dynamic programming. International Journal of Computer Vision, 47(1/3): 275-285 [DOI:10.1023/A:1014562312225]

Wang L, Yang R G, Gong M L, Liao M. 2014a. Real-time stereo using approximated joint bilateral filtering and dynamic programming. Journal of Real-Time Image Processing, 9(3): 447-461 [DOI:10.1007/s11554-012-0275-4]

Wang L Q, Liu Z, Zhang Z H. 2014b. Feature based stereo matching using two-step expansion. Mathematical Problems in Engineering, 2014: #452803 [DOI:10.1155/2014/452803]

Xie W, Zhou Y Q, You M. 2016. Improved guided image filtering integrated with gradient information. Journal of Image and Graphics, 21(9): 1119-1126 (谢伟, 周玉钦, 游敏. 2016. 融合梯度信息的改进引导滤波. 中国图象图形学报, 21(9): 1119-1126) [DOI:10.11834/jig.20160901]

Yang Q X. 2012. A non-local cost aggregation method for stereo matching//Proceedings of 2012 IEEE Conference on Computer Vision and Pattern Recognition. Providence, RI, USA: 1402-1409[DOI:10.1109/CVPR.2012.6247827]

Yoon K J, Kweon I S. 2006. Adaptive support-weight approach for correspondence search. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(4): 650-656 [DOI:10.1109/TPAMI.2006.70]

Zabih R and Woodfill J. 1994. Non-parametric local transforms for computing visual correspondence//Proceedings of the 3rd European Conference on Computer Vision. Stockholm, Sweden: 151-158[DOI:10.1007/BFb0028345]

Zhang K, Fang Y Q, Min D B, Sun L F, Yang S Q, Yan S Q and Tian Q. 2014. Cross-scale cost aggregation for stereo matching//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, USA: IEEE: 1590-1597[DOI:10.1109/CVPR.2014.206]

Zhu S P, Yan L N. 2017. Local stereo matching algorithm with efficient matching cost and adaptive guided image filter. The Visual Computer, 33(9): 1087-1102 [DOI:10.1007/s00371-016-1264-6]