Semi-global stereo matching algorithm based on feature fusion and its CUDA implementation
2018, Vol. 23, No. 6, Pages 874-886
Received: 2017-11-10; Revised: 2017-12-26; Published in print: 2018-06-16
DOI: 10.11834/jig.170157

Objective
In micro air vehicle systems, acquiring scene information in real time is the key problem for autonomous obstacle avoidance and navigation. This paper proposes ADCC-TSGM, a texture-optimized semi-global stereo matching algorithm that fuses the center average Census feature with the absolute difference (AD) feature, and accelerates it in parallel with the Compute Unified Device Architecture (CUDA).
Method
Texture information is computed with a one-dimensional difference along the epipolar line, the matching cost is computed from the center average Census feature and the AD feature, and the texture-optimized SGM algorithm aggregates the costs to produce an initial disparity map. A left-right consistency check then removes unstable and occluded points from the coarse disparity map, and linear interpolation and median filtering fill the remaining holes. Finally, exploiting the characteristics of the GPU, the cost computation, semi-global matching (SGM) aggregation, and disparity computation steps are optimized with shared memory, single-instruction multiple-data (SIMD) instructions, and a hybrid pipeline to increase the running speed.
Result
On the Middlebury stereo test set at Quarter Video Graphics Array (QVGA) resolution, the total bad-pixel rate of the proposed ADCC-TSGM algorithm is 36.1% lower than that of Semi-Global Block Matching (SGBM) and 28.3% lower than that of SGM, and its average error rate is 44.5% lower than that of SGBM and 49.9% lower than that of SGM. GPU acceleration experiments were conducted on the NVIDIA Jetson TK1 embedded computing platform; with the matching quality unchanged, CUDA parallel acceleration achieves a speedup of more than 117 times, and even compared with an SGBM implementation already optimized with SIMD and multi-core parallelism, the running time is reduced by 85%. At QVGA resolution, the GPU-accelerated frame rate reaches 31.8 frame/s.
Conclusion
The proposed algorithm and its CUDA acceleration offer embedded platforms an effective way to obtain high-quality depth information in real time, and can serve as a basic step of environment perception, visual localization, and map building for micro air vehicles, small robots, and similar devices.
Objective
In unmanned aerial vehicle systems, estimation of scene information in real time is a key issue in conducting automatic obstacle avoidance and navigation. A binocular stereo vision system is an effective means to obtain scene information; it simulates the working principle of the human eyes by using two cameras to capture the same scene at the same time and generates a disparity map by using a stereo matching algorithm. In this work, we propose ADCC-TSGM, a novel texture-optimized semi-global stereo matching algorithm based on the fusion of the absolute difference (AD) feature and the center average Census feature. Efforts are made to speed up the algorithm through CUDA parallel acceleration.
Method
First, a one-dimensional difference along the epipolar line is used to compute the texture information, the center average Census feature and the AD feature are exploited to compute the matching cost, and the texture-optimized semi-global matching algorithm aggregates the costs to obtain the initial disparity map. Second, a left-right consistency check is used to detect unstable and occluded pixels, and linear interpolation and a median filter are used to fill the holes in the disparity map. Lastly, to improve the running speed, we optimize the GPU code for each step of the stereo matching. In feature calculations such as the center average Census transform, the time spent on memory access is much higher than that spent on computation, and adjacent threads perform a large number of data-intensive computations on overlapping data. Consequently, we divide the data needed by an entire thread block into four regions, copy them into shared memory, and compute from shared memory to reduce the overhead of memory access. In addition, a single thread can handle two consecutive disparity calculations simultaneously by using SIMD instructions. While the GPU is processing, the CPU is mostly idle; therefore, a hybrid pipeline is designed to fully utilize the computing resources of the embedded platform.
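The cost-computation step described above can be sketched as follows. Note that the 5×5 window, the AD truncation threshold, and the fusion weights here are illustrative assumptions, not the paper's actual parameterization:

```python
import numpy as np

def center_average_census(img, win=5):
    """Census transform that compares each window pixel against the
    window *mean* rather than the raw center pixel (the 'center
    average' variant, more robust to noise on the center pixel)."""
    h, w = img.shape
    r = win // 2
    pad = np.pad(img.astype(np.float64), r, mode='edge')
    # mean intensity of each win x win window
    mean = sum(pad[r + dy:r + dy + h, r + dx:r + dx + w]
               for dy in range(-r, r + 1)
               for dx in range(-r, r + 1)) / (win * win)
    code = np.zeros((h, w), dtype=np.uint32)
    bit = 0
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if (dy, dx) == (0, 0):
                continue  # skip the center pixel itself
            code |= (pad[r + dy:r + dy + h, r + dx:r + dx + w]
                     > mean).astype(np.uint32) << np.uint32(bit)
            bit += 1
    return code

def hamming(a, b):
    """Per-pixel Hamming distance between two census code maps."""
    x = a ^ b
    cnt = np.zeros_like(x)
    while np.any(x):
        cnt += x & 1
        x >>= 1
    return cnt

def fused_cost(left, right, d, w_ad=1.0, w_cen=2.0, ad_trunc=32):
    """AD + center-average-Census cost at disparity d.
    Weights and truncation are illustrative, not from the paper."""
    cl = center_average_census(left)
    cr = center_average_census(right)
    w = left.shape[1]
    # left pixel x matches right pixel x - d
    ad = np.abs(left[:, d:].astype(int) - right[:, :w - d].astype(int))
    ham = hamming(cl[:, d:], cr[:, :w - d])
    return w_ad * np.minimum(ad, ad_trunc) + w_cen * ham
```

Taking the Census bits against the window mean instead of the raw center value is what distinguishes the center average variant: a single noisy center pixel cannot flip all of the window's bits at once.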
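The texture-optimized aggregation can be illustrated with a single left-to-right SGM path. The penalty values and the rule scaling them from the one-dimensional texture difference are illustrative only; they do not reproduce the paper's exact use of the coefficients ε1 and ε2:

```python
import numpy as np

def sgm_path_lr(cost, img_row=None, p1=8.0, p2=32.0):
    """One left-to-right SGM aggregation path over a single scanline.

    cost: (W, D) array of matching costs for one image row.
    If img_row is given, the penalties grow where the 1-D horizontal
    gradient (texture) is weak -- a sketch of texture-adaptive
    penalties, not the paper's exact rule."""
    w = cost.shape[0]
    agg = np.zeros_like(cost, dtype=np.float64)
    agg[0] = cost[0]
    for x in range(1, w):
        if img_row is not None:
            # 1-D difference along the epipolar (row) direction
            tex = abs(float(img_row[x]) - float(img_row[x - 1]))
            scale = 1.0 + 1.0 / (tex + 1.0)  # weak texture -> larger penalty
        else:
            scale = 1.0
        prev = agg[x - 1]
        best_prev = prev.min()
        # transitions: same disparity, +/-1 with P1, arbitrary jump with P2
        shift_m = np.concatenate(([np.inf], prev[:-1])) + p1 * scale
        shift_p = np.concatenate((prev[1:], [np.inf])) + p1 * scale
        jump = best_prev + p2 * scale
        agg[x] = cost[x] + np.minimum(np.minimum(prev, jump),
                                      np.minimum(shift_m, shift_p)) - best_prev
    return agg
```

The full SGM algorithm sums such path costs over several directions (typically 4 or 8) and then takes the winner-take-all disparity per pixel; subtracting `best_prev` keeps the accumulated values bounded, as in the standard formulation.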
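The post-processing step (consistency check plus hole filling) might look like the following in outline; the consistency tolerance and the 3×3 median window are illustrative assumptions:

```python
import numpy as np

def lr_consistency_check(disp_l, disp_r, thresh=1):
    """Mark pixels whose left/right disparities disagree (occlusions
    and mismatches) as invalid (-1). `thresh` is an illustrative
    tolerance, not the paper's value."""
    h, w = disp_l.shape
    out = disp_l.astype(np.int32).copy()
    xs = np.arange(w)
    for y in range(h):
        # left pixel x maps to right pixel x - d; disparities must agree
        xr = xs - disp_l[y]
        valid = (xr >= 0) & (np.abs(disp_l[y] -
                 disp_r[y, np.clip(xr, 0, w - 1)]) <= thresh)
        out[y, ~valid] = -1
    return out

def fill_holes(disp):
    """Fill invalid pixels (< 0) by linear interpolation along each
    scanline, then smooth with a 3x3 median filter (simplified)."""
    out = disp.astype(np.float64).copy()
    h, w = out.shape
    for y in range(h):
        bad = out[y] < 0
        if bad.all():
            continue
        good = ~bad
        out[y, bad] = np.interp(np.flatnonzero(bad),
                                np.flatnonzero(good), out[y, good])
    # 3x3 median filter over the filled map
    pad = np.pad(out, 1, mode='edge')
    stack = np.stack([pad[dy:dy + h, dx:dx + w]
                      for dy in range(3) for dx in range(3)])
    return np.median(stack, axis=0)
```

Scanline interpolation propagates neighboring valid disparities into the holes, and the median filter suppresses the streak artifacts that purely horizontal filling tends to leave behind.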
Result
To demonstrate the effectiveness of the proposed algorithm, we use the NVIDIA Jetson TK1 developer kit, which has a quad-core ARM Cortex-A15 CPU, a Kepler GPU with 192 CUDA cores, and 2 GB of memory, as the embedded computing platform and conduct experiments on the Middlebury stereo datasets resized to QVGA resolution. In accordance with the actual application scenarios and the image resolution, the maximum disparity of each algorithm is set to 64, and the block matching window size of SGBM and BM is set to 9×9. The texture penalty coefficients ε1 and ε2 in the proposed algorithm are set to 0.25 and 0.125, respectively. Experimental results show that the total bad-pixel rate and the average error rate of the proposed algorithm are significantly lower than those of BM, SGBM, and SGM. The total bad-pixel rate of the ADCC-TSGM algorithm is 73.9% lower than that of the BM algorithm, 36.1% lower than that of the SGBM algorithm, and 28.3% lower than that of the SGM algorithm. The average error rate of the proposed algorithm is 83.2% lower than that of the BM algorithm, 44.5% lower than that of the SGBM algorithm, and 49.9% lower than that of the SGM algorithm. In particular, the use of the center average Census feature in matching reduces both the bad-pixel rate and the error rate. The texture-based optimization adaptively increases the penalty coefficients in low-texture regions and reduces the average error rate from 6.62 to 4.84. The post-processing steps, including the disparity consistency check and hole filling, reduce the total bad-pixel rate from 14.46 to 7.12. Through GPU parallel acceleration, the CUDA implementation of the proposed algorithm runs more than one hundred times faster than the pure CPU implementation without any loss in disparity map quality. Compared with SGBM, which has been optimized with SIMD and multi-core parallelism, the running time of the proposed algorithm is reduced by 85%. At QVGA resolution, the frame processing rate reaches 31.8 frame/s.
Conclusion
The proposed algorithm outperforms existing algorithms such as BM, SGM, and SGBM, which have been widely used in industry. The CUDA-accelerated implementation of the proposed algorithm provides an effective and feasible way to obtain high-quality disparity information in real time, and can serve as a basic means of environmental perception, visual positioning, and map construction for real-time embedded applications such as micro-aircraft systems.