并行马尔可夫随机场计算变分光流方法
Parallel computing of variational optical flow based on the Markov random field model
- 2018年23卷第12期 页码:1852-1863
收稿:2018-06-04,
修回:2018-7-23,
纸质出版:2018-12-16
DOI: 10.11834/jig.180342
移动端阅览

浏览全部资源
扫码关注微信
收稿:2018-06-04,
修回:2018-7-23,
纸质出版:2018-12-16
移动端阅览
目的
2
基于马尔可夫随机场(MRF)的变分光流计算是一种较为鲁棒的光流计算方法,但是计算效率很低。置信传播算法(BP)是一种针对MRF较为高效的全局优化算法。本文提出一种MRF变分光流计算模型并采用并行BP方法实现,极大提高计算效率。
方法
2
提出的MRF变分光流计算模型中的数据项采用了Horn等人根据灰度守恒假设得到的光流基本约束方程,并采用非平方惩罚函数进行调整以平滑边界影响。为在CUDA平台上实现高效并行处理,本文提出了一种优化的基于置信传播的MRF并行光流计算方法。该优化方法在采用置信传播最小化MRF光流能量函数时,采用了一种4层的3维网络结构进行并行计算,每层对应MRF4邻域模型中的一个方向的信息传播,同时在每层中为每个像素分配多个线程采用并行降维法计算所要传递的信息,大大降低单线程计算负荷,大幅度提高计算效率。
结果
2
采用旋转小球图像序列进行实验,计算效率提高314倍;采用旋转小球、Yosemite山谷和RubberWhale 3种不同图像序列,与Horn算法、Weickert算法、Hossen并行Lucas算法、Grauer-Gray并行MRF算法进行对比实验,本文方法得到最低的平均端点误差(AEE),分别为0.13、0.55和0.34。
结论
2
本文提出了一种新的MRF光流计算模型,并在CUDA平台上实现了并行优化计算。实验结果表明,本文提出的并行计算方法在保持计算精度的同时极大提高了计算效率。本文方法对内存需求巨大,在处理高分辨率图像时,限制了采样点数,难以计算大位移。
Objective
2
The Markov random field (MRF) model for variational optical flow computing is a robust one. The MRF variational energy function includes the data and smooth terms. Brief propagation (BP) is a highly effective global method that minimizes the MRF energy function by iteratively transmitting messages from one pixel to the next in four directions. However
its computational cost remains high. The computational complexity of the BP method with a 3-level data pyramid is O(
$$ N$$
×
$$ L$$
2
×(10.5
$$T$$
+ 2))
where
$$ N$$
is the number of pixels in the image
$$ T$$
is the iterative time and
$$ L$$
is the width of spatial sampling. We proposed a new MRF model to calculate optical flow and a parallel method to minimize the MRF energy function
which can dramatically reduce computational time.
Method
2
The data term in the proposed MRF variational energy function for optical flow computing is derived from the optical flow constraint equation from the brightness constancy assumption proposed by Horn. The term is smoothed by a non-square penalty function to reduce the boundary effect. To realize parallel computing on the CUDA platform
we proposed an optimal parallel BP method to minimize the MRF energy function. This optimal method used a 3D grid
with four levels to perform the parallel method. At varying levels
the message was transmitted along different directions in the neighborhood. At each level
we assigned
$$ L$$
threads to one pixel and used the parallel dimension reduction method to calculate the message to be transmitted. Thus
we dramatically reduced the computational load of one thread and computational time. The dimension reduction method transforms the 2D energy function into two independent 1D energy functions and uses a low parabola envelope-based method to calculate the 1D energy function
whose computational complexity is O(4
$$L $$
2
). The parallel dimension reduction method simultaneously calculates the MRF energy functions of the different rows (or columns) in the label region of optical flow
whose size is
$$L $$
×
$$ L$$
by allowing each thread to correspond to a row (or column) in the region. Thus
the computational complexity of the proposed parallel dimension reduction method is reduced to O(4
$$ L$$
). All parallel processing steps were completed on the CUDA platform for general parallel computing on GPU. The first step is calculating the data term by assigning
$$ N$$
threads and letting each thread correspond to a pixel in the image. The corresponding computational complexity is O(
$$L $$
2
). The second step is the use of the proposed parallel dimension reduction method to calculate the message and transmit it to up
down
left
and right for
$$ T$$
times from the first to the third level in the data pyramid with three levels by assignin
g 4
$$L $$
threads to a pixel in the image. The corresponding computational complexity is O(3×4×
$$ L$$
×
$$T $$
). The last step is outputting the optical flow corresponding to the minimal message in each pixel by assigning
$$ N$$
threads and letting each thread correspond to a pixel in the image. The corresponding computational complexity is O(
$$L $$
2
). Thus
the entire computational complexity of this method is only O(2
$$ L$$
2
+ 12
$$ L$$
×
$$ T$$
) with three-level data pyramid.
Result
2
In the experiment for rotational sphere image sequence
the computational time was only 0.27 s
and corresponding computational time of the conventional method was 85 s. The computational efficiency of the proposed method was 314 times more than that of the conventional method. Our proposed method performed as rapidly as the Horn method
and faster than the Weickert (6 s) and Grauer-Gray (0.68 s) methods but slower than the Hossen method (0.01 s). In comparative experiments on errors with the Horn
Weickert
Hossen
and Grauer-Gray methods by using three image sequences (sphere image sequence with rotational moving
the Yosemite Valley image sequence with high change of brightness
and RubberWhale image sequence with many moving objects)
the proposed method achieved the lowest average endpoint error (AEE) (0.13
0.55
and 0.34
respectively) on all testing image sequences.
Conclusion
2
This paper proposed a novel parallel computing strategy of variational optical flow based on the MRF model by using a 3D grid with four levels to simultaneously transmit the message in the BP method along four directions
where multi-threads were assigned to one pixel for calculating the message to be transmitted. The results indicate that our proposed model has high computational efficiency and can obtain accurate optical flow from image sequences with rotational moving
high change of brightness and many moving objects. However
our proposed method requires a substantial amount of memory
which limits the width of spatial sampling. Therefore
our proposed method is unsuitable for calculating optical flows with large displacement from image sequences with high resolution.
Horn B K P, Schunck B G. Determining optical flow[J]. Artificial Intelligence, 1981, 17(1-3):185-203.[DOI:10.1016/0004-3702(81)90024-2]
Lucas B D, Kanade T. An iterative imge registration technique with an application to stereo vision[C ] //Proceedings of the 7 th International Joint Conference on Artificial Intelligence. Vancouvcr BC, Canada: Morgan Kaufmann Publishers Inc., 1981, 2: 674-679.
Zimmer H, Bruhn A, Weickert J. Optic flow in harmony[J]. International Journal of Computer Vision, 2011, 93(3):368-388.[DOI:10.1007/s11263-011-0422-6]
Papenberg N, Bruhn A, Brox T, et al. Highly accurate optic flow computation with theoretically justified warping[J]. International Journal of Computer Vision, 2006, 67(2):141-158.[DOI:10.1007/s11263-005-3960-y]
Heitz D, Mémin E, Schnörr C. Variational fluid flow measurements from image sequences:synopsis and perspectives[J]. Experiments in Fluids, 2010, 48(3):369-393.[DOI:10.1007/s00348-009-0778-3]
Werlberger M, Trobin W, Pock T, et al. Anisotropic Huber-L 1 optical flow[C ] //Proceedings of the British Machine Vision Conference. London, UK: BMVA Press, 2009: 1-11.[ DOI: 10.5244/c.23.108 http://dx.doi.org/10.5244/c.23.108 ] .
Zimmer H, Bruhn A, Weickert J. Optic flow in harmony[J]. International Journal of Computer Vision, 2011, 93(3):368-388.[DOI:10.1007/s11263-011-0422-6]
Felzenszwalb P F, Huttenlocher D P. Efficient belief propagation for early vision[J]. International Journal of Computer Vision, 2006, 70(1):41-54.[DOI:10.1007/s11263-006-7899-4]
Boykov Y, Veksler O, Zabih R. Fast approximate energy minimization via graph cuts[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001, 23(11):1222-1239.[DOI:10.1109/34.969114]
Weiss Y, Freeman W T. On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs[J]. IEEE Transactions on Information Theory, 2001, 47(2):736-744.[DOI:10.1109/18.910585]
Feng C L, Zhao D Z, Huang M. Image segmentation using CUDA accelerated non-local means denoising and bias correction embedded fuzzy c-means (BCEFCM)[J]. Signal Processing, 2016, 122:164-189.[DOI:10.1016/j.sigpro.2015.12.007]
Fluck O, Vetter C, Wein W, et al. A survey of medical image registration on graphics hardware[J]. Computer Methods and Programs in Biomedicine, 2011, 104(3):e45-e57.[DOI:10.1016/j.cmpb.2010.10.009]
Awan A A, Hamidouche K, Hashmi J M, et al. S-Caffe: co-designing MPI runtimes and Caffe for scalable deep learning on modern GPU clusters[C ] //Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. Austin, Texas, USA: ACM, 2017: 193-205.[ DOI: 10.1145/3018743.3018769 http://dx.doi.org/10.1145/3018743.3018769 ]
Cuomo S, Galletti A, Giunta G, et al. Toward a multi-level parallel framework on GPU cluster with PetSC-CUDA for PDE-based optical flow computation[J]. Procedia Computer Science, 2015, 51:170-179.[DOI:10.1016/j.procs.2015.05.220]
Hossen K, Mahmud H. Parallel optical flow detection using CUDA[D]. Dhaka, Bangladesh: BRAC University, 2014.
Grauer-Gray S, Kambhamettu C, Palaniappan K. GPU implementation of belief propagation using CUDA for cloud tracking and reconstruction[C ] //Proceedings of 2008 IAPR Workshop on Pattern Recognition in Remote Sensing. Tampa, Florida, USA: IEEE, 2008: 1-4.[ DOI: 10.1109/prrs.2008.4783169 http://dx.doi.org/10.1109/prrs.2008.4783169 ]
Felzenszwalb P F, Huttenlocher D P. Distance transforms of sampled functions[J]. Theory of Computing, 2012, 8(19):415-428.[DOI:10.4086/toc.2012.v008a019]
Jiang S F, Yang S H, Chen Z, et al. Parallel computing of signed distance function in level set based on dimension reduction[J]. Journal of Image and Graphics, 2018, 23(2):174-181.
江少锋, 杨素华, 陈震, 等.水平集中符号距离函数并行降维计算[J].中国图象图形学报, 2018, 23(2):174-181. [DOI:10.11834/jig.170348]
Barron J L, Fleet D J, Beauchemin S S. Performance of optical flow techniques[J]. International Journal of Computer Vision, 1994, 12(1):43-77.[DOI:10.1007/bf01420984]
相关作者
相关机构
京公网安备11010802024621