|
发布时间: 2018-02-16 |
图像分析和识别 |
|
|
收稿日期: 2017-07-18; 修回日期: 2017-10-27
基金项目: 国家自然科学基金项目(61162023);航空科学基金资助项目(2016ZC56005);江西省重点研发计划一般项目(20171BBG70052,20161BBE50080)
第一作者简介:
江少锋(1978-), 男, 副教授, 2008年于南方医科大学获生物医学工程专业博士学位, 主要研究方向为医学图像处理。E-mail:jsphone@163.com.
中图法分类号: TP301
文献标识码: A
文章编号: 1006-8961(2018)02-0174-08
|
摘要
目的 符号距离函数在水平集图像分割,视觉特征提取等图像处理领域有重要应用。随着图像分辨率越来越高,符号距离函数计算效率直接影响图像处理速度,为实现高分辨率图像实时处理,本文在降维法的基础上提出了并行算法,并针对并行计算对降维法进行了改进。方法 降维法将2维距离计算转化为两个1维距离计算,并采用抛物线下界法计算1维距离,是当前最快的一种符号距离计算方法。首先利用行和列计算的独立性,提出了降维法的并行算法。然后再对并行降维法进行改进,提出了抛物线下界法的并行算法。该方法采用多线程分段并行计算抛物线下界,即每个像素点与段内相邻像素点并行进行抛物线求交运算,快速搜索抛物线下界,从而实现了抛物线下界法的分段并行距离函数计算。所有并行算法在CUDA平台上采用GPU通用并行计算方法实现。结果 对不同分辨率及包含不同曲线的9幅图像进行实验测试,在距离计算误差小于1的条件下,并行降维算法对所有测试图像计算时间均小于0.06 s,计算效率比串行方法有了10倍以上的提升,改进并行降维算法对所有测试图像计算时间均小于0.03 s,计算效率比串行方法有了20倍左右的提升。结论 该方法实现了符号距离函数的快速并行计算,其优势在于当图像分辨率较高时仍然能够实现实时处理。
关键词
符号距离函数; 并行计算; 降维法; 抛物线下界法; 水平集
Abstract
Objective Signed distance functions are the nearest distances between pixels and points on the closed curve in an image, with a negative sign in the curve and a positive sign outside the curve. The signed distance function has important applications in image processing, such as level set-based segmentation, 3D visual feature extraction, and pattern recognition in computer vision. The computational complexity of the signed distance function is O(N×M), where N is the number of pixels in an image, and M is the number of points on a closed curve. The high computational complexity of the signed distance function directly affects the computational efficiency of image processing with the increase in image resolution. For real-time processing of an image with high resolution, an improved real-time computing method for the signed distance function based on the dimension reduction method was proposed to improve the computational efficiency. Method Dimension reduction method transforms the 2D signed distance function into two independent 1D signed distance functions for each row (or column) of the image and uses lower parabola envelope-based method for calculating the 1D distance. The low-er parabola envelope-based method sequentially computes the lower envelope of the first q parabolas, where the parabolas are ordered according to the horizontal locations of their vertices. The computational complexity of the dimension reduction method is O(2N) and is one of the fastest methods for calculating the signed distance function. This paper first proposes a parallel dimension reduction method according to the computational independence of the signed distance function among the rows (or columns) in an image to reduce the computational time of the dimension reduction method. The parallel dimension reduction method calculates the signed distance functions of the different rows (or columns) in an image simultaneously by allowing each thread to correspond to a row (or column) in the image. Thus, the computational complexity of the proposed parallel dimension reduction method is reduced to O(2W+2H), where W and H are the width and height of the image, respectively. Second, this paper proposes an improved parallel dimension reduction method by running the lower parabola envelope-based method in a parallel manner to improve the computational efficiency further. The improved parallel dimension reduction method uses multi-threads in calculating the lower parabola envelope in different segments to perform the dimension reduction method by finding the intersection points between two neighboring parabolas in a segment simultaneously. All parallel processing steps were completed on CUDA platform for general parallel computing on GPU. The first step is calculating the sign by assigning H threads, and each thread should correspond to a row in the image. The second step is calculating the 1D distance by assigning W×H threads. Each thread should correspond to a pixel in the image and should scan from left to right of the image to touch the closed curve and set the scanning distance as the 1D distance of each pixel. The last step is calculating the 2D distance by assigning W×H threads. Each thread should correspond to a pixel in the image and should scan from top to bottom of the image to obtain the final distance using the proposed parallel lower parabola envelope-based method. The entire computational complexity of distance in this method is O(2W+kS), where k is the iterative times, and S is the length of the segment. Result Nine images with different image sizes (256×256, 1 280×1 280, and 2 560×2 560) and curve shapes were tested in our experiments. The computational time of three generating methods for signed distance function (the regular serial, the proposed parallel, and the improved parallel dimension reduction methods) was compared with the case in which the maximal error was below 1. The computational time of the parallel method was less than 0.06 s for all testing images and more than 10 times faster than that of the regular serial dimension reduction method. The computational time of the improved parallel method was less than 0.03 s for all testing images and approximately 20 times faster than that of the regular serial dimension reduction method. Conclusion The proposed parallel method for the signed distance function can generate the signed distance in tens of milliseconds. Thus, the proposed parallel method is sufficiently fast for real-time image processing, especially for high-resolution images.
Key words
signed distance function; parallel computing; dimension reduction method; lower parabola envelope based method; level set
0 引言
图像分割是图像处理领域中的一个关键技术,主要用于目标识别和理解,在基于图像的人工智能应用领域中起到非常重要的作用。Osher和Sethian[1-2]提出的基于水平集的分割方法,由于能够很好地处理拓扑结构变化问题,近年来一直是图像分割领域中的研究热点。一般在水平集初始化及更新过程中,都需要计算符号距离函数(SDF)。所谓符号距离函数是指图像中的点到封闭曲线上点的最近距离。符号距离函数不仅用于水平集分割还广泛应用于计算机视觉中的3维视觉特征提取[3],模式识别[4]等领域。符号距离函数的计算量较大(尤其对于高分辨率图像),时间复杂度为
光栅扫描法[5]是早期的一种快速计算符号距离函数的方法,该方法用已经计算得到的距离去更新其特定邻域点的距离值。当采用先进先出的队列方式来计算时,理论上其计算复杂度为
在降维法中图像每行和每列的符号距离计算是独立的,故本文提出了并行的降维法计算符号距离,计算复杂度为
1 降维法计算符号距离函数
1.1 符号距离函数
符号距离函数定义为
$ u\left( {x, y} \right) = \left\{ \begin{array}{l} - d\left[{\left( {x, y} \right), C} \right]\;\;\;\;在曲线C内\\ 0\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;在曲线C上\\ d\left[{\left( {x, y} \right), C} \right]\;\;\;\;\;\;\;在曲线C外 \end{array} \right. $ | (1) |
式中,
1.2 符号计算
采用张博等人[12]提出的扫描方法构造符号表,具体步骤如下:
1) 构建曲线点形状,共分为图 1中的4种类型;
2) 从图像中的每一行的第一个像素点(可以认为该点在曲线之外)沿该行从左向右进行扫描时, 如果遇到上下型、下上型两类边界时, 那么在该点处发生内外点的转换, 而遇到下下型、上上型两类边界时, 则不发生转换。当所有行都扫描完毕,内外点计算完成。
该方法中步骤1)可以用并行方法来完成,计算复杂度为O(1),步骤2)中每行扫描也可以并行计算完成,复杂度为
1.3 降维法
由式(1)定义的符号距离函数可以转变成坐标网格的优化问题[10],即
$ \begin{array}{l} \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;{D_f}\left( {x, y} \right) = \\ \mathop {\min }\limits_{x', y'} \left( {{{\left( {x-x'} \right)}^2} + {{\left( {y-y'} \right)}^2} + f\left( {x', y'} \right)} \right) \end{array} $ | (2) |
即对每个坐标点,在曲线上找到相应的
$ \begin{array}{l} {D_f}\left( {x, y} \right) = \mathop {\min }\limits_{y'} \left( {{{\left( {y-y'} \right)}^2} + \mathop {\min }\limits_{x'} \left( {{{\left( {x-x'} \right)}^2} + } \right.} \right.\\ \left. {\left. {\;\;\;f\left( {x', y'} \right)} \right)} \right) = \mathop {\min }\limits_{y'} \left( {{{\left( {y-y'} \right)}^2} + {D_{f|y'}}\left( x \right)} \right) \end{array} $ | (3) |
$ {D_{f|y'}}\left( x \right) = \mathop {\min }\limits_{x'} \left( {{{\left( {x-x'} \right)}^2} + f\left( {x', y'} \right)} \right) $ | (4) |
式中,
从图 3中可以看出, 抛物线0、2,4构成这组抛物线的下界,抛物线1和3被删除,而这个下界在每个点上的长度(加粗线段)就是该点的距离。抛物线的下界可以由求抛物线的交点得到。假设
$ \begin{array}{l} {s_1} = \frac{{\left( {f\left( r \right) + {r^2}} \right)-\left( {f\left( q \right) + {q^2}} \right)}}{{2r-2q}}\\ {s_2} = \frac{{\left( {f\left( q \right) + {q^2}} \right)-\left( {f\left( z \right) + {z^2}} \right)}}{{2q - 2z}} \end{array} $ | (5) |
若交点
2 并行降维法
显然在降维法计算中,图像的每行和每列的计算是独立的,可以并行完成,假设图像的图像宽度为
1) 计算
2) 计算
该并行算法计算复杂度为
3 改进并行降维法
3.1 并行抛物线下界法
在抛物线下界方法中,最关键的步骤就是求交点和删除抛物线,在串行算法中,未删除抛物线按顺序进入队列,被删除的抛物线不进入队列,1维抛物线下界法的计算复杂度为
1) 向右计算阶段,并行计算每个点对应的抛物线
2) 向左计算阶段,并行计算每个点对应的抛物线
该方法相当于对抛物线的下界计算进行了分段处理,图 6中向左和向右共有
3.2 改进并行降维法实现
在CUDA平台上采用NVIDIA 950显卡进行通用GPU并行运算,为完成本文改进并行降维法,需将图像上每个像素点对应一个线程。由于像素点较多,需用多个线程块才能提供如此多的线程数。为此将整个图像作为一个网格,每个网格区域对应一个线程块,一个线程块中分配多个线程。如图 8,以一个大小为256×256像素的图像为例,在一个网格中分配32×8的线程块,也就是将图像分成32×8个块,每个线程块分配8×32个线程,这样一个线程就可以对应一个像素点的距离计算。具体计算步骤如下:
1) 计算符号。分配
2) 计算
3) 计算
4 实验及结果
本文的实验平台为联想台式计算机,Windows7 64位操作系统,intel i7-4790 CPU,主频3.6 GHz,8 GB内存,Nvidia GTX950显卡,算法均在CUDA3.2平台C语言实现。图 9(a)中的实验图像来自文献[12],大小为256×256像素,该图像具有较为复杂的拓扑结构,适合用来测试符号距离函数的计算。图 9(b)是采用本文并行方法计算得到的符号图,图 9(c)为距离图,与真实值误差为0,计算时间只有1.1 ms。可见本文所提的并行符号距离函数计算方法既快又准确。
为比较计算效率,分别采用了256×256像素,1 280×1 280像素,2 560×2 560像素3种大小图像,每种大小图像采用3种不同大小的封闭曲线进行实验。在计算最大距离误差小于1的情况下,对降维法(串行)、降维法(并行)和改进降维法(并行)进行了比较,计算结果如表 1所示。由于采用正方形图像进行测试,故图像高度
表 1
计算效率比较
Table 1
The comparison of computing efficiency
算法 | 图像宽度/像素 | 曲线点数 | 循环次数 | 计算用时/ms | 提速比 | 平均提速 |
降维法串 | 256 1 280 2 560 |
272/468/1 272 140/2 202/2 952 718/4 438/6 280 |
1 1 1 |
17/18/20 78/94/109 250/343/405 |
1 | 1 |
降维法并 | 256 1 280 2 560 |
272/468/1 272 140/2 202/2 952 718/4 438/6 280 |
1 1 1 |
1.1/1.2/1.6 4.5/10/8.4 12/41/54 |
15/15/12 17/9/13 20/8/8 |
13 |
改进降维法并 | 256 1 280 2 560 |
272/468/1 272 140/2 202/2 952 718/4 438/ 6 280 |
3 1/1/2 2/4/4 |
0.9/0.9/1.1 3.4/5.6/7.0 11/28/30 |
19/20/18 23/17/16 22/12/14 |
18 |
从表 1中可以看到,采用并行算法后,降维法(并行)计算效率平均提高了13倍左右,理论上提高倍数为
5 结论
本文提出了并行的降维法计算符号距离,在处理高分辨率图像时,其计算时间仅有数十毫秒,可见本文实现了符号距离函数的高效计算,假设在用水平集进行分割时须进行10次符号距离函数计算,那么本方法为此耗时少于1 s,而常规方法大大高于1 s,故本方法适用于2维、3维甚至4维图像的实时处理。虽然本文方法取得了很好的计算效率,但是需要少量迭代计算,且迭代次数与图像大小和曲线点数有关,这样对算法的适用性有一定影响。为此下一步工作将进一步研究如何将迭代次数减少到2甚至1即能满足精度要求,从而进一步提高计算效率和算法适应性。
参考文献
-
[1] Sethian J A. Level Set Methods and Fast Marching Methods:Evolving Interfaces in Computational Geometry, Fluid Mechanics, Computer Vision, and Materials Science[M]. London: Cambridge University Press, 1999.
-
[2] Osher S, Sethian J A. Fronts propagating with curvature-dependent speed:algorithms based on Hamilton-Jacobi formulation[J]. Journal of Computational Physics, 1988, 79(1): 12–49. [DOI:10.1016/0021-9991(88)90002-2]
-
[3] Sharma O, Agarwal N. Signed distance based 3D surface reconstruction from unorganized planar cross-sections[J]. Computers & Graphics, 2016, 62: 67–76. [DOI:10.1016/j.cag.2016.12.002]
-
[4] Guo D, Li C Q, Wu L, et al. Improved marching tetrahedra algorithm based on hierarchical signed distance field and multi-scale depth map fusion for 3D reconstruction[J]. Journal of Visual Communication and Image Representation, 2017, 48: 491–501. [DOI:10.1016/j.jvcir.2016.12.016]
-
[5] Fabbri R, Costa L D F, Torelli J C, et al. 2D Euclidean distance transform algorithms:a comparative survey[J]. ACM Computing Surveys, 2008, 40(1): #2. [DOI:10.1145/1322432.1322434]
-
[6] Tsai Y H R. Rapid and accurate computation of the distance function using grids[J]. Journal of Computational Physics, 2002, 178(1): 175–195. [DOI:10.1006/jcph.2002.7028]
-
[7] Li J, Yang X, Shi P F. A fast level set approach to image segmentation based on mumford-shah model[J]. Chinese Journal of Computers, 2002, 25(11): 1175–1183. [李俊, 杨新, 施鹏飞. 基于Mumford-Shah模型的快速水平集图像分割方法[J]. 计算机学报, 2002, 25(11): 1175–1183. ] [DOI:10.3321/j.issn:0254-4164.2002.11.008]
-
[8] Dong J W, Yang H Y, Zhang B. Fast generation of signed distance function in level set method[J]. Computer Engineering and Application, 2009, 45(34): 180–182. [董吉文, 杨海英, 张冰. 水平集方法中符号距离函数的快速生成[J]. 计算机工程与应用, 2009, 45(34): 180–182. ] [DOI:10.3778/j.issn.1002-8331.2009.34.056]
-
[9] Sethian J A. A fast marching level set method for monotonically advancing fronts[J]. Proceedings of the National Academy of Sciences of the United States of America, 1996, 93(4): 1591–1595. [DOI:10.1073/pnas.93.4.1591]
-
[10] Felzenszwalb P F, Huttenlocher D P. Distance transforms of sampled functions[J]. Theory of Computing, 2012, 8(19): 415–428. [DOI:10.4086/toc.2012.v008a019]
-
[11] Mishchenko Y. A fast algorithm for computation of discrete Euclidean distance transform in three or more dimensions on vector processing architectures[J]. Signal, Image and Video Processing, 2015, 9(1): 19–27. [DOI:10.1007/s11760-012-0419-9]
-
[12] Zhang B, Su Y L, Zhang S L. Method for generating the signed distance function based on curve marching[J]. Journal of Northwest University:Natural Science Edition, 2007, 37(5): 697–700. [张博, 苏永利, 张书玲. 基于曲线推进的符号距离函数生成方法[J]. 西北大学学报:自然科学版, 2007, 37(5): 697–700. ] [DOI:10.16152/j.cnki.xdxbzr.2007.05.033]
-
[13] Zhang B, Su Y L. A fast method for signed distance function generation[J]. Computer Applications and Software, 2008, 25(6): 102–103, 112. [张博, 苏永利. 一种快速的符号距离函数的生成方法[J]. 计算机应用与软件, 2008, 25(6): 102–103, 112. ] [DOI:10.3969/j.issn.1000-386X.2008.06.041]
-
[14] Feng C L, Zhao D Z, Huang M. Image segmentation using CUDA accelerated non-local means denoising and bias correction embedded fuzzy c-means (BCEFCM)[J]. Signal Processing, 2016, 122: 164–189. [DOI:10.1016/j.sigpro.2015.12.007]
-
[15] Fluck O, Vetter C, Wein W, et al. A survey of medical image registration on graphics hardware[J]. Computer Methods and Programs in Biomedicine, 2011, 104(3): e45–e57. [DOI:10.1016/j.cmpb.2010.10.009]