发布时间: 2018-02-16
摘要点击次数:
全文下载次数:
DOI: 10.11834/jig.170348
2018 | Volume 23 | Number 2

图像分析和识别

水平集中符号距离函数并行降维计算

江少锋¹, 杨素华², 陈震¹, 张聪炫¹, 周旭欣¹

1. 南昌航空大学测试与光电工程学院, 南昌 330063;

2. 南昌航空大学信息工程学院, 南昌 330063

收稿日期: 2017-07-18; 修回日期: 2017-10-27

基金项目: 国家自然科学基金项目（61162023）；航空科学基金资助项目（2016ZC56005）；江西省重点研发计划一般项目（20171BBG70052，20161BBE50080）

第一作者简介: 江少锋(1978-), 男, 副教授, 2008年于南方医科大学获生物医学工程专业博士学位, 主要研究方向为医学图像处理。E-mail:jsphone@163.com.

中图法分类号: TP301

文献标识码: A

文章编号: 1006-8961(2018)02-0174-08

摘要

目的符号距离函数在水平集图像分割，视觉特征提取等图像处理领域有重要应用。随着图像分辨率越来越高，符号距离函数计算效率直接影响图像处理速度，为实现高分辨率图像实时处理，本文在降维法的基础上提出了并行算法，并针对并行计算对降维法进行了改进。方法降维法将2维距离计算转化为两个1维距离计算，并采用抛物线下界法计算1维距离，是当前最快的一种符号距离计算方法。首先利用行和列计算的独立性，提出了降维法的并行算法。然后再对并行降维法进行改进，提出了抛物线下界法的并行算法。该方法采用多线程分段并行计算抛物线下界，即每个像素点与段内相邻像素点并行进行抛物线求交运算，快速搜索抛物线下界，从而实现了抛物线下界法的分段并行距离函数计算。所有并行算法在CUDA平台上采用GPU通用并行计算方法实现。结果对不同分辨率及包含不同曲线的9幅图像进行实验测试，在距离计算误差小于1的条件下，并行降维算法对所有测试图像计算时间均小于0.06 s，计算效率比串行方法有了10倍以上的提升，改进并行降维算法对所有测试图像计算时间均小于0.03 s，计算效率比串行方法有了20倍左右的提升。结论该方法实现了符号距离函数的快速并行计算，其优势在于当图像分辨率较高时仍然能够实现实时处理。

关键词

符号距离函数; 并行计算; 降维法; 抛物线下界法; 水平集

Parallel computing of signed distance function in level set based on dimension reduction

Jiang Shaofeng¹, Yang Suhua², Chen Zhen¹, Zhang Congxuan¹, Zhou Xuxin¹

1. School of Measuring and Optical Engineering, Nanchang Hangkong University, Nanchang 330063, China;

2. School of Information Engineering, Nanchang Hangkong University, Nanchang 330063, China

Supported by: National Natural Science Foundation of China (61162023)

Abstract

Objective Signed distance functions are the nearest distances between pixels and points on the closed curve in an image, with a negative sign in the curve and a positive sign outside the curve. The signed distance function has important applications in image processing, such as level set-based segmentation, 3D visual feature extraction, and pattern recognition in computer vision. The computational complexity of the signed distance function is O(N×M), where N is the number of pixels in an image, and M is the number of points on a closed curve. The high computational complexity of the signed distance function directly affects the computational efficiency of image processing with the increase in image resolution. For real-time processing of an image with high resolution, an improved real-time computing method for the signed distance function based on the dimension reduction method was proposed to improve the computational efficiency. Method Dimension reduction method transforms the 2D signed distance function into two independent 1D signed distance functions for each row (or column) of the image and uses lower parabola envelope-based method for calculating the 1D distance. The low-er parabola envelope-based method sequentially computes the lower envelope of the first q parabolas, where the parabolas are ordered according to the horizontal locations of their vertices. The computational complexity of the dimension reduction method is O(2N) and is one of the fastest methods for calculating the signed distance function. This paper first proposes a parallel dimension reduction method according to the computational independence of the signed distance function among the rows (or columns) in an image to reduce the computational time of the dimension reduction method. The parallel dimension reduction method calculates the signed distance functions of the different rows (or columns) in an image simultaneously by allowing each thread to correspond to a row (or column) in the image. Thus, the computational complexity of the proposed parallel dimension reduction method is reduced to O(2W+2H), where W and H are the width and height of the image, respectively. Second, this paper proposes an improved parallel dimension reduction method by running the lower parabola envelope-based method in a parallel manner to improve the computational efficiency further. The improved parallel dimension reduction method uses multi-threads in calculating the lower parabola envelope in different segments to perform the dimension reduction method by finding the intersection points between two neighboring parabolas in a segment simultaneously. All parallel processing steps were completed on CUDA platform for general parallel computing on GPU. The first step is calculating the sign by assigning H threads, and each thread should correspond to a row in the image. The second step is calculating the 1D distance by assigning W×H threads. Each thread should correspond to a pixel in the image and should scan from left to right of the image to touch the closed curve and set the scanning distance as the 1D distance of each pixel. The last step is calculating the 2D distance by assigning W×H threads. Each thread should correspond to a pixel in the image and should scan from top to bottom of the image to obtain the final distance using the proposed parallel lower parabola envelope-based method. The entire computational complexity of distance in this method is O(2W+kS), where k is the iterative times, and S is the length of the segment. Result Nine images with different image sizes (256×256, 1 280×1 280, and 2 560×2 560) and curve shapes were tested in our experiments. The computational time of three generating methods for signed distance function (the regular serial, the proposed parallel, and the improved parallel dimension reduction methods) was compared with the case in which the maximal error was below 1. The computational time of the parallel method was less than 0.06 s for all testing images and more than 10 times faster than that of the regular serial dimension reduction method. The computational time of the improved parallel method was less than 0.03 s for all testing images and approximately 20 times faster than that of the regular serial dimension reduction method. Conclusion The proposed parallel method for the signed distance function can generate the signed distance in tens of milliseconds. Thus, the proposed parallel method is sufficiently fast for real-time image processing, especially for high-resolution images.

Key words

signed distance function; parallel computing; dimension reduction method; lower parabola envelope based method; level set

0 引言

图像分割是图像处理领域中的一个关键技术，主要用于目标识别和理解，在基于图像的人工智能应用领域中起到非常重要的作用。Osher和Sethian^[1-2]提出的基于水平集的分割方法，由于能够很好地处理拓扑结构变化问题，近年来一直是图像分割领域中的研究热点。一般在水平集初始化及更新过程中，都需要计算符号距离函数(SDF)。所谓符号距离函数是指图像中的点到封闭曲线上点的最近距离。符号距离函数不仅用于水平集分割还广泛应用于计算机视觉中的3维视觉特征提取^[3]，模式识别^[4]等领域。符号距离函数的计算量较大(尤其对于高分辨率图像)，时间复杂度为$ {\rm{O}}\left( {N \times M} \right)$，$N $为图像点数，$M $为闭合曲线点数，故快速准确地计算符号距离函数对提高水平集方法的效率和稳定性至关重要。

光栅扫描法^[5]是早期的一种快速计算符号距离函数的方法，该方法用已经计算得到的距离去更新其特定邻域点的距离值。当采用先进先出的队列方式来计算时，理论上其计算复杂度为${\rm{O}}\left( {N \times L} \right) $，$L $为邻域点数, 但先进先出的队列不适合用并行算法实现。如果采用简单的像素点遍历方式来实现，一般需通过2次以上完整图像遍历才能完成所有像素点的符号距离计算，其算法复杂程度为${\rm{O}}\left( {K \times N \times L} \right) $，其中$K $为遍历图像的次数。由于图像中的每个点的计算是独立的，这种方法方便用并行方法来处理，并行计算的计算复杂度为$ {\rm{O}}\left( {K \times L} \right)$。光栅扫描法的主要缺点是当采用欧氏距离作为距离测度时，计算存在误差。为此从20世纪90年代开始，研究人员提出了源点扫描法^[6-8]，快速步进法^[9]和降维法^[10-11]。源点扫描法本质上也是光栅扫描法，其确认与每个网格点同处一条特性线(轮廓线的法线)上的轮廓线点(源点)，基于源点来计算符号距离，其计算复杂度为${\rm{O}}\left( {K \times N \times L} \right) $。快速步进法以封闭曲线为起点，从法线方向进行放射，采用快速匹配法计算符号距离，该方法的计算复杂度为${\rm{O}}\left( {N\ln M} \right) $。降维法将2维或更高维符号距离计算转换为1维符号距离计算，并采用抛物线下界法快速计算1维符号距离，其计算复杂度为${\rm{O}}\left( {2N} \right) $。张博^[12]提出了一种基于同心圆扩散的符号距离函数的生成算法，每个点以同心圆不断进行扩散，与封闭曲线相交得到距离值，该方法计算复杂度为${\rm{O}}\left( {T \times N} \right) $，$ T$为每个点平均扩散的次数。张博^[13]还提出了一种基于曲线推进的符号距离函数生成方法，该方法以距离1为步长推进边界，然后进行邻域最近点扫描得到符号距离，显然该方法也属于光栅扫描法，计算复杂度为${\rm{O}}\left( {N \times L} \right)$。

在降维法中图像每行和每列的符号距离计算是独立的，故本文提出了并行的降维法计算符号距离，计算复杂度为$ {\rm{O}}\left( {2W + 2H} \right)$，其中$W $为图像宽度，$H $为图像高度。该并行方法中采用的抛物线下界法仍然是串行的，在处理高分辨率图像时，计算效率不高。为进一步提高计算速度，本文提出了抛物线下界法的并行计算方法。2003年以来，GPU的运算能力就远远超过了CPU。由于在并行计算方面的优越性，除了进行传统的图形图像显示外，GPU还被用来进行如图像分割、配准等通用计算^[14-15]。NVIDIA公司于2007年在C语言的基础上正式发布一种新的统一计算架构CUDA, 稍微对C扩展即可编程进行并行计算。本文即是在CUDA平台上采用NVIDIA公司的GPU完成所有并行计算。

1 降维法计算符号距离函数

1.1 符号距离函数

符号距离函数定义为

$ u\left( {x, y} \right) = \left\{ \begin{array}{l} - d\left[{\left( {x, y} \right), C} \right]\;\;\;\;在曲线C内\\ 0\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;在曲线C上\\ d\left[{\left( {x, y} \right), C} \right]\;\;\;\;\;\;\;在曲线C外 \end{array} \right. $

(1)

式中，$d\left[{\left( {x, y} \right), C} \right] $表示点${\left( {x, y} \right)} $与闭合曲线$ C$之间的最近距离。构造符号距离函数一般分两步进行，首先构造符号表，即判断图像上每个网格点${\left( {x, y} \right)} $位于闭合曲线的内部还是外部，然后计算每个网格点的距离函数值，最后将两者相乘就得到每个网格点符号距离函数。

1.2 符号计算

采用张博等人^[12]提出的扫描方法构造符号表，具体步骤如下：

1) 构建曲线点形状，共分为图 1中的4种类型；

图 1 四种曲线点形状类型

Fig. 1 Four kinds of curve shape

2) 从图像中的每一行的第一个像素点(可以认为该点在曲线之外)沿该行从左向右进行扫描时, 如果遇到上下型、下上型两类边界时, 那么在该点处发生内外点的转换, 而遇到下下型、上上型两类边界时, 则不发生转换。当所有行都扫描完毕，内外点计算完成。

该方法中步骤1)可以用并行方法来完成，计算复杂度为O(1)，步骤2)中每行扫描也可以并行计算完成，复杂度为${\rm{O}}\left( W \right) $，$W $为图像宽度，故总体计算复杂度为${\rm{O}}\left( {1 + W} \right) $。

1.3 降维法

由式(1)定义的符号距离函数可以转变成坐标网格的优化问题^[10]，即

$ \begin{array}{l} \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;{D_f}\left( {x, y} \right) = \\ \mathop {\min }\limits_{x', y'} \left( {{{\left( {x-x'} \right)}^2} + {{\left( {y-y'} \right)}^2} + f\left( {x', y'} \right)} \right) \end{array} $

(2)

即对每个坐标点，在曲线上找到相应的${x', y'} $，使得上述距离值最小，实际上该优化问题用作求符号距离函数时$f\left( {x', y'} \right) = 0 $。降维法将2维符号距离计算转换为两次1维符号距离计算，即

$ \begin{array}{l} {D_f}\left( {x, y} \right) = \mathop {\min }\limits_{y'} \left( {{{\left( {y-y'} \right)}^2} + \mathop {\min }\limits_{x'} \left( {{{\left( {x-x'} \right)}^2} + } \right.} \right.\\ \left. {\left. {\;\;\;f\left( {x', y'} \right)} \right)} \right) = \mathop {\min }\limits_{y'} \left( {{{\left( {y-y'} \right)}^2} + {D_{f|y'}}\left( x \right)} \right) \end{array} $

(3)

$ {D_{f|y'}}\left( x \right) = \mathop {\min }\limits_{x'} \left( {{{\left( {x-x'} \right)}^2} + f\left( {x', y'} \right)} \right) $

(4)

式中，${D_{f|y'}}\left( x \right) $为沿X方向1维符号距离计算结果(如图 2(b)所示)，再沿Y方向做一次1维符号距离计算即得到${D_f}\left( {x, y} \right) $(如图 2(c)所示)。每次一维符号距离计算可等效为求一系列图像点对应的抛物线组的下边界得到(见图 3)。

图 2 降维法计算示意图

Fig. 2 Illustration of the dimension reduction method

((a) input closed curve; (b) transform of the input along each row; (c) final distance transform obtained by transforming the intermediate result along each column

图 3 抛物线下界法示意图

Fig. 3 The distance transform as the lower envelope of five parabolas

从图 3中可以看出, 抛物线0、2，4构成这组抛物线的下界，抛物线1和3被删除，而这个下界在每个点上的长度(加粗线段)就是该点的距离。抛物线的下界可以由求抛物线的交点得到。假设$r $，$q $，$z $为3个由左向右的坐标点，坐标点对应的抛物线分别为${\left( {x-r} \right)^2} + f\left( r \right) $，${\left( {x-q} \right)^2} + f\left( q \right) $，${\left( {x-z} \right)^2} + f\left( z \right) $则相邻抛物线的交点为

$ \begin{array}{l} {s_1} = \frac{{\left( {f\left( r \right) + {r^2}} \right)-\left( {f\left( q \right) + {q^2}} \right)}}{{2r-2q}}\\ {s_2} = \frac{{\left( {f\left( q \right) + {q^2}} \right)-\left( {f\left( z \right) + {z^2}} \right)}}{{2q - 2z}} \end{array} $

(5)

若交点$ {S_2}$在${S_1} $的左边则抛物线$q $被删除。1维抛物线下界法的流程图如图 4所示。

图 4 串行抛物线下界法流程图

Fig. 4 The flowchart of serial lower parabola envelope based method

2 并行降维法

显然在降维法计算中，图像的每行和每列的计算是独立的，可以并行完成，假设图像的图像宽度为$W $，高度为$H $，本文提出的并行降维法计算方法如下：

1) 计算${D_{f|{y_n}}}\left( {{x_m}} \right) $($ {x_m}$表示第$m $列，${y_n} $表示第$ n$行，$m = 1, 2, \cdots, W $, $n = 1, 2, \cdots, H $)。从左往右扫描，每个扫描点对应的抛物线为${\left( {x-{x_m}} \right)^2} $。如图 5所示，图中共有两个封闭曲线，在某行扫描时，共有4个边界点，相邻边界点的3个中点正好是对应抛物线的交点。相邻抛物线以交点为界即构成抛物线下界。因此计算${D_{f|{y_n}}}\left( {{x_m}} \right) $比较简单，只要计算相邻边界点的中点即可。具体并行计算过程为：分配$ H$个线程，每个线程对应图像中的一行，每行并行从左向右进行扫描，得到扫描线上的各个边界点和相邻边界点的中点，即可确定计算距离的抛物线下界；再依次扫描各点在抛物线下界的值即得到${D_{f|{y_n}}}\left( {{x_m}} \right) $，该步骤计算复杂度为${\rm{O}}\left( {2W} \right) $。

图 5 X方向扫描抛物线及下界分布图，5，10，13，18分别为边界点X坐标

Fig. 5 Scan parabolas and lower envelope along X direction, 5, 10, 13, 18 are the X coordinates of the closed curve

2) 计算${D_f}\left( {x, y} \right) $，分配$W $个线程，每个线程对应图像中的一列，每列并行从下向上进行扫描，每个扫描点对应的抛物线为${\left( {y-{y_n}} \right)^2} + {D_{f|{y_n}}}\left( {{x_m}} \right) $，分别建立两个队列，其中一个放置抛物线，一个放置抛物线交点，根据图 4所示流程依次进行抛物线删除判断，未删除的抛物线及当前交点放入队列，扫描完成后，队列中的抛物线和交点即构成抛物线下界；再依次扫描各点在抛物线下界的值即得到${D_f}\left( {x, y} \right) $，该步骤计算复杂度为$ {\rm{O}}\left( {2H} \right)$。

该并行算法计算复杂度为${\rm{O}}\left( {2W + 2H} \right) $，由于该并行方法中所用抛物线下界法仍然是串行的，在处理高分辨率图像时，计算效率不高。为进一步提高计算速度，本文对上述并行降维法进行了改进，提出了抛物线下界法的并行计算方法。

3 改进并行降维法

3.1 并行抛物线下界法

在抛物线下界方法中，最关键的步骤就是求交点和删除抛物线，在串行算法中，未删除抛物线按顺序进入队列，被删除的抛物线不进入队列，1维抛物线下界法的计算复杂度为${\rm{O}}\left( P \right) $，$P $为抛物线个数。为此本文将此过程改成并行完成，相邻两个抛物线两两同时求交点，将需删除的抛物线仅作删除标记，不采用队列记录抛物线，这样避免了大量的队列操作。但是仅仅通过一次两两计算，并不能删除所有需删除的抛物线，为此当有抛物线被删除时，向右或向左引入下一个点继续做两两计算，直到找不到能删除的抛物线为止。如图 6所示，具体计算步骤如下：

图 6 并行抛物线下界法示意图

Fig. 6 Illustration of the parallel lower parabola envelope based method

1) 向右计算阶段，并行计算每个点对应的抛物线$ {P_2}$和右边相邻点对应抛物线${P_3} $的交点，如果交点在前一个交点(${P_1} $和${P_2} $的交点)左边，则删除当前的抛物线${P_2} $，并做删除标志(空心点)，删除掉的抛物线不再参与后续计算，继续取后面一个点做为当前抛物线与后面的抛物线求交点，直到没有抛物线被删除(实心点)或者后一个抛物线是被删除的(方点)为止，这个点记为${P_6} $。

2) 向左计算阶段，并行计算每个点对应的抛物线${P_5} $和左边相邻点(且没有删除标志)对应抛物线${P_4} $的交点，如果交点在前一个交点(${P_5} $和${P_6} $的交点)右边，则删除当前的抛物线${P_5} $，继续取前面一个没有删除标志的抛物线为当前抛物线求交点，直到没有抛物线被删除为止。

该方法相当于对抛物线的下界计算进行了分段处理，图 6中向左和向右共有$S $个点，则段的长度为$S $，其串行规模为$S $，并行方法的计算复杂度为O($S $)。由于进行了分段操作，故在分段边界上可能存在不连续情况，故本计算需迭代进行，假设迭代次数为$k $(本文$k $最小为1，最大为4)则总的计算复杂度为$ {\rm{O}}\left( {kS} \right)$，一般情况下$kS \ll P $，故该并行方法比串行方法要快。下界得到后再依次扫描各点在抛物线下界的值即得到各点距离值，总的计算复杂度为$ {\rm{O}}\left( {kS + P} \right)$，详细的计算流程图见图 7。

图 7 并行抛物线下界法流程图

Fig. 7 The flowchart of the parallel lower parabola envelope based method

3.2 改进并行降维法实现

在CUDA平台上采用NVIDIA 950显卡进行通用GPU并行运算，为完成本文改进并行降维法，需将图像上每个像素点对应一个线程。由于像素点较多，需用多个线程块才能提供如此多的线程数。为此将整个图像作为一个网格，每个网格区域对应一个线程块，一个线程块中分配多个线程。如图 8，以一个大小为256×256像素的图像为例，在一个网格中分配32×8的线程块，也就是将图像分成32×8个块，每个线程块分配8×32个线程，这样一个线程就可以对应一个像素点的距离计算。具体计算步骤如下：

图 8 并行计算线程分配图

Fig. 8 The assignment of threads in parallel computing

1) 计算符号。分配$H $个线程，每个线程对应图像中的一行，每行并行从左向右进行扫描，利用曲线形状计算符号值，该步骤计算复杂度为${\rm{O}}\left( W \right) $。

2) 计算${D_{f|{y_n}}}\left( {{x_m}} \right) $。由于${D_{f|{y_n}}}\left( {{x_m}} \right) $计算比较简单，当每个像素点对应一个线程时，直接计算封闭曲线的点在X方向上到图像上每个点的最小距离即可得到${D_{f|{y_n}}}\left( {{x_m}} \right) $。具体步骤为：分配$W \times H $个线程，每个线程对应图像中的一个点，对于每个封闭曲线上的点(源点)以步长1分别向左和向右延伸到图像边界，经过的点的${D_{f|{y_n}}}\left( {{x_m}} \right) $即为延伸的步长数，记录其相应的源点；在进行延伸时，如果经过的点已有源点存在，则其存在多个源点，那么取离它最近的源点的距离作为${D_{f|{y_n}}}\left( {{x_m}} \right) $，该步骤计算复杂度为${\rm{O}}\left( W \right) $。

3) 计算${D_f}\left( {x, y} \right) $。分配$W \times H $个线程，每个线程对应图像中的一个点。根据流程图 7计算${D_f}\left( {x, y} \right) $，该过程迭代进行，直到计算误差小于1。该步骤计算复杂度最大为${\rm{O}}\left( {kS + W} \right) $，$k $为迭代次数，根据实验，$k $最小为1，最大为4。

4 实验及结果

本文的实验平台为联想台式计算机，Windows7 64位操作系统，intel i7-4790 CPU，主频3.6 GHz，8 GB内存，Nvidia GTX950显卡，算法均在CUDA3.2平台C语言实现。图 9(a)中的实验图像来自文献[12]，大小为256×256像素，该图像具有较为复杂的拓扑结构，适合用来测试符号距离函数的计算。图 9(b)是采用本文并行方法计算得到的符号图，图 9(c)为距离图，与真实值误差为0，计算时间只有1.1 ms。可见本文所提的并行符号距离函数计算方法既快又准确。

图 9 原始图像与符号距离函数

Fig. 9 The input image and signed distance function

((a) input image^[12]; (b) image of sign; (c) image of distance)

为比较计算效率，分别采用了256×256像素，1 280×1 280像素，2 560×2 560像素3种大小图像，每种大小图像采用3种不同大小的封闭曲线进行实验。在计算最大距离误差小于1的情况下，对降维法(串行)、降维法(并行)和改进降维法(并行)进行了比较，计算结果如表 1所示。由于采用正方形图像进行测试，故图像高度$H = W $，则根据前面的分析，降维法(串行)距离计算复杂度为${\rm{O}}\left( {2N} \right) $，$N = W \times W $，降维法(并行)距离计算复杂度为${\rm{O}}\left( {4W} \right) $，改进降维法(并行)距离计算复杂度为${\rm{O}}\left( {kS + 2W} \right) $，$ k$为循环次数，$ S$由程序自动判定，只要满足$kS < 2W $，改进并行算法就能提高计算效率，这里的复杂度均不包含符号计算复杂度。

表 1 计算效率比较
Table 1 The comparison of computing efficiency

下载CSV

算法	图像宽度/像素	曲线点数	循环次数	计算用时/ms	提速比	平均提速
降维法串	256 1 280 2 560	272/468/1 272 140/2 202/2 952 718/4 438/6 280	1 1 1	17/18/20 78/94/109 250/343/405	1	1

降维法并	256 1 280 2 560	272/468/1 272 140/2 202/2 952 718/4 438/6 280	1 1 1	1.1/1.2/1.6 4.5/10/8.4 12/41/54	15/15/12 17/9/13 20/8/8	13

改进降维法并	256 1 280 2 560	272/468/1 272 140/2 202/2 952 718/4 438/ 6 280	3 1/1/2 2/4/4	0.9/0.9/1.1 3.4/5.6/7.0 11/28/30	19/20/18 23/17/16 22/12/14	18

从表 1中可以看到，采用并行算法后，降维法(并行)计算效率平均提高了13倍左右，理论上提高倍数为$W/4 $，但由于并行资源有限，并不能达到该值。改进降维法(并行)计算效率平均提高了18倍左右，且当图像较大时，也就是对应$kS \ll 2W$的情况，并行的改进降维法比并行常规降维法快接近1倍。可见本文并行方法在符号距离函数的计算效率方面有了很大的提高。不同大小图像计算结果表明，图像越大，计算时间越长，不同曲线点数图像计算结果表明，曲线点数越多，计算时间越长，这也符合符号距离函数计算的基本规律。

5 结论

本文提出了并行的降维法计算符号距离，在处理高分辨率图像时，其计算时间仅有数十毫秒，可见本文实现了符号距离函数的高效计算，假设在用水平集进行分割时须进行10次符号距离函数计算，那么本方法为此耗时少于1 s，而常规方法大大高于1 s，故本方法适用于2维、3维甚至4维图像的实时处理。虽然本文方法取得了很好的计算效率，但是需要少量迭代计算，且迭代次数与图像大小和曲线点数有关，这样对算法的适用性有一定影响。为此下一步工作将进一步研究如何将迭代次数减少到2甚至1即能满足精度要求，从而进一步提高计算效率和算法适应性。

参考文献

[1] Sethian J A. Level Set Methods and Fast Marching Methods:Evolving Interfaces in Computational Geometry, Fluid Mechanics, Computer Vision, and Materials Science[M]. London: Cambridge University Press, 1999.

[2] Osher S, Sethian J A. Fronts propagating with curvature-dependent speed:algorithms based on Hamilton-Jacobi formulation[J]. Journal of Computational Physics, 1988, 79(1): 12–49. [DOI:10.1016/0021-9991(88)90002-2]

[3] Sharma O, Agarwal N. Signed distance based 3D surface reconstruction from unorganized planar cross-sections[J]. Computers & Graphics, 2016, 62: 67–76. [DOI:10.1016/j.cag.2016.12.002]

[4] Guo D, Li C Q, Wu L, et al. Improved marching tetrahedra algorithm based on hierarchical signed distance field and multi-scale depth map fusion for 3D reconstruction[J]. Journal of Visual Communication and Image Representation, 2017, 48: 491–501. [DOI:10.1016/j.jvcir.2016.12.016]

[5] Fabbri R, Costa L D F, Torelli J C, et al. 2D Euclidean distance transform algorithms:a comparative survey[J]. ACM Computing Surveys, 2008, 40(1): #2. [DOI:10.1145/1322432.1322434]

[6] Tsai Y H R. Rapid and accurate computation of the distance function using grids[J]. Journal of Computational Physics, 2002, 178(1): 175–195. [DOI:10.1006/jcph.2002.7028]

[7] Li J, Yang X, Shi P F. A fast level set approach to image segmentation based on mumford-shah model[J]. Chinese Journal of Computers, 2002, 25(11): 1175–1183. [李俊, 杨新, 施鹏飞. 基于Mumford-Shah模型的快速水平集图像分割方法[J]. 计算机学报, 2002, 25(11): 1175–1183. ] [DOI:10.3321/j.issn:0254-4164.2002.11.008]

[8] Dong J W, Yang H Y, Zhang B. Fast generation of signed distance function in level set method[J]. Computer Engineering and Application, 2009, 45(34): 180–182. [董吉文, 杨海英, 张冰. 水平集方法中符号距离函数的快速生成[J]. 计算机工程与应用, 2009, 45(34): 180–182. ] [DOI:10.3778/j.issn.1002-8331.2009.34.056]

[9] Sethian J A. A fast marching level set method for monotonically advancing fronts[J]. Proceedings of the National Academy of Sciences of the United States of America, 1996, 93(4): 1591–1595. [DOI:10.1073/pnas.93.4.1591]

[10] Felzenszwalb P F, Huttenlocher D P. Distance transforms of sampled functions[J]. Theory of Computing, 2012, 8(19): 415–428. [DOI:10.4086/toc.2012.v008a019]

[11] Mishchenko Y. A fast algorithm for computation of discrete Euclidean distance transform in three or more dimensions on vector processing architectures[J]. Signal, Image and Video Processing, 2015, 9(1): 19–27. [DOI:10.1007/s11760-012-0419-9]

[12] Zhang B, Su Y L, Zhang S L. Method for generating the signed distance function based on curve marching[J]. Journal of Northwest University:Natural Science Edition, 2007, 37(5): 697–700. [张博, 苏永利, 张书玲. 基于曲线推进的符号距离函数生成方法[J]. 西北大学学报:自然科学版, 2007, 37(5): 697–700. ] [DOI:10.16152/j.cnki.xdxbzr.2007.05.033]

[13] Zhang B, Su Y L. A fast method for signed distance function generation[J]. Computer Applications and Software, 2008, 25(6): 102–103, 112. [张博, 苏永利. 一种快速的符号距离函数的生成方法[J]. 计算机应用与软件, 2008, 25(6): 102–103, 112. ] [DOI:10.3969/j.issn.1000-386X.2008.06.041]

[14] Feng C L, Zhao D Z, Huang M. Image segmentation using CUDA accelerated non-local means denoising and bias correction embedded fuzzy c-means (BCEFCM)[J]. Signal Processing, 2016, 122: 164–189. [DOI:10.1016/j.sigpro.2015.12.007]

[15] Fluck O, Vetter C, Wein W, et al. A survey of medical image registration on graphics hardware[J]. Computer Methods and Programs in Biomedicine, 2011, 104(3): e45–e57. [DOI:10.1016/j.cmpb.2010.10.009]