
Published: 2018-06-16
DOI: 10.11834/jig.170492
2018 | Volume 23 | Number 6




Image Analysis and Recognition










Image saliency detection based on background-absorbing Markov chain
Jiang Fengling1,3, Zhang Haitao4, Yang Jing1,2, Kong Bin1,2
1. Institute of Intelligent Machines, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China;
2. University of Science and Technology of China, Hefei 230026, China;
3. School of Computer Science and Technology, Hefei Normal University, Hefei 230061, China;
4. School of Computer and Information, Hefei University of Technology, Hefei 230009, China
Supported by: National Natural Science Foundation of China (913203002)

Abstract

Objective The method of saliency detection via absorbing Markov chain uses the simple linear iterative clustering (SLIC) method to obtain superpixels as graph nodes. A k-regular graph is then constructed, and each edge weight is computed from the difference between the mean CIELAB colors of the two nodes. The superpixels on the four image boundaries are duplicated as absorbing nodes, and the absorbed time of each transient node on the Markov chain is calculated. If the absorbed time is small, the transient node is similar to the absorbing nodes and is likely a background node; conversely, if the absorbed time is large, the transient node is dissimilar to the absorbing nodes and is likely a salient node. In practice, the number of superpixels produced by the SLIC method influences the resulting saliency maps: if the superpixels are too large, detailed information is lost, whereas if they are too small, the global cue is missed. Moreover, salient objects often occupy one or two boundaries of the image, especially in portraits and sculptures, so duplicating all four boundaries as absorbing nodes degrades the final results. Considering these drawbacks, we propose an improved method that uses background nodes as absorbing nodes on the absorbing Markov chain and applies multilayer image fusion to restrain the influence of the uncertain number of SLIC superpixels. Method First, we perform boundary selection. We separately obtain the nodes of each of the four image boundaries by the SLIC method, duplicate them as absorbing nodes, and obtain four saliency maps. We then calculate the pairwise differences between these maps, identify the map that differs most from the other three, remove the corresponding boundary, and duplicate the nodes of the remaining three boundaries as absorbing nodes.
The initial saliency map is then obtained by calculating the absorbed times of the transient nodes on the absorbing Markov chain. Second, to further optimize the algorithm, the number of absorbing nodes is increased: beyond the boundary nodes, we add nodes that are probably background, selected from the initial saliency map by a threshold. If the initial saliency value of a node is lower than the threshold, the node is considered a background node. The selected boundary and background nodes are duplicated as absorbing nodes, from which the absorbed times of the transient nodes are recalculated; pixel-level saliency values are then obtained from the saliency values of the superpixels. Finally, we fuse the multiple pixel-level saliency maps, obtained with different numbers of SLIC superpixels, by taking their average as the final result. Result We evaluate the effectiveness of our method on three benchmark datasets: ASD, DUT-OMRON, and SED. We compare our method with 12 recent state-of-the-art saliency detection methods, namely, MC, CA, FT, SEG, BM, SWD, SF, GCHC, LMLC, PCA, MS, and MST. ASD contains 1 000 simple images; DUT-OMRON contains 5 168 complex images; SED includes 200 images, of which 100 have one salient object and the other 100 have two. The experimental results show that the improved algorithm is efficient and outperforms the 12 state-of-the-art methods in precision-recall curves (PR curves) and F-measure. Precision is the ratio of correctly detected salient pixels to all pixels predicted as salient; recall is the ratio of correctly detected salient pixels to the number of ground-truth salient pixels; the F-measure is an overall performance measure. Visual comparative examples selected from the three datasets are also shown.
The F-measure values on the ASD, DUT-OMRON, and SED benchmarks are 0.903, 0.5447, and 0.7756, respectively, higher than those of the other 12 methods. Conclusion Images contain complex backgrounds together with regions that attract human attention, and visual saliency detection extracts these regions of interest by simulating the human visual system computationally. Given the drawbacks of duplicating all four image boundaries as absorbing nodes and the uncertain number of SLIC superpixels, we propose an improved model based on background absorbing nodes and image fusion. Experiments show that the method is efficient and applicable to bottom-up image saliency detection, especially for images such as portraits or sculptures, and that it can be applied in many fields, such as image retrieval, object recognition, image segmentation, and image compression.

Key words

object detection; SLIC method; saliency detection; Markov chain; absorbing node; multilayer image

0 Introduction

In recent years, with the development of the Internet, image resources have become increasingly abundant. Images contain complex backgrounds together with regions that attract human visual attention; image saliency detection extracts these regions of interest by simulating the human visual system computationally. Salient object detection has become an important research direction in computer vision and can be applied in many fields, such as image retrieval [1-2], object recognition [3], image segmentation [4], and image compression [5]. Over decades of research, scholars have divided salient object detection methods into two categories [6]: bottom-up methods [7-9] and top-down methods [10]. Top-down methods are task-driven and require prior knowledge. This paper focuses on bottom-up methods, which require no prior knowledge and can locate objects quickly and accurately, selecting the regions of interest as salient objects.

Many excellent visual saliency detection algorithms have been proposed. In 1998, in seminal work based on the Koch biologically inspired framework, Itti et al. [11] extracted multi-scale intensity, color, and orientation features, constructed feature maps with center-surround difference operations, and integrated the features to highlight salient regions. Harel et al. [12] proposed a graph-based visual saliency model in 2006, and Hou et al. [13] proposed a spectral residual model (the SR method) for computing saliency in 2007. In 2013, Jiang et al. [14] proposed a method based on absorbing Markov chains (the MC method), which models random walks of nodes on a Markov chain and derives superpixel saliency values from the absorbed times from transient nodes to absorbing nodes; its results are shown in Fig. 1(c). Qin et al. [15] proposed a method based on cellular automata in 2015. In 2016, Tu et al. [16] introduced a minimum spanning tree to obtain a real-time saliency detection method.

Fig. 1 Results of our improved method ((a) input image; (b) ground truth; (c) results of MC method; (d) results of our saliency map)

The MC method uses the simple linear iterative clustering (SLIC) segmentation technique [17] to obtain superpixels as the nodes of the Markov chain. The paper also points out that the superpixels on the four image boundaries generally contain no salient object, so they can be duplicated as virtual absorbing nodes, from which the absorbed times of the transient nodes are computed. However, owing to the nature of natural images and photographers' composition habits, in a considerable portion of images the salient object occupies one or two boundaries, especially for portraits and sculptures; simply duplicating all four boundaries as virtual absorbing nodes leaves room for improvement [18]. Moreover, the quality of the SLIC segmentation directly affects the subsequent results, which can be improved with multi-scale approaches [19-20]. Based on this analysis, we propose a saliency computation model based on a background-absorbing Markov chain; its results are shown in Fig. 1(d). Fig. 1(a)(b) show the input image and the ground truth, respectively.

1 Saliency detection via absorbing Markov chains

In the MC method [14], the input image is first over-segmented into $m$ superpixels with the SLIC algorithm, and each superpixel serves as a node. The superpixels on the four boundaries (top, bottom, left, and right; assume there are $k$ of them) are generally assumed to contain no salient object, so they are taken as the nodes to be duplicated; the $k$ duplicated virtual nodes serve as the absorbing nodes of the absorbing Markov chain. A graph model $G(V, E)$ is built with the $n$ nodes ($k + m = n$), where $V$ is the node set and $E$ is the edge set. Edges between nodes follow these rules [14]:

1) Among the $m$ superpixel nodes inside the image: nodes of adjacent superpixels are linked; nodes of superpixels sharing a common neighbor are also linked; and the nodes of all boundary superpixels are linked to one another.

2) Among the $k$ duplicated virtual nodes: the duplicated nodes are not linked to one another.

3) Between the $m$ superpixel nodes inside the image and the $k$ duplicated virtual nodes: if node $i$ corresponds to a boundary superpixel inside the image and is duplicated as a new node $i'$, then every node linked to $i$ is also linked to $i'$, and $i$ and $i'$ are linked.
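As a minimal sketch of this construction (in Python with NumPy; `W`, `boundary`, and the function name are illustrative, not from the paper), the duplicated copies can be given incoming edges only, so that later row normalization directly yields the $[Q\ R; {\bf 0}\ I]$ block structure of Eq. (3):

```python
import numpy as np

def affinity_with_absorbing(W, boundary):
    """Sketch of the graph construction. W is the m x m weight matrix among
    superpixels (w_ij > 0 iff linked, encoding rule 1); boundary lists the
    superpixels to duplicate. Copies receive incoming edges only (rule 3)
    and have no edges among themselves (rule 2)."""
    m, k = W.shape[0], len(boundary)
    A = np.zeros((m + k, m + k))
    A[:m, :m] = W                      # links among the m transient nodes
    for a, i in enumerate(boundary):
        A[:m, m + a] = W[:, i]         # neighbors of i also link to its copy i'
        A[i, m + a] = 1.0              # i links to its own copy
    np.fill_diagonal(A, 1.0)           # a_ii = 1, as in Eq. (1)
    return A

# toy 4-node chain graph; duplicate boundary superpixels 0 and 3
W = np.zeros((4, 4))
W[0, 1] = W[1, 0] = 0.5
W[1, 2] = W[2, 1] = 0.4
W[2, 3] = W[3, 2] = 0.3
A = affinity_with_absorbing(W, [0, 3])
P = A / A.sum(axis=1, keepdims=True)   # rows of the copies reduce to [0 I]
```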

According to the above rules, the affinity matrix $A$ of the nodes in graph $G$ is

$ {a_{ij}} = \left\{ {\begin{array}{*{20}{l}} {{w_{ij}}}&{j \in N\left( i \right), 1 \le i \le n}\\ 1&{i = j}\\ 0&{{\rm{otherwise}}} \end{array}} \right. $ (1)

where $N(i)$ denotes the set of nodes linked to node $i$. The weight $w_{ij}$ between linked nodes $i$ and $j$ is defined as

$ {w_{ij}} = {{\rm{e}}^{ - \frac{{\left\| {{x_i} - {x_j}} \right\|}}{{{\sigma ^2}}}}} $ (2)

where $x_i$ and $x_j$ are the feature vectors of the two nodes, i.e., the mean colors of the corresponding superpixels in the CIELAB color space, and $\sigma$ is a constant.
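For concreteness, Eq. (2) is a one-line computation (Python/NumPy sketch; the $\sigma$ value and function name are illustrative): the negative exponent makes similar colors give weights near 1 and dissimilar colors give weights near 0.

```python
import numpy as np

def edge_weight(x_i, x_j, sigma=0.25):
    """Eq. (2): weight between linked nodes from their mean CIELAB colors."""
    return np.exp(-np.linalg.norm(np.asarray(x_i) - np.asarray(x_j)) / sigma**2)

# identical colors -> weight 1; a large color difference -> weight near 0
w_same = edge_weight([50.0, 10.0, 10.0], [50.0, 10.0, 10.0])
w_far = edge_weight([50.0, 10.0, 10.0], [80.0, -20.0, 40.0])
```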

From the affinity matrix $A$, the degree matrix $D = {\rm{diag}}(\sum\limits_j {{a_{ij}}})$ can be computed, yielding the transition matrix $P = {D^{ - 1}} \times A$ of the nodes in graph $G$.
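The normalization $P = D^{-1}A$ amounts to dividing each row of $A$ by its row sum, e.g. (Python/NumPy sketch with a toy affinity matrix):

```python
import numpy as np

def transition_matrix(A):
    """Row-normalize an affinity matrix A into a transition matrix P = D^{-1} A."""
    d = A.sum(axis=1)        # degree of each node
    return A / d[:, None]    # each row of P sums to 1

# tiny 3-node example with a_ii = 1, as in Eq. (1)
A = np.array([[1.0, 0.5, 0.0],
              [0.5, 1.0, 0.2],
              [0.0, 0.2, 1.0]])
P = transition_matrix(A)
```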

Regarding the $m$ superpixel nodes inside the image as the transient nodes of the absorbing Markov chain and the $k$ duplicated nodes as the absorbing nodes, and gathering and rearranging the $k$ absorbing states and the $m$ transient states, the $n \times n$ state transition matrix $P$ is

$ P = \left[ {\begin{array}{*{20}{c}} Q&R\\ {\bf{0}}&I \end{array}} \right] $ (3)

where $Q \in {\left[ {0, 1} \right]_{m \times m}}$ is the transition probability matrix among the $m$ transient nodes, the elements of $R \in {\left[ {0, 1} \right]_{m \times k}}$ are the transition probabilities from transient nodes to absorbing nodes, ${\bf{0}}$ is a $k \times m$ zero matrix, and $I$ is the identity matrix over the absorbing nodes.

For an absorbing chain, the fundamental matrix is $N = {\left( {I - Q} \right)^{ - 1}}$, where $n_{ij}$ is the expected time spent in transient node $j$ starting from transient node $i$ before absorption, and $\sum\limits_j {{n_{ij}}}$ is the total absorbed time of node $i$. The vector of absorbed times of all transient nodes is therefore $y = N \times c$, where $c$ is a column vector whose elements are all 1 (written as $c$ to avoid confusion with the identity matrix $I$). By the properties of Markov chains, the more similar a node is to the absorbing nodes (the boundary superpixels), the faster it is absorbed, the shorter its absorbed time, and the smaller its value in $y$, so its superpixel is more likely background; conversely, a longer absorbed time and a larger value in $y$ indicate a likely salient object. Normalizing $y$ gives the saliency values $s$, i.e., $s\left( i \right) = \bar y\left( i \right), i = 1, 2, \ldots , m$, where $i$ ranges over all transient-state nodes in the graph and $\bar y$ is the normalized absorbed-time vector. See the MC method [14] for a more detailed derivation.
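A minimal sketch of the absorbed-time computation (Python/NumPy; the toy $Q$ below is made up for illustration): node 2 leaks little probability toward the absorbing states, so its absorbed time, and hence its normalized saliency, is the largest.

```python
import numpy as np

def absorbed_time_saliency(Q):
    """Given the m x m transient block Q of an absorbing Markov chain,
    return min-max normalized absorbed times as saliency values."""
    m = Q.shape[0]
    N = np.linalg.inv(np.eye(m) - Q)    # fundamental matrix N = (I - Q)^{-1}
    y = N @ np.ones(m)                  # absorbed time per transient node
    return (y - y.min()) / (y.max() - y.min() + 1e-12)

# toy chain: node 0 is absorbed quickly, node 2 slowly
Q = np.array([[0.1, 0.1, 0.0],
              [0.1, 0.3, 0.3],
              [0.0, 0.3, 0.6]])
s = absorbed_time_saliency(Q)           # s[2] largest, s[0] smallest
```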

2 Salient object detection via background-absorbing Markov chains

2.1 Boundary selection

In general, the four boundaries of an image are likely to be background. However, for images such as portraits and sculptures, the salient region often occupies one or two boundaries, so treating all four boundaries as background is clearly inaccurate. We therefore compare the superpixels on the four boundaries, remove the boundary with the largest difference, and take the superpixels on the remaining three boundaries as the nodes to be duplicated; the duplicated virtual nodes serve as the absorbing nodes.

First, each of the four boundaries is duplicated in turn to obtain absorbing nodes, and four separate absorbed-time computations are performed, yielding four saliency maps $ {s^{{\rm{top}}}} $, $ {s^{{\rm{down}}}} $, $ {s^{{\rm{left}}}} $, $ {s^{{\rm{right}}}} $. The pairwise differences among the four results are then computed as

$ {h^{uv}} = \sum\limits_{i = 1}^m {\left| {s_i^u - s_i^v} \right|}, \quad u, v \in \left\{ {{\rm{top, down, left, right}}} \right\} $ (4)

where top, down, left, and right denote taking the top, bottom, left, or right boundary for duplication as absorbing nodes, and $i \in \left\{ {1, 2, \ldots , m} \right\}$ indexes the superpixels inside the image. The difference matrix $H = \{ h^{uv}\}$ is then formed.

In the difference matrix $H$, the sum of row $u$ is the total difference between one saliency map and the other three. Computing $\mathop {{\rm{max}}}\limits_u \sum\limits_v {{h^{uv}}}$ identifies the boundary whose saliency map differs most; that boundary is removed, and the remaining three boundaries are selected. The process is shown in Fig. 2.

Fig. 2 Process of the improved method
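The boundary-removal step of Eq. (4) can be sketched as follows (Python/NumPy; the per-boundary map values are made up for illustration):

```python
import numpy as np

def select_boundaries(maps):
    """maps: dict {'top','down','left','right'} -> saliency vector over the
    m superpixels. Drop the boundary whose map differs most from the other
    three (Eq. 4) and return the remaining three boundary names."""
    names = list(maps)
    H = np.array([[np.abs(maps[u] - maps[v]).sum() for v in names] for u in names])
    worst = names[H.sum(axis=1).argmax()]    # largest row sum of H
    return [n for n in names if n != worst]

# toy example: the 'top' map disagrees with the other three
maps = {'top':   np.array([0.9, 0.8, 0.9]),
        'down':  np.array([0.1, 0.2, 0.1]),
        'left':  np.array([0.1, 0.1, 0.2]),
        'right': np.array([0.2, 0.1, 0.1])}
kept = select_boundaries(maps)   # ['down', 'left', 'right']
```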

2.2 Background absorbing node selection

To further optimize the algorithm, the number of absorbing nodes should be increased: besides the boundaries that are probably background, some additional likely-background nodes are duplicated as absorbing nodes, as shown in Fig. 2. First, the nodes on the three boundaries selected in Section 2.1 are taken as the nodes to be duplicated, the duplicated nodes serve as absorbing nodes, the absorbed times of all superpixels inside the image are computed, and an initial saliency map ${s^{{\rm{initial}}}}$ is obtained. The initial saliency values are then compared with a threshold $t$ (set empirically); if the initial saliency value of node $i$ is below $t$, node $i$ is considered likely background and is added to the set of nodes to be duplicated, i.e.,

$ C = \{ i|s_i^{{\rm{initial}}} \le t\} \cup \left\{ {i|i \in B} \right\} $ (5)

where $C$ is the set of all nodes to be duplicated, $B$ is the set of nodes corresponding to the superpixels on the three selected boundaries, $s_i^{{\rm{initial}}}$ is the initial saliency value of node $i$, and $t$ is the threshold controlling background absorbing node selection.
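Eq. (5) amounts to a set union (Python sketch; the saliency values and boundary indices below are made up):

```python
import numpy as np

def absorbing_node_set(s_initial, boundary_nodes, t=0.015):
    """Eq. (5): nodes whose initial saliency is at most t, plus the nodes
    on the three selected boundaries, form the set C to be duplicated."""
    low = set(np.flatnonzero(s_initial <= t))
    return low | set(boundary_nodes)

s_initial = np.array([0.50, 0.01, 0.30, 0.005, 0.80])
C = absorbing_node_set(s_initial, boundary_nodes=[0, 4], t=0.015)
# C == {0, 1, 3, 4}: nodes 1 and 3 fall below the threshold
```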

2.3 Multilayer image fusion

Superpixel segmentation greatly reduces the complexity of subsequent image processing, but different segmentations produce different results. The more superpixels are specified, the smaller each superpixel is and the higher the time complexity: image details are handled thoroughly, but regional coherence deteriorates. The fewer the superpixels, the larger each superpixel is and the lower the time complexity: details are not handled adequately, but regional coherence improves. Given this characteristic of SLIC, we segment with different numbers of superpixels and construct multiple graphs, i.e., a multilayer graph [21], and fuse the layers to reduce the error of the SLIC algorithm. First, the superpixel-level result ${s^{{\rm{middle}}, l}}$ (where $l$ denotes the $l$-th layer with $m_l$ superpixels) is assigned to every pixel within each superpixel to obtain the matrix ${F^{{\rm{middle, }}l}}$ ($W \times H$ pixels, where $W$ and $H$ are the width and height of the input image). The multilayer pixel-level results ${F^{{\rm{middle, }}l}}$ are then fused by averaging to obtain ${F^{{\rm{final}}}}$, i.e.,

$ f_{ij}^{{\rm{final}}} = \frac{{\sum\limits_{l = 1}^L {f_{ij}^{{\rm{middle}}, l}} }}{L} $ (6)

where $L$ is the number of layers, $f_{ij}^{{\rm{middle}}, l}$ is the element in row $i$, column $j$ of ${F^{{\rm{middle}}, l}}$, i.e., the saliency value of that pixel at layer $l$, and $f_{ij}^{{\rm{final}}}$ is the final saliency value of the pixel in row $i$, column $j$.
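The fusion of Eq. (6) is a per-pixel average across layers (Python/NumPy sketch with toy 2×2 maps):

```python
import numpy as np

def fuse_layers(layer_maps):
    """Eq. (6): average the L pixel-level maps F^{middle,l} into F^{final}."""
    return np.mean(np.stack(layer_maps, axis=0), axis=0)

# three layers of a toy 2x2 image
F1 = np.array([[0.2, 0.8], [0.1, 0.9]])
F2 = np.array([[0.4, 0.6], [0.3, 0.7]])
F3 = np.array([[0.3, 0.7], [0.2, 0.8]])
F_final = fuse_layers([F1, F2, F3])   # per-pixel mean of the three layers
```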

2.4 Algorithm

Based on the above analysis, the proposed algorithm is summarized as follows:

1) Input an image and the necessary parameters.

2) Segment the input image with SLIC to obtain $m$ superpixels.

3) Following the boundary selection process, use Eq. (4) to select the three required boundaries.

4) Compute over $L$ layers, where layer $l$ proceeds as follows:

(1) Obtain the set $C$ of nodes to be duplicated according to Eq. (5) (on the first pass, $C$ contains only the three-boundary node set $B$);

(2) Duplicate the nodes in $C$, construct the graph, obtain the transition matrix $P$ as in Eq. (3), and compute $N = {\left( {I - Q} \right)^{ - 1}}$, $y = N \times c$, and $s\left( i \right) = \bar y\left( i \right)$ to obtain ${s^{{\rm{initial}}}}$;

(3) Compare the saliency value of each node in ${s^{{\rm{initial}}}}$ with the threshold $t$; if $s_i^{{\rm{initial}}} \le t$, merge the node of superpixel $i$ with the three-boundary nodes for duplication, together forming the background absorbing nodes;

(4) Reconstruct the graph and repeat step (2) to obtain the superpixel-level ${s^{{\rm{middle}}}}$;

(5) Assign ${s^{{\rm{middle}}}}$ to the pixels within each superpixel to obtain the pixel-level saliency values ${F^{{\rm{middle}}}}$.

5) Fuse the per-pixel saliency values of the $L$ layers by averaging with Eq. (6) to obtain the final saliency map ${F^{{\rm{final}}}}$.

Finally, output the final pixel-level saliency map ${F^{{\rm{final}}}}$.

3 Experimental results and analysis

Experiments were conducted on three datasets, ASD, DUT-OMRON, and SED, comparing 12 existing methods: MC [14], CA [22], FT [7], SEG [23], BM [24], SWD [25], SF [26], GCHC [27], LMLC [28], PCA [29], MS [19], and MST [16]. The ASD dataset [7] contains 1 000 natural images; the DUT-OMRON dataset [30] contains 5 168 images with complex backgrounds and salient objects of various sizes; the SED dataset [31] contains 200 challenging images. The number of superpixels for the initial boundary selection is set to $m = 250$, and the threshold for selecting background absorbing nodes is $t = 0.015$.

Experiments verify that the multilayer graph clearly improves the saliency results over a single layer. As shown in Fig. 3, on the ASD dataset the overall results are stable for $L = 2, 3, 4, 5$, all clearly better than $L = 1$. Overall, this paper sets the number of layers to $L = 3$, with the number of superpixels per layer set to ${m_1} = 200$, ${m_2} = 250$, and ${m_3} = 300$.

Fig. 3 The selection of the number of layers of the multilayer image

The effectiveness of the algorithm is measured with the $PR$ (precision-recall) curve and the F-measure. The $PR$ curve is obtained by binarizing each saliency map produced by an algorithm with every threshold in [0, 255] and comparing the binary results with the ground truth. In evaluating saliency, precision ($P$) and recall ($R$) often trade off against each other: for some algorithms, recall may drop as precision rises. The F-measure is therefore commonly used to measure overall performance and is computed as

$ {\rm{F}} = \frac{{\left( {1 + \beta } \right)P \times R}}{{\beta \times P + R}} $ (7)

where $\beta = 0.3$ is set in the experiments, following [14, 16].
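With Eq. (7) and $\beta = 0.3$, the F-measure is a small computation (Python sketch; the precision/recall values below are hypothetical):

```python
def f_measure(precision, recall, beta=0.3):
    """Eq. (7): weighted combination of precision and recall with beta = 0.3."""
    return (1 + beta) * precision * recall / (beta * precision + recall)

# hypothetical precision/recall pair
f = round(f_measure(0.9, 0.6), 4)   # 0.8069
```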

Fig. 4 shows the $PR$ curves and F-measure results of the compared methods on the ASD dataset; Figs. 5 and 6 show the comparisons on the DUT-OMRON and SED datasets. The $PR$ curves and F-measures show that our method ranks highest on both ASD and DUT-OMRON; on SED, its $PR$ curve is the best while its F-measure is slightly below that of the MC method. Overall, the proposed algorithm outperforms the other 12 existing algorithms.

Fig. 4 Precision-recall curves and F-measure histograms of different methods on the ASD dataset
Fig. 5 Precision-recall curves and F-measure histograms of different methods on the DUT-OMRON dataset
Fig. 6 Precision-recall curves and F-measure histograms of different methods on the SED dataset

Fig. 7 presents a visual comparison of the algorithms on the three datasets. As Fig. 7 shows, our saliency maps highlight the whole object better visually; intuitively, the proposed algorithm outperforms the other 12 algorithms.

Fig. 7 Visual comparison of different methods on the three datasets

4 Conclusion

Addressing the drawbacks of absorbing-node selection and SLIC segmentation in the original absorbing-Markov-chain detection model, this paper proposes a multilayer fusion model based on boundary selection and background node selection. Compared with 12 existing methods on three public datasets, the improved absorbing-Markov-chain saliency algorithm shows clear improvements in both $PR$ curves and F-measure.

When computing node-to-node relations, the model considers only the absorbed time; future work will combine multiple indicators, such as absorbed time and hitting time, to derive saliency values. In addition, the multilayer fusion here uses only a simple average; more effective fusion, e.g., with probabilistic models, will be considered in future work.

References

  • [1] Zhu G Y, Zheng Y F, Doermann D, et al. Signature detection and matching for document image retrieval[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2009, 31(11): 2015–2031. [DOI:10.1109/TPAMI.2008.237]
  • [2] Wu J F. The research of image retrieval algorithm based on visual saliency[D]. Dalian: Dalian Maritime University, 2017. [吴俊峰. 基于视觉显著性的图像检索算法研究[D]. 大连: 大连海事大学, 2017.] http://cdmd.cnki.com.cn/Article/CDMD-10151-1017054261.htm
  • [3] Rutishauser U, Walther D, Koch C, et al. Is bottom-up attention useful for object recognition?[C]//Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Washington DC, USA: IEEE, 2004, 2: Ⅱ-37-Ⅱ-44. [DOI:10.1109/CVPR.2004.1315142]
  • [4] Ko B C, Nam J Y. Object-of-interest image segmentation based on human attention and semantic region clustering[J]. Journal of the Optical Society of America A, 2006, 23(10): 2462–2470. [DOI:10.1364/JOSAA.23.002462]
  • [5] Zhang G X, Cheng M M, Hu S M, et al. A shape-preserving approach to image resizing[J]. Computer Graphics Forum, 2009, 28(7): 1897–1906. [DOI:10.1111/j.1467-8659.2009.01568.x]
  • [6] Ao H H. Research on applications based on visual saliency[D]. Hefei: University of Science and Technology of China, 2013. [敖欢欢. 视觉显著性应用研究[D]. 合肥: 中国科学技术大学, 2013. [DOI:10.7666/d.Y2354193]]
  • [7] Achanta R, Hemami S, Estrada F, et al. Frequency-tuned salient region detection[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Miami, Florida, USA: IEEE, 2009: 1597-1604. [DOI:10.1109/CVPR.2009.5206596]
  • [8] Li B, Lu C Y, Jin L B, et al. Saliency detection based on lazy random walk[J]. Journal of Image and Graphics, 2016, 21(9): 1191–1201. [李波, 卢春园, 金连宝, 等. 惰性随机游走视觉显著性检测算法[J]. 中国图象图形学报, 2016, 21(9): 1191–1201. ] [DOI:10.11834/jig.20160908]
  • [9] Xu W, Tang Z M. Exploiting hierarchical prior estimation for salient object detection[J]. Acta Automatica Sinica, 2015, 41(4): 799–812. [徐威, 唐振民. 利用层次先验估计的显著性目标检测[J]. 自动化学报, 2015, 41(4): 799–812. ] [DOI:10.16383/j.aas.2015.c140281]
  • [10] Borji A, Sihite D N, Itti L. Probabilistic learning of task-specific visual attention[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Providence, Rhode Island, USA: IEEE, 2012: 470-477. [DOI:10.1109/CVPR.2012.6247710]
  • [11] Itti L, Koch C, Niebur E. A model of saliency-based visual attention for rapid scene analysis[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998, 20(11): 1254–1259. [DOI:10.1109/34.730558]
  • [12] Harel J, Koch C, Perona P. Graph-based visual saliency[C]//Proceedings of the 19th International Conference on Neural Information Processing Systems. Kitakyushu, Japan: ACM, 2006: 545-552. [DOI:10.1.1.70.2254]
  • [13] Hou X D, Zhang L Q. Saliency detection: a spectral residual approach[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Minneapolis, Minnesota, USA: IEEE, 2007: 1-8. [DOI:10.1109/CVPR.2007.383267]
  • [14] Jiang B W, Zhang L H, Lu H C, et al. Saliency detection via absorbing Markov chain[C]//Proceedings of IEEE International Conference on Computer Vision. Sydney, NSW, Australia: IEEE, 2013: 1665-1672. [DOI:10.1109/ICCV.2013.209]
  • [15] Qin Y, Lu H C, Xu Y Q, et al. Saliency detection via cellular automata[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Boston, MA, USA: IEEE, 2015: 110-119. [DOI:10.1109/CVPR.2015.7298606]
  • [16] Tu W C, He S F, Yang Q X, et al. Real-time salient object detection with a minimum spanning tree[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016: 2334-2342. [DOI:10.1109/CVPR.2016.256]
  • [17] Achanta R, Shaji A, Smith K, et al. SLIC superpixels compared to state-of-the-art superpixel methods[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(11): 2274–2282. [DOI:10.1109/TPAMI.2012.120]
  • [18] Lyu J Y, Tang Z M. Improved salient object detection based on absorbing Markov chain[J]. Journal of Nanjing University of Science and Technology, 2015, 39(6): 674–679. [吕建勇, 唐振民. 一种改进的马尔可夫吸收链显著性目标检测方法[J]. 南京理工大学学报, 2015, 39(6): 674–679. ] [DOI:10.14177/j.cnki.32-1397n.2015.39.06.007]
  • [19] Tong N, Lu H C, Zhang L H, et al. Saliency detection with multi-scale superpixels[J]. IEEE Signal Processing Letters, 2014, 21(9): 1035–1039. [DOI:10.1109/LSP.2014.2323407]
  • [20] Wang W H, Zhou J B, Gao S B, et al. Improved multi-scale saliency detection based on HSV space[J]. Computer Engineering & Science, 2017, 39(2): 354–370. [王文豪, 周静波, 高尚兵, 等. 基于HSV空间改进的多尺度显著性检测[J]. 计算机工程与科学, 2017, 39(2): 354–370. ] [DOI:10.3969/j.issn.1007-130X.2017.02.022]
  • [21] Wang H L, Luo B. Saliency detection based on hierarchical graph integration[J]. Journal of Frontiers of Computer Science and Technology, 2016, 10(12): 1752–1762. [王慧玲, 罗斌. 层次图融合的显著性检测[J]. 计算机科学与探索, 2016, 10(12): 1752–1762. ]
  • [22] Goferman S, Zelnik-Manor L, Tal A. Context-aware saliency detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(10): 1915–1926. [DOI:10.1109/TPAMI.2011.272]
  • [23] Rahtu E, Kannala J, Salo M, et al. Segmenting salient objects from images and videos[C]//Proceedings of the 11th European Conference on Computer Vision. Heraklion, Crete, Greece: Springer, 2010: 366-379. [DOI:10.1007/978-3-642-15555-0_27]
  • [24] Xie Y L, Lu H C. Visual saliency detection based on Bayesian model[C]//Proceedings of the 18th IEEE International Conference on Image Processing. Brussels, Belgium: IEEE, 2011: 645-648. [DOI:10.1109/ICIP.2011.6116634]
  • [25] Duan L J, Wu C P, Miao J, et al. Visual saliency detection by spatially weighted dissimilarity[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Colorado Springs, CO, USA: IEEE, 2011: 473-480. [DOI:10.1109/CVPR.2011.5995676]
  • [26] Perazzi F, Krähenbühl P, Pritch Y, et al. Saliency filters: contrast based filtering for salient region detection[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Providence, Rhode Island, USA: IEEE, 2012: 733-740. [DOI:10.1109/CVPR.2012.6247743]
  • [27] Yang C, Zhang L H, Lu H C. Graph-regularized saliency detection with convex-hull-based center Prior[J]. IEEE Signal Processing Letters, 2013, 20(7): 637–640. [DOI:10.1109/LSP.2013.2260737]
  • [28] Xie Y L, Lu H C, Yang M H. Bayesian saliency via low and mid level cues[J]. IEEE Transactions on Image Processing, 2013, 22(5): 1689–1698. [DOI:10.1109/TIP.2012.2216276]
  • [29] Margolin R, Tal A, Zelnik-Manor L. What makes a patch distinct?[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Portland, Oregon, USA: IEEE, 2013: 1139-1146. [DOI:10.1109/CVPR.2013.151]
  • [30] Yang C, Zhang L H, Lu H C, et al. Saliency detection via graph-based manifold ranking[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Portland, Oregon, USA: IEEE, 2013: 3166-3173. [DOI:10.1109/CVPR.2013.407]
  • [31] Alpert S, Galun M, Basri R, et al. Image segmentation by probabilistic bottom-up aggregation and cue integration[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Minneapolis, MN, USA: IEEE, 2007: 1-8. [DOI:10.1109/CVPR.2007.383017]