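1. Institute of Intelligent Machines, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China;
2. University of Science and Technology of China, Hefei 230026, China;
3. School of Computer Science and Technology, Hefei Normal University, Hefei 230061, China;
4. School of Computer and Information, Hefei University of Technology, Hefei 230009, China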
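 Supported by: National Natural Science Foundation of China (913203002); Guangdong Provincial Science and Technology Plan Major Science and Technology Special Fund Project (c1632611500006); Hefei Normal University Research Fund Project (2017QN19)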

# Keywords

Image saliency detection based on background-absorbing Markov chain
Jiang Fengling$^{1,3}$, Zhang Haitao$^{4}$, Yang Jing$^{1,2}$, Kong Bin$^{1,2}$
1. Institute of Intelligent Machines, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China;
2. University of Science and Technology of China, Hefei 230026, China;
3. School of Computer Science and Technology, Hefei Normal University, Hefei 230061, China;
4. School of Computer and Information, Hefei University of Technology, Hefei 230009, China
National Natural Science Foundation of China (913203002)

# Abstract

Objective Saliency detection via absorbing Markov chain uses the simple linear iterative clustering (SLIC) method to obtain superpixels as graph nodes. A k-regular graph is then constructed, with each edge weight computed from the CIELAB difference between its two nodes. The boundary superpixels are duplicated as absorbing nodes, and the absorbed time of each transient node on the Markov chain is calculated. A small absorbed time means that the transient node is similar to the absorbing nodes and is probably a background node; a large absorbed time means that it is dissimilar to them and is therefore a salient node. Two drawbacks follow from this design. First, the number of superpixels produced by SLIC influences the saliency map: if the superpixels are too large, detailed information is lost; if they are too small, global cues are missed. Second, salient objects often occupy one or two image boundaries, especially in portraits and sculptures, so duplicating the nodes of all four boundaries as absorbing nodes degrades the final result. To address these drawbacks, we propose an improved absorbing Markov chain method that uses background nodes as absorbing nodes and applies multilayer image fusion to reduce the influence of the uncertain number of SLIC superpixels.

Method First, we select the boundaries. Using the SLIC superpixels, we duplicate the nodes of each of the four image boundaries separately as absorbing nodes and obtain four saliency maps. We compute the pairwise differences between these maps, identify the boundary whose map differs most from the other three, and remove it; the nodes of the remaining three boundaries are duplicated as absorbing nodes. The initial saliency map is then obtained by calculating the absorbed time of the transient nodes on the absorbing Markov chain. Second, to further improve the algorithm, we increase the number of absorbing nodes, because the boundary nodes may represent only part of the background. We add nodes that are probably background, selected from the initial saliency map by a threshold: a node whose initial saliency value does not exceed the threshold is considered a background node. The selected boundary and background nodes are duplicated as absorbing nodes, and the absorbed time of the transient nodes is recalculated. The saliency value of each pixel is taken from the saliency value of its superpixel. Finally, we compute pixel-level saliency maps with different numbers of SLIC superpixels and average them to obtain the final result.

Result We evaluate our method on three benchmark datasets: ASD, DUT-OMRON, and SED. ASD contains 1,000 simple images, DUT-OMRON contains 5,168 complex images, and SED contains 200 images, of which 100 have one salient object and 100 have two. We compare our method with 12 recent state-of-the-art saliency detection methods: MC, CA, FT, SEG, BM, SWD, SF, GCHC, LMLC, PCA, MS, and MST. Performance is measured with precision-recall (PR) curves and the F-measure. Precision is the ratio of correctly detected salient pixels to all pixels predicted as salient; recall is the ratio of correctly detected salient pixels to the number of salient pixels in the ground truth; the F-measure is an overall performance measure that combines the two. The experimental results show that the improved algorithm is effective and outperforms the 12 methods on both measures. The F-measure values on the three benchmark datasets are 0.7756, 0.903, and 0.5447, which are higher than those of the other 12 methods. Visual comparisons on examples selected from the three datasets are also shown.

Conclusion An image can be divided into a complex background and the region that interests the human eye. Visual saliency detection extracts this region of interest by simulating the human visual system on a computer. To overcome the drawbacks of duplicating all four image boundaries as absorbing nodes and of the uncertain number of SLIC superpixels, we propose an improved model based on background nodes and image fusion. The experiments show that the method is effective and applicable to bottom-up image saliency detection, especially for images such as portraits and sculptures. It can also be applied in fields such as image retrieval, object recognition, image segmentation, and image compression.

# Key words

object detection; SLIC method; saliency detection; Markov chain; absorbing node; multilayer image

# 0 Introduction

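The MC method uses the simple linear iterative clustering (SLIC) segmentation technique[17] to obtain superpixels, which serve as the nodes of the Markov chain. That paper also points out that the superpixels on the four image boundaries generally do not contain the salient object, so they can be duplicated as virtual absorbing nodes from which the absorbed time of the transient nodes is computed. However, because of the characteristics of natural images and the composition habits of photographers, the salient object occupies one or two boundaries in a large share of images, particularly portraits and sculptures, so simply duplicating all four boundaries as virtual absorbing nodes leaves room for improvement[18]. In addition, the quality of the SLIC segmentation directly affects the subsequent results and can be improved with multiscale and related methods[19-20]. Based on this analysis, we propose a salient-region computation model based on a background-absorbing Markov chain; its result is shown in Fig. 1(d). Fig. 1(a) and Fig. 1(b) show the input image and the ground truth, respectively.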

# 1 Overview of saliency detection based on absorbing Markov chain

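1) Relations among the nodes of the $m$ superpixels in the image: nodes of adjacent superpixels are associated; nodes of superpixels that share a neighbor are also associated; and all nodes of superpixels on the boundary are associated with one another.

2) Relations among the $k$ duplicated virtual nodes: the $k$ virtual nodes are not associated with one another.

3) Relations between the nodes of the $m$ superpixels in the image and the $k$ duplicated virtual nodes: if $i$ is the node of a superpixel on the image boundary, a new node $i'$ is created by duplicating it; every node associated with $i$ is also associated with $i'$, and $i$ and $i'$ are associated with each other.

Following these rules, the association matrix $\mathit{\boldsymbol{A}}$ of the nodes in graph $\mathit{\boldsymbol{G}}$ is

 ${a_{ij}} = \left\{ {\begin{array}{*{20}{l}} {{w_{ij}}}&{j \in \mathit{\boldsymbol{N}}\left( i \right), 1 \le i \le n}\\ 1&{i = j}\\ 0&{{\rm{otherwise}}} \end{array}} \right.$ (1)

where $\mathit{\boldsymbol{N}}\left( i \right)$ denotes the set of nodes associated with node $i$. The weight ${w_{ij}}$ between associated nodes $i$ and $j$ is defined as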

 ${w_{ij}} = {{\rm{e}}^{ - \frac{{\left\| {{x_i} - {x_j}} \right\|}}{{{\sigma ^2}}}}}$ (2)
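As a minimal sketch of the edge weight in Eq. (2), the snippet below assumes that $x_i$ and $x_j$ are the mean CIELAB colors of two superpixels (the abstract states that weights come from CIELAB differences) and that the weight decays as the color difference grows, so the exponent is taken as negative. The function name `edge_weight` and the default `sigma2=0.1` are illustrative assumptions; the section does not give the value of $\sigma^2$.

```python
import math


def edge_weight(x_i, x_j, sigma2=0.1):
    """Weight between two associated nodes from their color difference.

    x_i, x_j: feature vectors of the two superpixels (assumed here to be
    mean CIELAB colors). sigma2 is a hypothetical value for sigma^2.
    The negative exponent is an assumption: similar nodes get weights
    close to 1 and dissimilar nodes get weights close to 0.
    """
    distance = math.dist(x_i, x_j)  # Euclidean norm ||x_i - x_j||
    return math.exp(-distance / sigma2)
```

Under these assumptions, two identical colors give a weight of exactly 1, and the weight is symmetric in its two arguments.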

# 2.1 Boundary selection

 $\begin{array}{l} {h^{uv}} = \sum\limits_{i = 1}^m {\sqrt {{{(\mathit{\boldsymbol{s}}_i^u - \mathit{\boldsymbol{s}}_i^v)}^2}} } \\ u, v \in \left\{ {{\rm{top, down, left, right}}} \right\} \end{array}$ (4)
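The boundary selection can be sketched as follows. The sketch computes $h^{uv}$ from Eq. (4) for every pair of boundaries and then drops one boundary. The removal criterion used here, namely dropping the boundary whose summed difference to the other three maps is largest, is an assumption about what "most different" means; the names `map_difference` and `select_boundaries` are hypothetical.

```python
from itertools import combinations

BOUNDARIES = ("top", "down", "left", "right")


def map_difference(s_u, s_v):
    """h^{uv} of Eq. (4): sum over superpixels of sqrt((s_i^u - s_i^v)^2)."""
    return sum(abs(a - b) for a, b in zip(s_u, s_v))


def select_boundaries(maps):
    """Return the three boundaries kept after removing the most different one.

    maps: dict mapping a boundary name to the list of superpixel saliency
    values obtained with only that boundary duplicated as absorbing nodes.
    Assumption: the removed boundary is the one with the largest total
    h^{uv} against the other three maps.
    """
    totals = {u: 0.0 for u in BOUNDARIES}
    for u, v in combinations(BOUNDARIES, 2):
        h = map_difference(maps[u], maps[v])
        totals[u] += h
        totals[v] += h
    removed = max(totals, key=totals.get)
    return [b for b in BOUNDARIES if b != removed]
```

For example, if the map built from the bottom boundary disagrees strongly with the other three (as might happen when a portrait touches the bottom edge), `select_boundaries` keeps `top`, `left`, and `right`.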

# 2.2 Background absorbing node selection

 $\mathit{\boldsymbol{C}} = \{ i|\mathit{\boldsymbol{s}}_i^{{\rm{initial}}} \le t\} \cup \left\{ {i|i \in \mathit{\boldsymbol{B}}} \right\}$ (5)
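Eq. (5) amounts to a set union, which the following sketch makes concrete. It assumes that nodes are identified by integer indices, that `s_initial` holds the initial saliency value of each node, and that `boundary_nodes` lists the indices in $\mathit{\boldsymbol{B}}$; the function name is hypothetical.

```python
def absorbing_node_set(s_initial, boundary_nodes, t):
    """Node set C of Eq. (5): low-saliency nodes together with boundary nodes.

    s_initial: initial saliency value of each node, indexed by node id.
    boundary_nodes: ids of the nodes in the selected boundary set B.
    t: saliency threshold; nodes with s_initial[i] <= t count as background.
    Returns the union as a sorted list of node ids.
    """
    low_saliency = {i for i, s in enumerate(s_initial) if s <= t}
    return sorted(low_saliency | set(boundary_nodes))
```

A node that is both on a selected boundary and below the threshold appears only once in the result, so it is duplicated only once.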

# 2.3 Multilayer image fusion

 $f_{ij}^{{\rm{final}}} = \frac{{\sum\limits_{l = 1}^L {f_{ij}^{{\rm{middle}}, l}} }}{L}$ (6)
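The fusion in Eq. (6) is a per-pixel mean over the $L$ layers. A minimal NumPy sketch, assuming each layer is a 2-D array of pixel saliency values with the same shape (the function name `fuse_layers` is hypothetical):

```python
import numpy as np


def fuse_layers(middle_maps):
    """Average L pixel-level saliency maps of identical shape, as in Eq. (6)."""
    stack = np.stack([np.asarray(m, dtype=float) for m in middle_maps])
    return stack.mean(axis=0)  # mean over the layer axis, one value per pixel
```

Each layer would come from a different number of SLIC superpixels, so averaging damps the effect of any single segmentation scale.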

# 2.4 Proposed algorithm

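1) Input an image and the required parameters.

2) Segment the input image with SLIC to obtain $m$ superpixels.

3) Following the boundary selection procedure and using Eq. (4), select the three boundaries to keep.

4) Compute $L$ layers, where layer $l$ proceeds as follows:

(1) Obtain the set $\mathit{\boldsymbol{C}}$ of nodes to be duplicated according to Eq. (5);

(2) Duplicate the nodes in $\mathit{\boldsymbol{C}}$, construct the graph, and obtain the transition matrix $\mathit{\boldsymbol{P}}$ as in Eq. (3); compute $\mathit{\boldsymbol{N}} = {\left( {\mathit{\boldsymbol{I}} - \mathit{\boldsymbol{Q}}} \right)^{ - 1}}$, $\mathit{\boldsymbol{y}} = \mathit{\boldsymbol{N}} \times \mathit{\boldsymbol{1}}$, and $\mathit{\boldsymbol{s}}\left( i \right) = \mathit{\boldsymbol{\bar y}}\left( i \right)$ to obtain ${\mathit{\boldsymbol{s}}^{{\rm{initial}}}}$;

(3) Compare the saliency value of each node in ${\mathit{\boldsymbol{s}}^{{\rm{initial}}}}$ with the threshold $t$; if $s_i^{{\rm{initial}}} \le t$, the node of superpixel $i$ is selected and duplicated together with the nodes of the three boundaries, and together they serve as the background absorbing nodes;

(4) Reconstruct the graph and repeat step (2) to obtain the superpixel-level ${\mathit{\boldsymbol{s}}^{{\rm{middle}}}}$;

(5) Assign ${\mathit{\boldsymbol{s}}^{{\rm{middle}}}}$ to the pixels within each superpixel to obtain the pixel-level saliency values ${\mathit{\boldsymbol{F}}^{{\rm{middle}}}}$.

5) Fuse the per-pixel saliency values from the $L$ layers by averaging, using Eq. (6), to obtain the final saliency map ${\mathit{\boldsymbol{F}}^{{\rm{final}}}}$.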
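The absorbed-time computation in step (2) of the algorithm can be sketched with NumPy. Two points in the sketch are assumptions rather than statements from the text: the transient states are ordered first in $\mathit{\boldsymbol{P}}$, so that $\mathit{\boldsymbol{Q}}$ is its top-left block, and the normalization that produces $\mathit{\boldsymbol{\bar y}}$ is taken to be division by the maximum. The function name `absorbed_time` is hypothetical.

```python
import numpy as np


def absorbed_time(P, num_transient):
    """Normalized absorbed time of each transient node.

    P: transition matrix with the transient states listed first, so that
    Q = P[:num_transient, :num_transient] holds the transient-to-transient
    probabilities (an ordering assumed for this sketch).
    """
    Q = np.asarray(P, dtype=float)[:num_transient, :num_transient]
    identity = np.eye(num_transient)
    # y = N * 1 with N = (I - Q)^{-1}; solving the linear system gives the
    # same y without forming the inverse explicitly.
    y = np.linalg.solve(identity - Q, np.ones(num_transient))
    return y / y.max()  # assumed normalization: divide by the largest time
```

With a chain of two transient states and one absorbing state, the state that can reach the absorbing state directly gets the smaller absorbed time and therefore the lower saliency value, which matches the interpretation that nodes absorbed quickly are likely background.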

# 3 Experimental results and analysis

 ${\rm{F}} = \frac{{\left( {1 + \beta } \right)P \times R}}{{\beta \times P + R}}$ (7)
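Eq. (7) can be evaluated directly once precision $P$ and recall $R$ are known. The sketch below follows the formula as written; the default `beta=0.3` is an assumption borrowed from common practice in the saliency literature, since this section does not state the value, and the zero-denominator guard is an added convenience.

```python
def f_measure(precision, recall, beta=0.3):
    """Weighted combination of precision and recall, as written in Eq. (7).

    beta: weighting parameter; the default 0.3 is an assumed value.
    """
    denominator = beta * precision + recall
    if denominator == 0:
        return 0.0  # precision and recall are both zero
    return (1 + beta) * precision * recall / denominator
```

With `beta` below 1, precision is weighted more heavily than recall, so a method that marks the whole image as salient (perfect recall, poor precision) is not rewarded.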

