
发布时间: 2020-12-16
DOI: 10.11834/jig.190675
2020 | Volume 25 | Number 12




图像分析和识别

联合聚焦度和传播机制的光场图像显著性检测
李爽, 邓慧萍, 朱磊, 张龙
武汉科技大学信息科学与工程学院, 武汉 430081

摘要

目的 图像显著性检测方法对前景与背景颜色、纹理相似或背景杂乱的场景,存在背景难抑制、检测对象不完整、边缘模糊以及方块效应等问题。光场图像具有重聚焦能力,能提供聚焦度线索,有效区分图像前景和背景区域,从而提高显著性检测的精度。因此,提出一种基于聚焦度和传播机制的光场图像显著性检测方法。方法 使用高斯滤波器对焦堆栈图像的聚焦度信息进行衡量,确定前景图像和背景图像。利用背景图像的聚焦度信息和空间位置构建前/背景概率函数,并引导光场图像特征进行显著性检测,以提高显著图的准确率。另外,充分利用邻近超像素的空间一致性,采用基于K近邻法(K-nearest neighbor,K-NN)的图模型显著性传播机制进一步优化显著图,均匀地突出整个显著区域,从而得到更加精确的显著图。结果 在光场图像基准数据集上进行显著性检测实验,对比3种主流的传统光场图像显著性检测方法及两种深度学习方法,本文方法生成的显著图可以有效抑制背景区域,均匀地突出整个显著对象,边缘也更加清晰,更符合人眼视觉感知。查准率达到85.16%,高于对比方法,F度量(F-measure)和平均绝对误差(mean absolute error,MAE)分别为72.79%和13.49%,优于传统的光场图像显著性检测方法。结论 本文基于聚焦度和传播机制提出的光场图像显著性模型,在前/背景相似或杂乱背景的场景中可以均匀地突出显著区域,更好地抑制背景区域。

关键词

显著性检测; 光场图像; 聚焦度; 前/背景概率函数; 传播机制; 空间一致性

Saliency detection on a light field via the focusness and propagation mechanism
Li Shuang, Deng Huiping, Zhu Lei, Zhang Long
School of Information Science and Engineering, Wuhan University of Science and Technology, Wuhan 430081, China
Supported by: National Natural Science Foundation of China (61702384, 61502357)

Abstract

Objective Saliency detection, which has extensive applications in computer vision, aims to locate the pixels or regions in a scene that attract human visual attention the most. Accurate and reliable salient region detection can benefit numerous vision and graphics tasks, such as scene analysis, object tracking, and target recognition. Traditional 2D saliency methods rely on low-level features, including color, texture, and focus cues, to separate salient objects from the background. Although state-of-the-art 2D saliency detection methods have shown promising results, they may fail in complex scenes where the foreground and background have a similar appearance or where the background is cluttered. 3D images provide depth information that benefits saliency detection to some extent. However, most 3D saliency detection results depend greatly on the quality of depth maps; an inaccurate depth map can therefore severely degrade the final saliency detection result. Moreover, 3D saliency detection methods may produce inaccurate detections when the salient object cannot be distinguished by depth. The human visual system can distinguish regions at different depth levels by adjusting the focus of the eyes. Similarly, a light field has a refocusing capability that produces a focal stack, i.e., a set of images focused at different depth levels. The focus cue supplied by a focal stack helps determine the background and foreground slice candidates, even under challenging conditions (e.g., when the foreground and background have similar colors/textures). Therefore, focus can improve the precision of saliency detection in challenging scenarios. Existing light field saliency detection methods verify the effectiveness of integrating light field cues, including focus, location, and color contrast. From the above discussion, an important aim of saliency detection for a light field is to explore the interactions and complementarities among light field cues. Method This paper builds a foreground/background probability model that can highlight salient objects and suppress the background by using location and focus cues. A propagation mechanism is also proposed to enhance the spatial consistency of the saliency results and to refine the saliency map. The focal stack and the all-focus image are taken as light field input images that are segmented into a set of non-overlapping super-pixels via simple linear iterative clustering (SLIC). First, we detect the in-focus regions of each image in the focal stack by applying the harmonic variance measure in the frequency domain to define our focus measure. Second, to determine the foreground image set and the background image, we evaluate the focus distribution of the focal stack by using a Gaussian filter. Saliency detection follows a common assumption that salient objects are more likely to lie in the central area and are often photographed in focus. Therefore, we analyze the distribution of in-focus objects with respect to their prior location by using a 1D band-pass Gaussian filter and then compute the foreground likelihood score of each focal slice. We choose the slice with the lowest foreground likelihood score as our background slice and the slices whose likelihood scores exceed 0.9 times the highest foreground likelihood score as our foreground slices. Afterward, we construct a foreground/background probability function by combining the focus of the background slice with the spatial location. 
We then compute the foreground cue from the foreground slice candidates and the color cue on the all-focus image. To enhance contrast, we use the foreground/background probability function to guide the foreground and color cues, which are used in turn to obtain the foreground and color saliency maps, respectively. With this guidance, the low saliency values of a salient object can be raised, and the high saliency values of the background in complex scenarios (e.g., where the foreground and background look similar) can be restrained. We combine the foreground and color saliency maps by using a Bayesian fusion strategy to generate a new saliency map. We then apply a K-NN enhanced graph-based saliency propagation method that considers the neighboring relationships in both the spatial and feature spaces to further optimize this saliency map. Optimizing the spatial consistency of adjacent super-pixels uniformly highlights the saliency of objects, and we eventually obtain a more accurate saliency map. Result We compare the performance of our model with that of five state-of-the-art saliency models, including traditional approaches and deep learning methods, on the leading light field saliency dataset (LFSD). Our proposed model effectively suppresses the background and evenly highlights the entire salient object, thereby obtaining sharp edges in the saliency map. We also evaluate the similarity between the predicted saliency maps and the ground truth by using three quantitative evaluation metrics, namely, the canonical precision-recall curve (PRC), F-measure, and mean absolute error (MAE). Experimental results show that our proposed method outperforms all compared methods in terms of precision (85.16%) and outperforms the traditional light field methods in terms of F-measure (72.79%) and MAE (13.49%). Conclusion We propose a light field saliency detection model that combines the focusness cue with a propagation mechanism. Experimental results show that our saliency detection scheme works effectively in challenging scenarios, such as scenes with similar foregrounds and backgrounds or complex background textures, and outperforms state-of-the-art traditional approaches in terms of precision (PRC) and error (MAE).

Key words

saliency detection; light field image; focus; probability function of foreground/background; propagation mechanism; spatial consistency

0 引言

显著性检测是从图像中检测出最吸引人眼视觉注意的区域或对象,广泛应用在场景解析(Borji等,2019)、对象跟踪(Li等,2013a)和目标识别(Dai等,2016)等领域。视觉显著性旨在从2D、3D图像和光场图像中检测突出区域。Itti等人(1998)首次提出了计算颜色、亮度和方向等低层特征的中心—周围对比度,得到2D图像的显著图。Cheng等人(2011)用颜色对比度和空间距离加权对2D图像进行显著性检测。Jiang等人(2013)融合稀有性、聚焦度和对象性这3种特征计算显著图。袁巧等人(2018)将颜色特征的全局对比度和局部对比度,以及背景、中心和颜色这3种高层先验结合,生成显著图。此外,Lu等人(2014)提出了基于图论的2D图像显著性检测方法,得到较好的检测结果。这些方法在现有的2D数据集上获得了较高的精度,但在前景与背景颜色/纹理相似或背景杂乱的情况下表现并不理想。

相比于2D图像,3D图像加入了场景的深度信息。Peng等人(2014)提出上下文对比模型和融合框架,将现有的2D显著图与深度线索得出的显著图结合。Ren等人(2015)提出了融合背景先验、深度先验和方向先验的3D显著性检测方法。这些方法证明了深度线索在显著性检测中的重要性。然而,用边界连通性生成背景先验,当显著对象接触到图像边界时,可能无法检测到显著对象。而且,3D显著性检测结果在很大程度上取决于深度图的质量, 不精确的深度图直接影响显著性检测的正确率。另外,当前景与背景深度相近,无法区分显著区域时,3D显著性检测方法可能会增加误检率。

普通相机捕获的2D图像只记录了光线的强度,而光场相机捕获光线的方向和位置信息生成光场数据,因此光场图像具有独特的重聚焦能力,将光线重新参数化到焦平面上可以生成焦堆栈图像,这是聚焦在不同深度的一组2D图像,即每幅图像内容完全相同,只是聚焦区域不同。融合焦堆栈图像的聚焦区域可以合成一幅全聚焦图像。现有的光场图像显著性检测方法主要分为两类:1)基于光场图像进行显著性检测;2)联合光场图像与深度图进行显著性检测。Li等人(2014)提出了第一个光场图像显著性检测算法(light field saliency,LFS),在前/背景相似或背景杂乱的情况下,相比2D和3D显著性检测表现更好。LFS不依赖于深度图,而是利用焦堆栈图像中的聚焦度线索与全聚焦图像的颜色、位置线索进行光场显著性检测。Li等人(2015)进一步提出了加权稀疏编码框架(weighted sparse coding,WSC)学习显著性/非显著性字典,可以有效处理异构类型的输入数据,包括2D、3D和光场图像,但是该模型不能充分利用3D和光场图像中包含的丰富信息。Zhang等人(2020)用深度学习探索光场图像显著性检测,提出了LF-Net(light field-networks)算法,并获得了良好的鲁棒性。另一方面,Zhang等人(2015)联合光场图像与深度图提出了DILF(deeper investigation of light field)算法,根据全聚焦图像和深度图分别计算颜色对比显著性和深度对比显著性,再运用焦堆栈图像的聚焦度信息计算背景先验,并用背景先验加权对比显著性生成显著图。Piao等人(2019)构造了一幅对象引导的深度图,并将此深度图作为引导器有效结合光场图像的对比显著性,得到显著图。虽然联合光场图像和深度图进行光场显著性检测,精度得到了明显提高, 但原始光场图像中并不包含深度图,深度图是另外通过光场图像深度估计得到的。因此,本文不直接输入深度图,而是基于光场图像进行显著性检测,从原始光场图像中提取潜在的深度信息。

以上显著性检测方法证明了在前/背景相似或背景复杂场景下整合聚焦度、颜色及空间位置线索能明显提高显著性检测准确性。而如何利用焦堆栈图像的聚焦度信息在复杂场景下有效分离图像的前景区域和背景区域,是提高显著性检测精度的核心问题。同时,发现现有方法得到的显著图存在同一物体的显著值不一致的问题,提出采用显著性传播机制增强显著图的空间一致性,以更加均匀地突出显著目标,使不精确的显著图实现较高的精度。为了进一步提高复杂场景下显著性检测的结果,本文主要工作为:1)先用高斯滤波器对焦堆栈图像的聚焦度信息进行衡量,确定前景图像集和背景图像,再用背景图像的聚焦度信息和空间位置构建前/背景概率函数,并将前/背景概率函数作为引导器,增强前景区域的显著性(由前景线索和颜色线索得出),抑制背景区域的显著性,以提高显著性检测的精度。2)用基于K近邻法(K-nearest neighbor,K-NN)图模型的显著性传播机制优化光场图像显著图,探索相似区域的关联性,减少相似区域的差异,增强空间一致性,均匀地突出整个显著区域,得到更加精确的显著图。

1 显著性检测

本文算法流程如图 1所示。将焦堆栈图像和全聚焦图像作为输入图像,先用高斯滤波器对焦堆栈图像的聚焦度信息进行衡量,以确定前景图像集和背景图像,再分别由前景图像集和全聚焦图像得到前景特征和颜色特征,然后根据背景图像的聚焦度信息和空间位置构建前/背景概率函数,用前/背景概率函数引导前景特征和颜色特征进行显著性检测,并将前景显著图和颜色显著图通过贝叶斯框架整合得到显著图。最后,引入基于K-NN增强图的显著性传播机制优化显著图,以增强显著对象的内部相关性,从而得到更加精确的显著图。

图 1 光场图像显著性检测方法流程图
Fig. 1 The flowchart of light field saliency detection

1.1 基于聚焦度的前/背景图像建模

1.1.1 前景图像集和背景图像选取

焦堆栈图像是一组重新聚焦在不同深度的切片。在重聚焦图像中,处于聚焦深度的区域比其他深度的区域相对清晰,聚焦度值相对较大。根据重聚焦图像的聚焦度信息可以衡量不同区域之间的深度关系,深度较近的区域更可能为显著区域,而深度较远的区域大多属于背景区域。通过分析焦堆栈图像$ \{ {\mathit{\boldsymbol{I}}^{(n)}}\}, n = 1, \ldots, N$的聚焦度信息检测聚焦区域,以选取前景图像集和背景图像。空域计算聚焦度的方法只有在散焦严重时才可靠,因此,选用频域谐波方差的方法计算焦堆栈图像${\mathit{\boldsymbol{I}}^{(n)}}$的聚焦度图(Li和Porikli,2013)。${F^{(n)}}\left({x, y} \right) $表示焦堆栈图像$ {\mathit{\boldsymbol{I}}^{(n)}}$在像素$\left({x, y} \right) $处的聚焦度值。为进行稳健的估计,用简单线性迭代聚类(simple linear iterative clustering,SLIC)算法(Achanta等,2012)将全聚焦图像${\mathit{\boldsymbol{I}}^*}$和焦堆栈图像$ \left\{ {{\mathit{\boldsymbol{I}}^{(n)}}} \right\}$分割为$ m$个超像素$ p \in \{ {p_1}, {p_2}, \ldots, {p_m}\} $,并用超像素$ {p_i}$中所有像素的平均聚焦度值表示超像素$ {p_i}$的聚焦度$ {F^{(n)}}({p_i})$,即

$ {F^{(n)}}({p_i}) = \sum\limits_{(x,y) \in {p_i}} {\frac{{{F^{(n)}}(x,y)}}{{{A^{(n)}}({p_i})}}} $ (1)

式中,${A^{(n)}}({p_i}) $是超像素$ {p_i}$中像素点的总数目。
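下面给出式(1)的一个最小实现示意(Python):假设像素级聚焦度图 focus_map 已由谐波方差方法得到,超像素标签 labels 由SLIC分割得到,函数名与变量名均为本示意的假设,并非原文代码。

```python
import numpy as np
from skimage.segmentation import slic

def superpixel_focusness(focus_map, labels):
    """按式(1)计算每个超像素的平均聚焦度。

    focus_map: (H, W) 数组, 像素级聚焦度 F^(n)(x, y)
    labels:    (H, W) 整型数组, SLIC 超像素标签 (0..m-1)
    返回:      长度为 m 的数组, 即 F^(n)(p_i)
    """
    m = labels.max() + 1
    # 每个超像素内聚焦度之和除以像素总数 A^(n)(p_i)
    sums = np.bincount(labels.ravel(), weights=focus_map.ravel(), minlength=m)
    counts = np.bincount(labels.ravel(), minlength=m)
    return sums / np.maximum(counts, 1)

# 用法示意: labels = slic(all_focus_img, n_segments=200, compactness=10)
```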

为选取前景图像集和背景图像,分析聚焦区域在图像中的位置分布,先将焦堆栈图像$ {\mathit{\boldsymbol{I}}^{(n)}}$的聚焦度值分别映射到$x $方向和$ y$方向,得到两个1维聚焦度分布$ D_x^{(n)}$$ D_y^{(n)}$,具体为

$ \begin{array}{*{20}{l}} {D_x^{(n)} = \frac{{\sum\limits_{y = 1}^h {{F^{(n)}}} (x,y)}}{{\sum\limits_x {\sum\limits_y {{F^{(n)}}} } (x,y)}}}\\ {D_y^{(n)} = \frac{{\sum\limits_{x = 1}^w {{F^{(n)}}} (x,y)}}{{\sum\limits_x {\sum\limits_y {{F^{(n)}}} } (x,y)}}} \end{array} $ (2)

式中,$ h$$ w$分别为焦堆栈图像$ {\mathit{\boldsymbol{I}}^{(n)}}$的高度和宽度。

感兴趣的对象通常位于图像中心位置且被聚焦拍摄,是显著性检测中的一个常见假设(Li等,2014)。因此,如果1维聚焦度分布$ D_x^{(n)}$$ D_y^{(n)}$的两端值小,中间值大,说明焦堆栈图像$ {\mathit{\boldsymbol{I}}^{(n)}}$聚焦在图像中心,则它更大概率为前景图像,反之亦然。为了确定前景图像集和背景图像,用1维高斯带通滤波器对聚焦度分布$ D_x^{(n)}$$ D_y^{(n)}$进行衡量,计算焦堆栈图像$ {\mathit{\boldsymbol{I}}^{(n)}}$的前景得分$ S_{{\rm{Fg}}}^{(n)}$,具体为

$ \begin{array}{*{20}{c}} {S_{{\rm{Fg}}}^{(n)} = }\\ {\rho \cdot [\sum\limits_{x = 1}^w {D_x^{(n)}} \cdot G(x,w) + \sum\limits_{y = 1}^h {D_y^{(n)}} \cdot G(y,h)]} \end{array} $ (3)

式中,$ G\left({x, w} \right) = {\rm{exp}}\left({ - {{\left({x - \left({\frac{w}{2}} \right)} \right)}^2}/2{\sigma ^2}} \right)$是高斯带通滤波器,$ \sigma $控制高斯带通滤波器的带宽,实验中$\sigma = 60$。$\rho = {\rm{exp}}\left({ - \frac{{\lambda \cdot n}}{N}} \right)$是焦堆栈图像${\mathit{\boldsymbol{I}}^{(n)}}$的权重因子,$ N$为焦堆栈图像总数,$ n$代表焦堆栈中第$ n$幅。根据焦堆栈成像原理,$ n$越大,焦堆栈图像聚焦的深度越远,则它有更大概率是背景图像。实验中$ \lambda = 0.3$。
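式(2)(3)可按如下方式向量化实现。以下仅为示意,σ与λ取文中给出的实验值,函数与变量命名均为假设:

```python
import numpy as np

def foreground_score(F, n, N, sigma=60.0, lam=0.3):
    """按式(2)(3)计算第 n 幅焦堆栈切片的前景得分 S_Fg^(n)。

    F: (h, w) 聚焦度图 F^(n)
    """
    h, w = F.shape
    total = F.sum() + 1e-12
    D_x = F.sum(axis=0) / total          # 式(2): 沿 y 投影, 长度为 w
    D_y = F.sum(axis=1) / total          # 式(2): 沿 x 投影, 长度为 h
    G_x = np.exp(-(np.arange(w) - w / 2.0) ** 2 / (2 * sigma ** 2))
    G_y = np.exp(-(np.arange(h) - h / 2.0) ** 2 / (2 * sigma ** 2))
    rho = np.exp(-lam * n / N)           # 切片序号越大, 权重越小
    return rho * (np.dot(D_x, G_x) + np.dot(D_y, G_y))   # 式(3)
```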

光场图像中显著对象的深度可能包含多个层次,不能只用一幅图像表示前景图像。因此,将前景得分值大于最高前景得分0.9倍的所有图像记为前景图像集$ \{ \mathit{\boldsymbol{I}}_{{\rm{Fg}}}^l\} $,将前景得分值最低的图像记为背景图像$ {\mathit{\boldsymbol{I}}_{{\rm{Bg}}}}$,具体为

$ {\mathit{\boldsymbol{I}}_{{\rm{Fg}}}^l = \{ {\mathit{\boldsymbol{I}}^{(n)}}|S_{{\rm{Fg}}}^{(n)} > 0.9 \times \mathop {\max }\limits_n (S_{{\rm{Fg}}}^{(n)})\} } $ (4)

$ {{\mathit{\boldsymbol{I}}_{{\rm{Bg}}}} = \{ {\mathit{\boldsymbol{I}}^{(n)}}|S_{{\rm{Fg}}}^{(n)} = \mathop {\min }\limits_n (S_{{\rm{Fg}}}^{(n)})\} } $ (5)
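据此,式(4)(5)的前景图像集与背景图像选取可示意如下(scores 为各切片前景得分组成的数组,属假设变量名):

```python
import numpy as np

def select_slices(scores):
    """按式(4)(5)选取前景图像集的索引与背景图像的索引。"""
    scores = np.asarray(scores, dtype=float)
    fg_idx = np.where(scores > 0.9 * scores.max())[0]   # 前景图像集 {I_Fg^l}
    bg_idx = int(scores.argmin())                       # 背景图像 I_Bg
    return fg_idx, bg_idx
```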

1.1.2 前/背景概率函数构建

在前景与背景颜色/纹理相似或背景杂乱的情况下,背景区域的显著性值可能较高,但属于非聚焦区域或距离图像中心较远的区域;前景区域的显著值可能较低,但属于聚焦区域或距离图像中心较近的区域。因此,为了抑制背景区域的显著性,增强前景区域的显著性,用聚焦度信息结合空间位置构建前/背景概率函数${W_{{\rm{Fg}}}}({p_i})$${W_{{\rm{Bg}}}}({p_i}) $,具体为

$ {W_{{\rm{Fg}}}}({p_i}) = \exp \left( { - \frac{{{F_{{\rm{Bg}}}}({p_i})}}{{2\sigma _f^2}} \cdot \frac{{{{\left\| {\mathit{\boldsymbol{I}}_o^*({p_i}) - \mathit{\boldsymbol{o}}} \right\|}^2}}}{{\max {{\left\| {\mathit{\boldsymbol{I}}_o^*({p_j}) - \mathit{\boldsymbol{o}}} \right\|}^2}}}} \right) $ (6)

$ {W_{{\rm{Bg}}}}({p_i}) = 1 - {W_{{\rm{Fg}}}}({p_i}) $ (7)

式中,${F_{{\rm{Bg}}}}$表示背景图像${\mathit{\boldsymbol{I}}_{{\rm{Bg}}}}$的聚焦度图,$\mathit{\boldsymbol{I}}_o^ * ({p_i})$和$\mathit{\boldsymbol{I}}_o^ * ({p_j})$分别为超像素${p_i}$和${p_j}$的中心位置,$\mathit{\boldsymbol{o}}$为全聚焦图像的中心位置,${\left\| {\mathit{\boldsymbol{I}}_o^ * ({p_i}) - \mathit{\boldsymbol{o}}} \right\|^2}$表示超像素中心到图像中心的二范数距离的平方。实验中${\sigma _f} = 0.15$。
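式(6)(7)的一个示意实现如下。其中 centers(各超像素中心坐标)、img_center(图像中心坐标)、F_bg(背景切片在各超像素上的平均聚焦度,建议先归一化到[0, 1])均为本示意假设的输入:

```python
import numpy as np

def fg_bg_probability(F_bg, centers, img_center, sigma_f=0.15):
    """按式(6)(7)计算前/背景概率 W_Fg、W_Bg。

    F_bg:       (m,) 背景图像聚焦度 F_Bg(p_i)
    centers:    (m, 2) 超像素中心坐标 I_o*(p_i)
    img_center: (2,) 全聚焦图像中心坐标 o
    """
    d2 = np.sum((centers - img_center) ** 2, axis=1)     # 到图像中心的距离平方
    d2 = d2 / (d2.max() + 1e-12)                         # 按最大距离平方归一化
    W_fg = np.exp(-(F_bg / (2 * sigma_f ** 2)) * d2)     # 式(6)
    W_bg = 1.0 - W_fg                                    # 式(7)
    return W_fg, W_bg
```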

1.2 显著性检测

1.2.1 颜色特征和前景特征

颜色线索是最直接、最高效计算对比度显著性的方法。而且,一个区域的显著值取决于与它周围区域的差异,距离越近的区域对显著值的贡献越高(Cheng等,2011)。因此,用全局颜色对比度差异结合位置先验在全聚焦图像上计算颜色对比度特征${\mathit{\boldsymbol{S}}_c}({p_i}) $,具体为

$ {\mathit{\boldsymbol{S}}_c}({p_i}) = \sum\limits_{j = 1}^m {{D_c}} ({p_i},{p_j}){W_o}({p_i},{p_j}) $ (8)

式中${D_c}({p_i}, {p_j}) $是超像素对${p_i} $${p_j} $在LAB颜色空间的欧氏距离,$ {W_o}({p_i}, {p_j}) = {\rm{exp}}\left({ - \frac{{\left\| {\mathit{\boldsymbol{I}}_o^ * ({p_i}) - \mathit{\boldsymbol{I}}_o^ * ({p_j})} \right\|}}{{2\sigma _w^2}}} \right)$表示超像素对$ {p_i}$$ {p_j}$的空间距离权重,更远的区域分配较小的权重。实验中${\sigma _w} $取0.67。

在前景与背景颜色/纹理相似或背景混杂的情况下,用颜色对比度计算显著图可能效果不佳,但通过焦堆栈图像的聚焦度信息,有助于将前景区域与相似或混杂的背景分离。在前景与背景区域深度相近的情况下,用前景先验计算显著图可能效果不理想,但可通过颜色对比度计算显著图。因此,加入前景先验与颜色对比相互补充。用前景图像集$\{ \mathit{\boldsymbol{I}}_{{\rm{Fg}}}^l\} \left({l = 1, \ldots, L} \right) $的聚焦度图$ \mathit{\boldsymbol{F}}_{{\rm{Fg}}}^l$的平均值,计算前景特征${\mathit{\boldsymbol{S}}_f}({p_i}) $,具体为

$ {\mathit{\boldsymbol{S}}_f}({p_i}) = \frac{{\sum\limits_{l = 1}^L {\mathit{\boldsymbol{F}}_{{\rm{Fg}}}^l} ({p_i})}}{L} $ (9)
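式(8)(9)的颜色对比度特征与前景特征可示意性地实现如下。lab_means(超像素LAB均值颜色)、centers(超像素中心坐标)、fg_focus(前景图像集各切片的超像素平均聚焦度)为本示意假设的输入变量:

```python
import numpy as np
from scipy.spatial.distance import cdist

def color_contrast(lab_means, centers, sigma_w=0.67):
    """按式(8)计算颜色对比度特征 S_c(p_i)。"""
    D_c = cdist(lab_means, lab_means)                    # LAB 空间欧氏距离
    d_pos = cdist(centers, centers)                      # 超像素中心间的空间距离
    W_o = np.exp(-d_pos / (2 * sigma_w ** 2))            # 空间距离权重, 越远权重越小
    return (D_c * W_o).sum(axis=1)

def foreground_feature(fg_focus):
    """按式(9)计算前景特征: 前景图像集各切片聚焦度的平均。

    fg_focus: (L, m) 每行为一幅前景切片的超像素聚焦度
    """
    return np.asarray(fg_focus).mean(axis=0)
```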

1.2.2 显著性计算

为了增强对比度,用前/背景概率函数引导前景特征和颜色特征分别生成前景显著图${\mathit{\boldsymbol{S'}}_f}({p_i}) $和颜色显著图${\mathit{\boldsymbol{S'}}_c}({p_i}) $,具体为

$ \begin{array}{*{20}{l}} {\mathit{\boldsymbol{S}}_c^\prime ({p_i}) = {\mathit{\boldsymbol{S}}_c}({p_i}) \cdot {W_{{\rm{Bg}}}}({p_j})}\\ {\mathit{\boldsymbol{S}}_f^\prime ({p_i}) = {\mathit{\boldsymbol{S}}_f}({p_i}) \cdot {W_{{\rm{Fg}}}}({p_i})} \end{array} $ (10)

对颜色显著图进行显著性优化(Zhu等,2014),获得更清晰的前景对象,然后用贝叶斯框架整合前景显著图${\mathit{\boldsymbol{S'}}_f}({p_i}) $和颜色显著图${\mathit{\boldsymbol{S'}}_c}({p_i}) $得到显著图$ {\mathit{\boldsymbol{S}}_B}({p_i})$ (Li等,2013b),具体为

$ {\mathit{\boldsymbol{S}}_B}(\mathit{\boldsymbol{S}}_f^\prime (z),\mathit{\boldsymbol{S}}_c^\prime (z)) = p({\mathit{\boldsymbol{F}}_f}|\mathit{\boldsymbol{S}}_c^\prime (z)) + p({\mathit{\boldsymbol{F}}_c}|\mathit{\boldsymbol{S}}_f^\prime (z)) $ (11)

式中,$p({\mathit{\boldsymbol{F}}_f}|\mathit{\boldsymbol{S}}_c^\prime (z))$是以颜色显著图为观测条件得到的后验概率,$p({\mathit{\boldsymbol{F}}_c}|\mathit{\boldsymbol{S}}_f^\prime (z))$是以前景显著图为观测条件得到的后验概率,$z$为像素点。
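式(10)(11)的引导与贝叶斯融合可粗略示意如下。需要说明的是:此处将式(10)的引导按逐超像素相乘处理,似然项采用直方图粗估,阈值与分箱数均为假设,具体细节与Li等人(2013b)的原始实现可能不同:

```python
import numpy as np

def _norm01(x):
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min() + 1e-12)

def bayes_fuse(prior, observation, n_bins=16):
    """以 prior 为先验、observation 为观测, 粗略估计后验概率(两者均为[0,1]的超像素显著值)。"""
    fg = prior >= prior.mean()                           # 以均值为阈值划分前/背景区域
    bins = np.clip((observation * n_bins).astype(int), 0, n_bins - 1)
    hist_f = np.bincount(bins[fg], minlength=n_bins) / max(fg.sum(), 1)
    hist_b = np.bincount(bins[~fg], minlength=n_bins) / max((~fg).sum(), 1)
    like_f, like_b = hist_f[bins], hist_b[bins]          # 查表得到似然
    return prior * like_f / (prior * like_f + (1 - prior) * like_b + 1e-12)

def fuse_saliency(S_c, S_f, W_fg, W_bg):
    """式(10)引导(按逐超像素相乘理解)与式(11)贝叶斯融合的示意。"""
    S_c2 = _norm01(S_c * W_bg)                           # 式(10), 引导后的颜色显著图
    S_f2 = _norm01(S_f * W_fg)                           # 式(10), 引导后的前景显著图
    S_B = bayes_fuse(S_f2, S_c2) + bayes_fuse(S_c2, S_f2)   # 式(11)
    return _norm01(S_B)
```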

2 显著性传播

针对超像素边缘显著性比内部更突出所造成的方块效应,以及同一物体显著值不一致的问题,提出用基于K-NN图模型的显著性传播算法优化显著图,以更加均匀地突出显著区域。引入K-NN连接后,无论超像素在空间上是否相邻,都可以在相似区域之间传播显著值,增强空间一致性,得到更加精确的显著值。

基于K-NN增强图的显著性传播算法(Zhu等,2017)结合邻接和K邻近超像素的相似度定义加权图模型$ G = \left\{ {V,\mathit{\boldsymbol{E}}} \right\}$, 其中$V $表示节点集合$p = \{ {p_1}, {p_2}, \ldots, {p_m}\} $,边$\mathit{\boldsymbol{E}} = \{ {e_{ij}}\} $是任意节点间的连接,是关联矩阵${\mathit{\boldsymbol{W}}_a} $${\mathit{\boldsymbol{W}}_k} $的加权。具体计算为

$ {\mathit{\boldsymbol{E}} = (1 - \alpha ) \cdot {\mathit{\boldsymbol{W}}_a} + \alpha \cdot {\mathit{\boldsymbol{W}}_k}} $ (12)

$ {{W_a}(i,j) = \exp \left( { - {{\left( {\frac{{{D_c}({p_i},{p_j})}}{{{\sigma _1}}}} \right)}^2}} \right) \cdot {\mathit{\boldsymbol{a}}_{i,j}}} $ (13)

$ {W_k}(i,j) = \left\{ {\begin{array}{*{20}{l}} {\exp \left( { - {{\left( {\frac{{{D_c}({p_i},{p_j})}}{{{\sigma _2}}}} \right)}^2}} \right)}&{j \in {N_i}}\\ 0&{{\rm{其他}}} \end{array}} \right. $ (14)

式中,${\mathit{\boldsymbol{W}}_a} $是邻接超像素对$ ({p_i}, {p_j})$在LAB颜色空间的关联矩阵,$ {\mathit{\boldsymbol{W}}_k}$是K邻近超像素对$ ({p_i}, {p_j})$在LAB颜色空间的关联矩阵。${\mathit{\boldsymbol{a}}_{i, j}} $是一个二进制矩阵,如果超像素${p_i} $和超像素${p_j} $邻接则为1。${N_i} $是LAB颜色空间中超像素$ {p_i}$的K-NN邻近域,${\sigma _1} $$ {\sigma _2}$控制节点间相似度。实验中$ \alpha $ =0.8。
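式(12)至式(14)的K-NN增强图构建可示意如下。原文未给出K、σ1、σ2的具体取值,代码中的默认值仅为假设;adjacency 为超像素的0/1邻接矩阵,属假设输入:

```python
import numpy as np
from scipy.spatial.distance import cdist

def build_graph(lab_means, adjacency, K=5, alpha=0.8, sigma1=0.2, sigma2=0.2):
    """按式(12)至式(14)构建K-NN增强图的边权矩阵 E。

    lab_means: (m, 3) 超像素 LAB 均值颜色
    adjacency: (m, m) 0/1 邻接矩阵 a_{i,j}
    """
    D_c = cdist(lab_means, lab_means)                    # LAB 颜色距离
    W_a = np.exp(-(D_c / sigma1) ** 2) * adjacency       # 式(13), 邻接关联矩阵
    W_k = np.zeros_like(D_c)                             # 式(14), K 近邻关联矩阵
    knn = np.argsort(D_c, axis=1)[:, 1:K + 1]            # 颜色空间 K 近邻(去掉自身)
    rows = np.repeat(np.arange(D_c.shape[0]), K)
    W_k[rows, knn.ravel()] = np.exp(-(D_c[rows, knn.ravel()] / sigma2) ** 2)
    return (1 - alpha) * W_a + alpha * W_k               # 式(12)
```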

一旦指定了图$ \mathit{\boldsymbol{E}}$,即可将显著性传播公式转化为标准的半监督学习任务。最小化下面的能量函数得到超像素$ {p_i}$的显著值$ S({p_i})$,具体为

$ \varepsilon (S({p_i})|{S_B}({p_i})) = \sum\limits_i {{k_{ii}}{{(S({p_i}) - {S_B}({p_i}))}^2}} + \frac{1}{2}\sum\limits_{i,j} {\mathit{\boldsymbol{E}}({p_i},{p_j}){{(S({p_i}) - S({p_j}))}^2}} $ (15)

式中,对角矩阵$\mathit{\boldsymbol{K}} = {\rm{diag}}\{ {k_{11}}, {k_{22}}, \ldots, {k_{nn}}\}, {k_{ii}} = \sum\limits_j {E\left({i, j} \right)} $。这个优化问题封闭形式的解为

$ S({p_i}) = {(2 \cdot \mathit{\boldsymbol{K}} - \mathit{\boldsymbol{E}})^{ - 1}} \cdot (\mathit{\boldsymbol{K}} \cdot {S_B}({p_i})) $ (16)
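式(15)(16)的闭式解可直接通过求解线性方程组实现,示意如下(E 为上述边权矩阵,S_B 为贝叶斯融合得到的初始显著值):

```python
import numpy as np

def propagate_saliency(E, S_B):
    """按式(16)求解式(15)的闭式解, 得到传播优化后的显著值 S。

    E:   (m, m) 图的边权矩阵
    S_B: (m,)   初始显著值
    """
    k = E.sum(axis=1)                         # k_ii = sum_j E(i, j)
    K = np.diag(k)                            # 对角矩阵 K
    S = np.linalg.solve(2 * K - E, K @ S_B)   # 等价于 (2K - E)^(-1) (K S_B)
    return (S - S.min()) / (S.max() - S.min() + 1e-12)
```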

3 实验结果与分析

为了评估所提出方法中不同光场线索对显著性检测的影响,绘制了不同线索的查准率—查全率(precision and recall,PR)曲线,如图 2所示。可以看出,不同光场线索是相辅相成的,每个线索在显著性检测方面有其独特的优势,单独的线索都不能达到良好的效果。当组合颜色对比度和聚焦度线索时,本文方法的性能得到明显改善。在加入显著性传播机制后,显著性检测效果进一步提高,验证了本文提出的光场显著性检测方法的有效性。

图 2 不同光场线索的PR曲线
Fig. 2 PR curves of various light field cues

为了验证提出方法的有效性和先进性,将本文的实验结果与主流的光场显著性检测传统方法LFS算法(Li等,2017)、WSC算法(Li等,2015)、DILF算法(Zhang等,2015)、深度学习方法LF-Net算法(Zhang等,2020)及2D显著性检测深度学习MWS(multi-source weak supervision)算法(Zeng等,2019)模型进行对比分析。LFS、WSC、LF-Net和本文算法输入均为全聚焦图像和焦堆栈图像,DILF算法增加了深度图,MWS算法输入为全聚焦图像。所有实验基于唯一公开并广泛使用的光场图像显著性检测数据集(light field saliency dataset,LFSD)。LFSD数据集由Lytro光场相机拍摄,包括60幅室内场景和40幅室外场景图像。其中大部分图像前景背景颜色较为相似,背景也相对复杂,且部分图像含有两个及以上的显著物体。

不同显著性检测算法的视觉效果如图 3所示。本文选取了5幅测试样例图像,前4幅为单目标图像,最后1幅为多目标图像。可以看出,LFS算法在背景复杂时不能很好地抑制背景区域,且同一物体的显著值不一致(图 3(b))。WSC算法检测的显著对象不够完整,且方块效应严重(图 3(c))。DILF算法检测的显著对象方块效应严重,且对复杂背景不能很好地抑制(图 3(d))。深度学习LF-Net算法和MWS算法检测能较好地区分显著区域与非显著区域,但边缘比较模糊(图 3(e)图 3(f))。而本文算法可以有效地抑制背景区域,均匀地突出整个显著对象,边缘也更加清晰,显著图更加符合人眼视觉感知(图 3(g))。

图 3 不同算法显著性检测结果对比图
Fig. 3 Visual comparison of various light field properties on LFSD datasets
((a) all-focus images; (b) LFS; (c) WSC; (d) DILF; (e) LF-Net; (f) MWS; (g) ours; (h) ground truth)

为了更加客观地对比实验,用PR曲线评估不同算法在整个LFSD数据集上的实验结果,但PR曲线仅考虑显著对象的显著值是否高于背景的显著值。因此,引入F-measure和MAE评价指标作为补充。

F-measure是查准率和查全率的加权调和平均,是一项整体性度量指标,具体为

$ {F_\beta } = \frac{{(1 + {\beta ^2})P \times R}}{{{\beta ^2} \cdot P + R}} $

式中,$P$为查准率,$R$为查全率。为强调查准率的作用,实验中设置${\beta ^2} = 0.3 $ (Achanta等,2009)。

MAE为平均绝对误差,直接测量显著图与真值图的相近程度,具体为

$ MAE = \frac{1}{{w \times h}}\sum\limits_{x = 1}^w {\sum\limits_{y = 1}^h {\left| {S(x,y) - GT(x,y)} \right|} } $

式中,$ w$$h $分别代表图像的宽度与高度。

图 4是不同算法PR曲线的对比结果。可以看出,本文算法的最小召回率值比其他方法高,因为本文算法的显著图更加平滑。在同等条件下,本文算法的PR曲线优于LFS和WSC算法。DILF算法添加了深度图,但在召回率小于0.8时,DILF算法的精度仍然低于本文算法。当召回率大于0.85时,本文算法的精度比深度学习算法LF-Net略差。

图 4 不同算法PR曲线对比结果
Fig. 4 PR curves of various light field properties

图 5和图 6分别是不同算法F-measure和MAE的对比结果。可以看出,本文算法的查准率高于所有对比算法,F-measure和MAE仅次于深度学习算法LF-Net和MWS,优于其他传统方法。尽管深度学习在图像处理中展现了极大潜力,但其需要大量数据进行训练,训练时间较长,且对硬件设备要求较高。实验结果表明,本文算法能够以高置信度定位大多数显著区域,同时可以均匀突出显著对象,抑制非显著区域。

图 5 不同算法F-measure对比结果
Fig. 5 F-measure of various light field properties
图 6 不同算法MAE对比结果
Fig. 6 MAE of various light field properties

4 结论

本文提出的基于聚焦度和传播机制的光场图像显著性检测方法,先对焦堆栈图像的聚焦度信息进行衡量,得到前景特征,再通过全聚焦图像得到颜色对比度特征,并用前/背景概率函数引导光场图像的前景特征和颜色特征,以提高显著图的准确率。另外,用显著性传播机制增强显著图的空间一致性,均匀地突出整个显著区域。本文方法在一定程度上改善了在前景和背景颜色/纹理相似或杂乱背景下,背景难抑制、检测对象不完整、边缘模糊及方块效应等问题,能够均匀地突出整个显著物体,并可以有效地抑制背景区域,边缘也更加清晰。在光场图像数据集LFSD上与LFS、WSC、DILF、LF-Net和MWS方法进行比较,实验结果表明,本文方法的查准率高于对比方法,F-Measure和MAE优于传统的光场图像显著性检测方法。但是,由于通过前景特征和颜色特征进行显著性检测,当显著目标与背景区域的深度相近且颜色对比度不高时,不具备很好的鲁棒性。随着光场数据的获取越来越方便,用深度学习进行光场图像显著性检测也进入探索阶段。在光场数据足够多的情况下,深度学习可以更好地提取特征,进一步提高显著性检测结果。在今后的工作中,将考虑用深度学习对光场图像的显著性进行研究,以提高光场图像显著性检测的鲁棒性。

参考文献

  • Achanta R, Hemami S, Estrada F and Süsstrunk S. 2009. Frequency-tuned salient region detection//Proceedings of 2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami, USA: IEEE: 1597-1604[DOI: 10.1109/CVPR.2009.5206596]
  • Achanta R, Shaji A, Smith K, Lucchi A, Fua P, Süsstrunk S. 2012. SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(11): 2274-2282 [DOI:10.1109/TPAMI.2012.120]
  • Borji A, Cheng M M, Hou Q B, Jiang H Z, Li J. 2019. Salient object detection:a survey. Computational Visual Media, 5(2): 117-150 [DOI:10.1007/s41095-019-0149-9]
  • Cheng M M, Zhang G X, Mitra N J, Huang X L and Hu S M. 2011. Global contrast based salient region detection//Proceedings of CVPR 2011. Providence, RI, USA: IEEE: 409-416[DOI: 10.1109/CVPR.2011.5995344]
  • Dai J F, Li Y, He K M and Sun J. 2016. R-FCN: object detection via region-based fully convolutional networks//Proceedings of the 30th International Conference on Neural Information Processing Systems. Barcelona, Spain: Curran Associates: 379-387[DOI: 10.5555/3157096.3157139]
  • Itti L, Koch C, Niebur E. 1998. A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(11): 1254-1259 [DOI:10.1109/34.730558]
  • Jiang P, Ling H B, Yu J Y and Peng J L. 2013. Salient region detection by UFO: uniqueness, focusness and objectness//Proceedings of 2013 IEEE International Conference on Computer Vision. Sydney, NSW, Australia: IEEE: 1976-1983[DOI: 10.1109/ICCV.2013.248]
  • Li X, Hu W M, Shen C H, Zhang Z F, Dick A. 2013a. A survey of appearance models in visual object tracking. ACM Transactions on Intelligent Systems and Technology, 4(4) [DOI:10.1145/2508037.2508039]
  • Li F and Porikli F. 2013. Harmonic variance: a novel measure for in-focus segmentation//Proceedings of the British Machine Vision Conference. Bristol, UK: BMVA Press: 1-11[DOI: 10.5244/C.27.33]
  • Li X H, Lu H C, Zhang L H, Ruan X and Yang M H. 2013b. Saliency detection via dense and sparse reconstruction//Proceedings of 2013 IEEE International Conference on Computer Vision. Sydney, NSW, Australia: IEEE: 2976-2983[DOI: 10.1109/ICCV.2013.370]
  • Li N Y, Sun B L and Yu J Y. 2015. A weighted sparse coding framework for saliency detection//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 5216-5223[DOI: 10.1109/CVPR.2015.7299158]
  • Li N Y, Ye J W, Ji Y, Ling H B and Yu J Y. 2014. Saliency detection on light field//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, USA: IEEE: 2806-2813[DOI: 10.1109/CVPR.2014.359]
  • Li N Y, Ye J W, Ji Y, Ling H B, Yu J Y. 2017. Saliency detection on light field. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(8): 1605-1616 [DOI:10.1109/TPAMI.2016.2610425]
  • Lu S, Mahadevan V and Vasconcelos N. 2014. Learning optimal seeds for diffusion-based salient object detection//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, USA: IEEE: 2790-2797[DOI: 10.1109/CVPR.2014.357]
  • Peng H W, Li B, Xiong W H, Hu W M and Ji R R. 2014. RGBD salient object detection: a benchmark and algorithms//Proceedings of the 13th European Conference on Computer Vision. Zurich, Switzerland: Springer: 92-109[DOI: 10.1007/978-3-319-10578-9_7]
  • Piao Y R, Li X, Zhang M, Yu J Y, Lu H C. 2019. Saliency detection via depth-induced cellular automata on light field. IEEE Transactions on Image Processing, 29: 1879-1889 [DOI:10.1109/TIP.2019.2942434]
  • Ren J Q, Gong X J, Yu L, Zhou W H and Yang M Y. 2015. Exploiting global priors for RGB-D saliency detection//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops. Boston, USA: IEEE: 25-32[DOI: 10.1109/CVPRW.2015.7301391]
  • Yuan Q, Cheng Y F, Chen X Q. 2018. Saliency detection based on multiple priorities and comprehensive contrast. Journal of Image and Graphics, 23(2): 239-248 (袁巧, 程艳芬, 陈先桥. 2018. 多先验特征与综合对比度的图像显著性检测. 中国图象图形学报, 23(2): 239-248) [DOI:10.11834/jig.170381]
  • Zeng Y, Zhuge Y Z, Lu H C, Zhang L H, Qian M Y and Yu Y Z. 2019. Multi-source weak supervision for saliency detection//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 6067-6076[DOI: 10.1109/CVPR.2019.00623]
  • Zhang J, Liu Y M, Zhang S P, Poppe R, Wang M. 2020. Light field saliency detection with deep convolutional networks. IEEE Transactions on Image Processing, 29: 4421-4434 [DOI:10.1109/TIP.2020.2970529]
  • Zhang J, Wang M, Gao J, Wang Y, Zhang X D and Wu X D. 2015. Saliency detection with a deeper investigation of light field//Proceedings of the 24th International Joint Conference on Artificial Intelligence. Buenos Aires, Argentina: AAAI: 2212-2218
  • Zhu L, Ling H B, Wu J, Deng H P and Liu J. 2017. Saliency pattern detection by ranking structured trees//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 5468-5477[DOI: 10.1109/ICCV.2017.583]
  • Zhu W J, Liang S, Wei Y C and Sun J. 2014. Saliency optimization from robust background detection//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, OH, USA: IEEE: 2814-2821[DOI: 10.1109/CVPR.2014.360]