Published: 2020-12-16
DOI: 10.11834/jig.190657
2020 | Volume 25 | Number 12

Image Processing and Coding

RVIN noise detector based on statistical features of local specific spatial relationships

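Yu Haiwen, Yi Xinwei, Xu Shaoping, Lin Zhenyu, Liu Ruirui

School of Information Engineering, Nanchang University, Nanchang 330031, China

Received: 2019-12-06; Revised: 2020-05-13; Preprint: 2020-05-20

Funding: National Natural Science Foundation of China (61662044, 61163023, 51765042); Natural Science Foundation of Jiangxi Province (20171BAB202017)

First author: Yu Haiwen, born in 1972, female, lecturer. Her research interests include graphics and image processing and machine vision. E-mail: yuhaiwen@ncu.edu.cn
Yi Xinwei, male, undergraduate student. His research interest is image processing. E-mail: 6105117013@email.ncu.edu.cn
Lin Zhenyu, female, master's student. Her research interests include image processing and machine vision. E-mail: 401030918076@email.ncu.edu.cn
Liu Ruirui, female, master's student. Her research interests include image processing and machine vision. E-mail: 411014519042@email.ncu.edu.cn

CLC number: TP391

Document code: A

Article ID: 1006-8961(2020)12-2494-11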

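Abstract

Objective Random-valued impulse noise (RVIN) detectors rely on local image statistics (LIS) to decide whether the center pixel of an image patch is noise. Existing LIS, however, have weak descriptive power, which limits the detection accuracy of RVIN detectors to varying degrees and degrades the restoration quality of the subsequent switching denoising module. An RVIN detector based on statistical features of local specific spatial relationships is therefore proposed. Method The rank-ordered logarithmic difference (ROLD) values of the eight neighbors of the center pixel are combined with one minimum orientation logarithmic difference (MOLD) value. These nine LIS, which reflect local specific spatial relationships, form an RVIN noise-aware feature vector that describes whether the center pixel is RVIN. Feature vectors extracted from a large number of sample patches, together with their noise labels, are used as training pairs to train an RVIN detector based on a multilayer perceptron (MLP). Result Comparative experiments evaluate the proposed detector in terms of both detection accuracy and practical denoising performance, on 10 commonly used images and 50 textured images from the Berkeley segmentation data (BSD) set. The detector is compared with the noise detectors embedded in classical impulse denoising algorithms and with the MLP neural network classifier (MLPNNC) detector, using the numbers of missed detections, false detections, and total detection errors as accuracy metrics. On the commonly used images, the proposed detector yields balanced numbers of missed and false detections and ranks among the top two of all compared algorithms in total detection errors, which provides a sound basis for the subsequent denoising module. On the BSD textured images, the proposed detector is combined with the generic iteratively reweighted annihilating filter (GIRAF) algorithm to form an RVIN denoising algorithm (proposed-GIRAF). Its average peak signal-to-noise ratio (PSNR) over the 50 BSD images is the best at every noise ratio, exceeding the second-ranked algorithm by 0.47 to 1.96 dB. The experimental data show that the proposed detector is more accurate than existing detectors and that pairing it with a restoration algorithm yields a switching RVIN denoising algorithm with better performance. Conclusion The proposed RVIN detector predicts robustly at every noise ratio. Combined with GIRAF, it achieves the best denoising results among the classical RVIN denoising algorithms compared, and is therefore of strong practical value.

Keywords

image denoising; random-valued impulse noise; local spatial structure; eight-neighbor rank-ordered logarithmic difference; minimum orientation logarithmic difference; multilayer perceptron; detection accuracy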

Random-valued impulse noise detector using local spatial structure statistics

Yu Haiwen, Yi Xinwei, Xu Shaoping, Lin Zhenyu, Liu Ruirui

School of Information Engineering, Nanchang University, Nanchang 330031, China

Supported by: National Natural Science Foundation of China(61662044, 61163023, 51765042)

Abstract

Objective Random-valued impulse noise (RVIN) is a common cause of image degradation. It is typically introduced by faulty pixels in digital camera sensors, erroneous storage locations, or transmission over noisy channels. Besides degrading image quality, impulse noise complicates feature extraction, target tracking, image classification, and other subsequent image processing and analysis tasks. For RVIN, the value of a corrupted pixel is uniformly distributed between 0 and 255, which makes the noise very difficult to detect. The local image statistics available for RVIN detection, which are used to determine whether the center pixel of an image patch is corrupted by RVIN, have relatively weak descriptive ability. This restricts detection accuracy to some extent and affects the restoration performance of subsequent switching RVIN denoising modules. Method Nine local image statistics, namely eight-neighbor rank-ordered logarithmic difference (ROLD) statistics and one minimum orientation logarithmic difference (MOLD) statistic, were used to construct a highly sensitive RVIN noise-aware feature vector that describes the RVIN likeness of the center pixel of a given patch. The feature vectors extracted from numerous noisy patches and their corresponding noise labels formed a set of training pairs, on which a multilayer perceptron (MLP) based RVIN detector was trained. Result Comparative experiments were performed to test the detection accuracy and denoising effect of the proposed RVIN detector. 
The proposed detector was compared with the detectors used in several state-of-the-art image denoising methods, including progressive switching median filter (PSMF), ROLD-edge preserving regularization (ROLD-EPR), adaptive switching median (ASWM), robust outlyingness ratio nonlocal means (ROR-NLM), MLP-edge preserving regularization (MLP-EPR), convolutional neural network based (CNN-based), blind convolutional neural network (BCNN), and MLP neural network classifier (MLPNNC), to demonstrate its detection accuracy. Two image sets were used in the experiments. One image set included the "Lena", "House", "Peppers", "Couple", "Hill", "Barbara", "Boat", "Man", "Cameraman", and "Monarch" images, whereas the other set contained 50 textured images that were randomly selected from the BSD database (and differ from the training set of the noise detection model). For a fair comparison, all competing algorithms were implemented in the MATLAB R2017b environment on the same hardware platform. To verify the detection accuracy of the proposed RVIN detector, we applied different RVIN noise ratios to images taken from the commonly used image set, counted the missed detections, false detections, and total detection errors of the proposed detector on each noisy image, and compared its performance with that of the detectors in existing classical RVIN noise reduction algorithms. Usually, a larger number of missed detections indicates that more noise has been left undetected in an image, whereas a false detection causes a normal undistorted pixel to be processed during the noise reduction stage, which can lead to blurry images. The total number of errors is the sum of missed and false detections; a smaller total corresponds to a lower detection error rate and a better image quality after noise reduction. 
Experimental results show that the proposed detector has a relatively balanced number of missed and false detections and ranks among the top two of all compared algorithms in total detection errors, thereby offering a solid foundation for the subsequent noise reduction module. On the second image set, we combined the proposed RVIN detector with the generic iteratively reweighted annihilating filter (GIRAF) algorithm to form an RVIN noise reduction algorithm (proposed-GIRAF). To verify the effectiveness of the proposed detector, we applied different ratios of RVIN noise (i.e., 10%, 20%, 30%, 40%, 50%, and 60%) to the 50 textured images and recorded the average peak signal-to-noise ratio (PSNR) of the restored images under each noise ratio. Experimental results show that the images restored by the proposed-GIRAF algorithm achieve the best PSNR under each noise ratio and that this algorithm clearly outperforms the CNN-based, BCNN-GIRAF, and MLPNNC-GIRAF algorithms. The proposed-GIRAF algorithm also outperforms the second-best algorithm by 0.47 dB to 1.96 dB in terms of the average PSNR over the 50 images, thereby suggesting that the actual detection results of the proposed noise detector are the most effective for the subsequent noise reduction module. Experimental results also show that the proposed RVIN detector outperforms most of the existing detectors in terms of detection accuracy. As such, a switching RVIN removal method with improved denoising performance can be obtained by combining the proposed RVIN detector with an inpainting algorithm. Conclusion Extensive experiments show that the detection accuracy of the proposed MLP-based noise detector is robust across a wide range of noise ratios. When combined with the GIRAF algorithm, this detector significantly outperforms the traditional RVIN denoising algorithms in terms of denoising effect.

Key words

image denoising; random-valued impulse noise (RVIN); local spatial structure; eight-neighbor rank-ordered logarithmic difference (EN-ROLD); minimum orientation logarithmic difference (MOLD); multilayer perceptron (MLP); detection accuracy

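0 Introduction

Digital images are often contaminated by impulse noise (IN) caused by faulty pixels in camera sensors, erroneous storage locations, or transmission over noisy channels. The resulting loss in image quality complicates subsequent image processing and analysis tasks such as feature extraction (Nie et al., 2019), object tracking (Li et al., 2018), and image classification (Guo et al., 2018). Impulse noise is generally divided into fixed-valued impulse noise (FVIN) and random-valued impulse noise (RVIN) (Jin and Ye, 2018; Xu et al., 2018a). This paper focuses on detecting RVIN, which is the harder of the two to detect.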

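After analyzing the statistical relationship between the center pixel and all other pixels in a local window, researchers proposed a family of RVIN detection methods based on local image statistics (LIS) (Garnett et al., 2005; Dong et al., 2007; Xu and Tan, 2014; Liu et al., 2015). Dong et al. (2007) applied a logarithmic transform to the rank-ordered absolute difference (ROAD) statistic (Garnett et al., 2005) to amplify the difference between noisy and noise-free pixels, which gave the rank-ordered logarithmic difference (ROLD) statistic, and applied it iteratively to raise detection accuracy and hence denoising performance. The performance of the resulting ROLD-EPR (ROLD-edge preserving regularization) denoising algorithm, however, depends heavily on preset thresholds and is constrained in every iteration by the EPR (edge preserving regularization) denoising step; many parameters need tuning and execution is slow. Xu et al. (2018b) improved on ROLD by applying a piecewise power function to the absolute differences between the center pixel and its neighbors, sorting the results, and taking the $m$ smallest values as the rank-ordered power difference (ROPD) statistic. ROPD amplifies the difference between noisy and noise-free pixels further and detects more of the noisy pixels that the ROLD detector misses. In short, all these LIS consider only the relationship among pixel intensities within the local window and ignore how the pixels are distributed over specific spatial positions. RVIN detectors built on them need well-chosen thresholds and several iterations to reach high detection accuracy. Because natural images are rich in content and the descriptive power of these LIS is limited, the resulting detectors are also inefficient.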

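To overcome these shortcomings of LIS-based RVIN detection, researchers have built noise detectors with machine learning. Such detectors take LIS features extracted from a patch as input and use a pre-trained model to decide whether the center pixel is RVIN. Turkmen (2016) fed the ROAD and ROLD statistics into a multilayer perceptron (MLP) neural network and trained the detector with the corresponding noise labels as output. Roy et al. (2016) built an FVIN detector using five features as input to a support vector machine (SVM): the prediction error, the intensity of the center pixel, the median intensity in the local window, the mean intensity in the local window, and the absolute difference between the center intensity and the median. Soleimany and Hamghalam (2017) extracted five features from each patch, namely gray-level difference (GD), average background difference (ABD), accumulation complexity difference (ACD), ROLD, and ROAD, and detected RVIN with a multilayer perceptron. Kumar and Nagaraju (2018) formed an input vector from 10 statistical features of the noisy image, including the prediction error and the median intensity, and mapped it to noise labels with an SVM. These learning-based detectors run faster than traditional iterative methods that compare a statistic with a threshold. However, all four works merely combine several existing LIS as the model input and still pay no attention to the spatial relationships among pixels in the local window. Their detection accuracy therefore shows no significant advantage over the traditional iterative methods and needs further improvement.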

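For a trained RVIN detector, the descriptive power of the LIS features fed to the model largely determines detection accuracy: the better the features describe RVIN, the higher the accuracy. This paper therefore proposes an RVIN detector based on statistical features of local specific spatial relationships. Specifically, the ROLD values of the eight neighbors of the center pixel are computed within the local window. These eight-neighbor ROLD (EN-ROLD) statistics describe more finely the relationship between the center pixel and the pixels above, below, left, right, upper left, upper right, lower left, and lower right of it. In addition, to improve detection of noisy pixels on image edges, the minimum orientation logarithmic difference (MOLD) statistic, which describes edge behavior along the horizontal, vertical, left-diagonal, and right-diagonal directions, is included. The eight EN-ROLD values and the MOLD value together form the RVIN noise-aware feature vector. Finally, an MLP neural network, with its strong nonlinear mapping ability, maps the feature vector to a noise label (0 for an undistorted pixel, 1 for a noisy pixel). Experimental data show that the proposed detector outperforms the mainstream RVIN detectors it is compared with and yields better denoising when paired with a restoration (inpainting) algorithm.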

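1 Overview of ROLD

1.1 The ROLD statistic

The ROLD statistic (Dong et al., 2007) is computed as follows. In a local window of size $(2N+1)\times(2N+1)$ whose center pixel has coordinates $(i, j)$, the set of coordinates of all pixels in the window is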

$ {\mathit{\boldsymbol{ \boldsymbol{\varOmega} }}_{(i,j)}}(N) = \{ (i + s,j + t)| - N \le s,t \le N\} $

(1)

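Let $\mathit{\boldsymbol{ \boldsymbol{\varOmega} }}_{(i,j)}^0(N) = \left\{ {{\mathit{\boldsymbol{ \boldsymbol{\varOmega} }}_{(i,j)}}(N)\backslash (i,j)} \right\}$ denote the set of coordinates of the neighbors of the center pixel, where "\" removes the point $(i, j)$ from the set. The logarithmic absolute difference between the intensity $y_{i+s, j+t}$ of each neighbor and the intensity $y_{i, j}$ of the center pixel is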

$ \begin{array}{*{20}{c}} {{{\tilde D}_{s,t}}({y_{i,j}}) = {{\log }_a}|{y_{i + s,j + t}} - {y_{i,j}}|}\\ {\forall (s,t) \in \mathit{\boldsymbol{ \boldsymbol{\varOmega} }}_{(i,j)}^0(N)} \end{array} $

(2)

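For any $a>1$, $\widetilde{D}_{s, t}(y_{i, j}) \in (-\infty, 0]$; this bound holds when the intensities are normalized to $[0, 1]$, so that the absolute difference is at most 1. To bring $\widetilde{D}_{s, t}(y_{i, j})$ into $[0, 1]$, it is truncated and normalized as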

$ \begin{array}{*{20}{c}} {{D_{s,t}}({y_{i,j}}) = 1 + \max \{ {{\log }_a}|{y_{i + s,j + t}} - {y_{i,j}}|, - b\} /b}\\ {\forall (s,t) \in \mathit{\boldsymbol{ \boldsymbol{\varOmega} }}_{(i,j)}^0(N)} \end{array} $

(3)

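where $a=2$ and $b=5$ are the best parameter settings given by Dong et al. (2007). Sort the values $D_{s, t}(y_{i, j})$ in ascending order and let $R_{k}(y_{i, j})$ be the $k$-th smallest. The ROLD statistic is then defined as the sum of the $m$ smallest normalized logarithmic absolute differences, that is,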

$ ROL{D_m}({y_{i,j}}) = \sum\limits_{k = 1}^m {{R_k}} ({y_{i,j}}) $

(4)

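where $m$ is usually set to half the number of pixels in the window. The ROLD impulse noise detector compares the ROLD value of the patch center with a preset threshold $T$: if $ROLD_{m}(y_{i, j})>T$, the center pixel of the current window is classified as noise; otherwise it is classified as an undistorted pixel.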
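Eqs. (1) to (4) can be illustrated with a short Python sketch. It is not the authors' MATLAB implementation. It assumes a square patch given as nested lists with intensities already normalized to $[0, 1]$, takes $m$ as half the number of neighbors (12 for a 5×5 window), and maps a zero difference directly to the truncation value $-b$; the function name `rold` is chosen here for illustration.

```python
import math


def rold(patch, m=None, a=2.0, b=5.0):
    """ROLD_m of the centre pixel of a square patch with intensities in [0, 1]."""
    size = len(patch)
    c = size // 2
    centre = patch[c][c]
    dists = []
    for s in range(size):
        for t in range(size):
            if s == c and t == c:
                continue  # the centre pixel is excluded from its own neighbourhood
            diff = abs(patch[s][t] - centre)
            # Eq. (2): log_a of the absolute difference; a zero difference is
            # treated as the truncation value -b (assumption of this sketch).
            log_d = math.log(diff, a) if diff > 0 else -b
            # Eq. (3): truncate at -b and rescale to [0, 1].
            dists.append(1.0 + max(log_d, -b) / b)
    dists.sort()
    if m is None:
        m = len(dists) // 2  # half of the neighbours
    # Eq. (4): sum of the m smallest normalized differences.
    return sum(dists[:m])


flat = [[0.5] * 5 for _ in range(5)]
spike = [[0.0] * 5 for _ in range(5)]
spike[2][2] = 1.0
print(rold(flat), rold(spike))
```

A flat patch gives the minimum value and an isolated extreme pixel gives the maximum, which is the behavior the threshold test relies on.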

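1.2 Existing problems

To illustrate the limitation of the ROLD statistic, 40% RVIN noise was added to the undistorted Lena image. A 5×5 patch $PA$ was taken from a smooth region and another 5×5 patch $PB$ from a region containing edge detail. To show the spatial distribution of intensities more clearly, each enlarged pixel is filled with its intensity value, as shown in Fig. 1.

Fig. 1 Comparison of the ROLD statistics of the center pixels of two image patches at different locations in the noisy image ((a) clean image; (b) noisy image)

Fig. 1(a) shows the pixel intensities of the two undistorted patches. Fig. 1(b) shows the pixel values after RVIN corruption together with the ROLD statistic of each patch center; the center pixels are marked in red. Comparing Fig. 1(a) and Fig. 1(b) shows that the center pixel of $PA$ is not corrupted whereas the center pixel of $PB$ is noise, yet their ROLD values are both large and very close (4.4238 and 4.3513, respectively). A detector that relies only on comparing ROLD with a threshold would classify both pixels as noise. Detecting impulse noise by thresholding the ROLD statistic therefore needs improvement: more descriptive LIS and a more sophisticated detection mechanism must be introduced.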

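2 Improved noise detector

2.1 Approach

As Eq. (4) shows, the ROLD statistic considers only the relationship among pixel intensities in the local window and ignores their spatial distribution. Yet it is the specific spatial arrangement of these intensities that makes up the rich texture detail of an image, so the descriptive power of the classical ROLD statistic is limited. To make the LIS features more descriptive and thereby raise detection accuracy, this paper improves on ROLD by exploiting specific spatial relationships between the center pixel and its neighbors: the ROLD values of the eight neighbors (EN-ROLD) and one edge-related MOLD value together form the RVIN noise-aware feature vector that describes whether the center pixel is noise.

2.2 The EN-ROLD statistic

To strengthen the ability of ROLD to describe whether the center pixel is RVIN, a 3×3 sub-window is centered on each of the eight neighbors (up, down, left, right, and the four diagonals) of the center pixel within the current 5×5 local window (the configuration with which Dong et al. (2007) obtained the best detection accuracy). The ROLD value of each neighbor is computed over its sub-window, each of which contains the center pixel, and the eight values form the EN-ROLD feature, as shown in Fig. 2. In the 5×5 window ${\mathit{\boldsymbol{ \boldsymbol{\varOmega} }}_0}$, the center pixel $(i, j)$ has intensity $x_{i, j}$, and the eight 3×3 sub-windows of its neighbors are denoted $\mathit{\boldsymbol{ \boldsymbol{\varOmega} }}_{k}$ ($k=1, 2, \ldots, 8$). The ROLD value of the center of each sub-window is computed in turn and denoted EN-ROLD$_{k}$. Together, these values are more descriptive than a single ROLD value.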

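Fig. 2 Schematic diagram of EN-ROLD

Taking the two patches $PA$ and $PB$ of Fig. 1 again, the ROLD values of the centers of their eight sub-windows are listed in Table 1. For $PA$, most of the eight neighbor ROLD values are below 1.0, whereas for $PB$ all eight are above 1.0, a very clear difference. The single ROLD values of the two centers (4.4238 and 4.3513), by contrast, cannot separate them. The EN-ROLD statistic, which takes spatial position into account, therefore describes RVIN better than the ROLD statistic.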

Table 1 ROLD values extracted from the eight sub-windows of patches $PA$ and $PB$

EN-ROLD$_{k}$	$PA$	$PB$
$k=1$	0.58	1.03
$k=2$	1.17	1.03
$k=3$	1.32	1.08
$k=4$	0	1.13
$k=5$	0.82	1.07
$k=6$	0	1.42
$k=7$	0.69	1.06
$k=8$	0.83	1.06
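The EN-ROLD construction of this subsection can be sketched in Python as follows. The sketch assumes normalized intensities in $[0, 1]$ and a 5×5 patch given as nested lists, and it orders the eight sub-windows in raster order (upper left first); the ordering of $k$ in Fig. 2 is not stated in the text, so this ordering and the function names `rold` and `en_rold` are assumptions made for illustration.

```python
import math


def rold(window, a=2.0, b=5.0):
    """ROLD of the centre of a square window, summing the smaller half of the values."""
    size = len(window)
    c = size // 2
    centre = window[c][c]
    dists = []
    for s in range(size):
        for t in range(size):
            if s == c and t == c:
                continue
            diff = abs(window[s][t] - centre)
            log_d = math.log(diff, a) if diff > 0 else -b
            dists.append(1.0 + max(log_d, -b) / b)
    dists.sort()
    return sum(dists[: len(dists) // 2])


def en_rold(patch):
    """Eight ROLD values, one per 3x3 sub-window centred on a neighbour of the patch centre."""
    if len(patch) != 5 or any(len(row) != 5 for row in patch):
        raise ValueError("expected a 5x5 patch")
    # Raster order of the eight neighbours (an assumption of this sketch).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
    features = []
    for ds, dt in offsets:
        r, c = 2 + ds, 2 + dt
        # Every sub-window lies inside the 5x5 patch and contains the patch centre.
        sub = [row[c - 1:c + 2] for row in patch[r - 1:r + 2]]
        features.append(rold(sub))
    return features


patch = [[0.0] * 5 for _ in range(5)]
patch[1][1] = 1.0  # the upper-left neighbour of the centre is an outlier
print(en_rold(patch))
```

Only the sub-window centered on the outlier responds strongly, which shows how the eight values localize where in the neighborhood the irregularity sits.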

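2.3 The MOLD statistic

Although the proposed EN-ROLD feature is fairly descriptive, it still handles texture-rich regions less well than desired. To address this, a statistic describing image edges is introduced to further separate noisy pixels from edge pixels. In a local window of size $(2N+1)\times(2N+1)$, the differences $d^{\rm{h}}_{n}$, $d^{\rm{v}}_{n}$, $d^{\rm{l}}_{n}$, and $d^{\rm{r}}_{n}$ between the center pixel and its neighbors along the horizontal, vertical, left-diagonal, and right-diagonal directions are computed first: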

$ {d_n^{\rm{h}} = |{y_{i,j}} - {y_{i,(j - N + n)}}|} $

(5)

$ {d_n^{\rm{v}} = |{y_{i,j}} - {y_{(i - N + n),j}}|} $

(6)

$ {d_n^{\rm{l}} = |{y_{i,j}} - {y_{(i - N + n),(j - N + n)}}|} $

(7)

$ {d_n^{\rm{r}} = |{y_{i,j}} - {y_{(i - N + n),(j + N - n)}}|} $

(8)

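where $1 \le n \le 2N$. The differences along each direction are then summed, and a logarithmic transform is applied to the smallest of the four directional sums. The result is the minimum orientation logarithmic difference (MOLD), a statistic that describes the edge behavior of the image: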

$ \begin{array}{*{20}{c}} {MOLD = }\\ {{{\log }_2}(\min (\sum\limits_{n = 1}^{2N} {d_n^{\rm{h}}} ,\sum\limits_{n = 1}^{2N} {d_n^{\rm{v}}} ,\sum\limits_{n = 1}^{2N} {d_n^{\rm{l}}} ,\sum\limits_{n = 1}^{2N} {d_n^{\rm{r}}} ) + 1)} \end{array} $

(9)

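In principle, a small MOLD value at the window center indicates that the pixel is a normal edge pixel. For the patches $PA$ and $PB$ of Fig. 1, the center pixel of $PA$ is an edge pixel and the center pixel of $PB$ lies in a smooth region; their MOLD values are 0.32 and 1.16, respectively, which differ markedly. The MOLD feature can therefore help decide whether the center pixel of a local window is a normal edge pixel or impulse noise.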
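A minimal Python sketch of Eqs. (5) to (9) is given below. It assumes that each direction uses the $2N$ non-center pixels on the line through the center, which matches the four distance computations per direction counted for a 5×5 window in Section 2.5; the function name `mold` and the nested-list patch format are likewise assumptions.

```python
import math


def mold(patch):
    """Minimum orientation logarithmic difference of the centre of a square patch."""
    size = len(patch)
    n = size // 2
    centre = patch[n][n]
    sums = {"h": 0.0, "v": 0.0, "l": 0.0, "r": 0.0}
    for k in range(-n, n + 1):
        if k == 0:
            continue  # skip the centre pixel itself
        sums["h"] += abs(centre - patch[n][n + k])      # horizontal line
        sums["v"] += abs(centre - patch[n + k][n])      # vertical line
        sums["l"] += abs(centre - patch[n + k][n + k])  # left diagonal
        sums["r"] += abs(centre - patch[n + k][n - k])  # right diagonal
    # Eq. (9): log2 of the smallest directional sum plus one.
    return math.log2(min(sums.values()) + 1.0)


edge = [[0.0] * 5 for _ in range(5)]
edge[2] = [1.0] * 5  # a horizontal edge passing through the centre
spike = [[0.0] * 5 for _ in range(5)]
spike[2][2] = 1.0    # an isolated impulse
print(mold(edge), mold(spike))
```

The pixel on the edge agrees with its neighbors along one direction and gets a MOLD of zero, whereas the isolated impulse differs along all four directions and gets a large value.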

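2.4 Noise detector

As in Turkmen (2016), an MLP neural network maps the RVIN noise-aware feature vector extracted from a patch directly to a noise label. To train the detection model, a number of original undistorted images are selected from the Berkeley segmentation data (BSD) set (Arbeláez et al., 2011), and random-valued impulse noise is added to each at ratios from 0 to 60% in steps of 1%. Patches are then extracted from every noisy image to form a training set of $m$ patches, the $i$-th of which is denoted $P_{i}$. From each patch, the eight EN-ROLD features and the MOLD edge feature of its center pixel are extracted. These nine RVIN noise-aware features form a feature vector that describes whether the center pixel is noise: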

$ \begin{array}{l} {\mathit{\boldsymbol{F}}_i} = (EN - ROL{D_1},EN - ROL{D_2}, \cdots ,\\ {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} EN - ROL{D_8},MOLD) \end{array} $

(10)

where $EN-ROLD_{j}$ ($j=1, 2, \ldots, 8$) are the eight EN-ROLD features of the center pixel of patch $\mathit{\boldsymbol{P}}_{i}$ and $MOLD$ is the edge feature. The noise label $L_{i} \in \{0, 1\}$ of the center pixel of each patch is recorded as well, with 1 for a noisy pixel and 0 for a clean pixel. The feature vectors of all patches and their noise labels form the input-output pairs $\{(\mathit{\boldsymbol{F}}_{1}, L_{1}), (\mathit{\boldsymbol{F}}_{2}, L_{2}), \ldots, (\mathit{\boldsymbol{F}}_{m}, L_{m})\} \subset \mathit{\boldsymbol{F}} \times L$ used to train the network model.
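The noise labels needed for the training pairs come from the noise synthesis step. The sketch below shows one plausible way to corrupt an 8-bit grayscale image with RVIN and record the labels; it assumes that exactly `round(ratio * h * w)` pixels are replaced by values drawn uniformly from 0 to 255 and that every replaced pixel is labeled 1. The function name `add_rvin` and the seeding scheme are hypothetical, and the paper does not state how its noise was generated in code.

```python
import random


def add_rvin(image, ratio, seed=0):
    """Return (noisy image, label map) after corrupting a fraction `ratio` of the pixels."""
    rng = random.Random(seed)
    h, w = len(image), len(image[0])
    noisy = [row[:] for row in image]
    labels = [[0] * w for _ in range(h)]
    count = round(ratio * h * w)
    # Pick distinct pixel positions, then overwrite each with a uniform random value.
    for idx in rng.sample(range(h * w), count):
        r, c = divmod(idx, w)
        noisy[r][c] = rng.randint(0, 255)
        labels[r][c] = 1  # 1 marks a noisy pixel, 0 a clean pixel
    return noisy, labels


clean = [[128] * 10 for _ in range(10)]
noisy, labels = add_rvin(clean, 0.4)
print(sum(map(sum, labels)))
```

Each center pixel's entry in `labels` would then be paired with the nine features extracted from the patch around it.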

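The MLP network used to train the noise detection model has two hidden layers. Its input layer receives the feature vector $\mathit{\boldsymbol{F}}$ and its output layer produces the noise label $L$, as shown in Fig. 3, where each circle represents a neuron and circles marked "+1" represent bias nodes. The $i$-th neuron $Y^{(k)}_{i}$ of the $k$-th layer is described mathematically as

Fig. 3 The architecture of the MLP-based RVIN detector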

$ Y_i^{(k)} = f({\mathit{\boldsymbol{W}}^{(k)}}{\mathit{\boldsymbol{Y}}^{(k - 1)}} + {\mathit{\boldsymbol{b}}^{(k)}}) $

(11)

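where $f(\cdot)$ is the activation function, $\mathit{\boldsymbol{W}}^{(k)}$ holds the weights connecting the neurons of layer $k-1$ to those of layer $k$, $\mathit{\boldsymbol{Y}}^{(k-1)}$ is the output of layer $k-1$, which serves as the input of layer $k$, and $\mathit{\boldsymbol{b}}^{(k)}$ is the bias of layer $k$. The network is trained with stochastic gradient descent (SGD).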
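The layer-wise rule of Eq. (11) can be sketched as a forward pass in plain Python. The text fixes only the nine inputs, the two hidden layers, and the single output; the hidden-layer width of 10, the sigmoid activation, the random untrained weights, and the 0.5 decision threshold used below are all assumptions made for illustration, and training by SGD is not shown.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer(weights, bias, inputs):
    """Eq. (11) for every neuron of one layer: f(W x + b)."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, bias)]


def mlp_forward(params, features):
    """Propagate a feature vector through all layers and return the scalar output."""
    out = features
    for weights, bias in params:
        out = layer(weights, bias, out)
    return out[0]


def init_params(sizes, seed=0):
    """Random, untrained weights, for illustration only."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        params.append((weights, [0.0] * n_out))
    return params


params = init_params([9, 10, 10, 1])       # 9 features, two hidden layers, 1 output
score = mlp_forward(params, [0.5] * 9)
label = 1 if score > 0.5 else 0            # thresholding the output is an assumption
print(label)
```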

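2.5 Detector workflow

Given a noisy image $\mathit{\boldsymbol{I}}$ of size $w \times h$ ($w$ is the width and $h$ the height), $n = w \times h$ noisy patches of size 5×5 are first extracted in raster-scan order (after the pixels at the image border are extended, the number of patches equals the number of pixels in the image). Next, the eight EN-ROLD features and the MOLD feature of each patch are extracted to form its feature vector. The trained MLP detection model then maps every feature vector to a noise label, and the labels are placed back by inverting the raster extraction order, which gives a noise label matrix of the same size as the noisy image. With this label matrix, the subsequent restoration algorithm can readily decide whether to restore a given pixel.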
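The raster-scan workflow can be sketched as below. The border is extended by replicating edge pixels, which is an assumption because the text does not say which extension is used, and `toy_classifier` is only a placeholder that stands in for feature extraction followed by the trained MLP; all function names are hypothetical.

```python
def pad_replicate(image, margin):
    """Extend the image by `margin` pixels on every side by repeating edge pixels."""
    h, w = len(image), len(image[0])

    def clamp(v, hi):
        return min(max(v, 0), hi)

    return [[image[clamp(r, h - 1)][clamp(c, w - 1)]
             for c in range(-margin, w + margin)]
            for r in range(-margin, h + margin)]


def detect(image, classify_patch, half=2):
    """Label every pixel by classifying the (2*half+1)-square patch centred on it."""
    h, w = len(image), len(image[0])
    padded = pad_replicate(image, half)
    side = 2 * half + 1
    labels = []
    for r in range(h):          # raster scan: row by row, left to right
        row = []
        for c in range(w):
            patch = [padded[r + dr][c:c + side] for dr in range(side)]
            row.append(classify_patch(patch))
        labels.append(row)      # labels keep the raster order of the pixels
    return labels


def toy_classifier(patch):
    """Placeholder decision rule (NOT the paper's MLP): compare the centre with the median."""
    flat = sorted(v for row in patch for v in row)
    median = flat[len(flat) // 2]
    return 1 if abs(patch[2][2] - median) > 0.5 else 0


image = [[0.0] * 7 for _ in range(6)]
image[3][4] = 1.0
label_map = detect(image, toy_classifier)
print(label_map[3])
```

The returned label matrix has one entry per pixel, so a restoration routine can use it directly as a mask.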

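Computing a ROLD statistic involves three kinds of operation: P, M, and Q. A P operation computes the distance between the center pixel and another pixel in the window, M operations sort the distance values, and Q operations accumulate the smallest distance values. Let $n_{\rm{P}}$, $n_{\rm{M}}$, and $n_{\rm{Q}}$ denote the execution time of one P, M, and Q operation, respectively. One ROLD statistic over a 5×5 window is computed as follows: 1) compute the distances between the center pixel and all other pixels in the window, which takes 24 P operations; 2) sort the distance values, which takes $24\log_{2}24$ M operations; 3) sum the 12 smallest distance values, which takes 11 Q operations. The complexity of the ROLD statistic is therefore $\mathrm{O}\left(24 n_{\mathrm{P}}+24 \log _{2} 24 n_{\mathrm{M}}+11 n_{\mathrm{Q}}\right)$.

An EN-ROLD statistic, on the other hand, uses a 3×3 local window, so each one takes 8 P operations, $8\log_{2}8$ M operations, and 3 Q operations, and the eight EN-ROLD features have complexity $\mathrm{O}\left(64 n_{\mathrm{P}}+64 \log _{2} 8 n_{\mathrm{M}}+24 n_{\mathrm{Q}}\right)$. One MOLD feature takes 4 P operations and 3 Q operations per direction, so over the four directions its complexity is $\mathrm{O}(16n_{\mathrm{P}}+12n_{\mathrm{Q}})$. The proposed feature vector (eight EN-ROLD features and one MOLD feature) thus has complexity $\mathrm{O}\left(80 n_{\mathrm{P}}+64 \log _{2} 8 n_{\mathrm{M}}+36 n_{\mathrm{Q}}\right)$, a little over three times the cost of the classical ROLD statistic. However, the ROLD-EPR algorithm of Dong et al. (2007) raises detection accuracy, and hence denoising performance, through repeated iterations in which noise detection and denoising are tightly coupled. During restoration the noisy pixels of the whole image are re-detected repeatedly (typically 6 to 8 times), so the actual computation that ROLD-EPR spends on noise detection is more than twice that of the proposed method. Moreover, the proposed detector runs much faster than detectors that combine feature values with machine learning (Roy et al., 2016; Kumar and Nagaraju, 2018), because its feature extraction is markedly more efficient. The workflow of the MLP-based RVIN detector is shown in Fig. 4.

Fig. 4 The pipeline of the proposed RVIN noise detector based on an MLP network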
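The operation counts of this subsection can be tallied with a few lines of Python. The sketch counts a sort of $q$ values as $q\log_2 q$ M operations, which is the accounting implied by the complexity expressions; the helper names `rold_cost`, `mold_cost`, and `add_costs` are hypothetical.

```python
import math


def rold_cost(window):
    """P, M and Q operation counts for one ROLD value over a window-by-window patch."""
    neighbours = window * window - 1
    kept = neighbours // 2  # number of smallest values that are summed
    return {"P": neighbours, "M": neighbours * math.log2(neighbours), "Q": kept - 1}


def mold_cost(window):
    """Counts for one MOLD value: four directions, (window - 1) neighbours each."""
    per_direction = window - 1
    return {"P": 4 * per_direction, "M": 0.0, "Q": 4 * (per_direction - 1)}


def add_costs(*costs):
    return {key: sum(c[key] for c in costs) for key in ("P", "M", "Q")}


classical = rold_cost(5)                      # one ROLD over a 5x5 window
eight_en_rold = [rold_cost(3)] * 8            # eight ROLD values over 3x3 sub-windows
proposed = add_costs(*eight_en_rold, mold_cost(5))
print(classical, proposed)
```

The tally gives 24 P operations and 11 Q operations for the classical statistic against 80 and 36 for the proposed feature vector, in line with the "a little over three times" estimate above.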

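3 Experiments and analysis

3.1 Test environment

To evaluate the proposed RVIN detector, it is compared with the noise detectors embedded in the following impulse denoising algorithms: the classical progressive switching median filter (PSMF) (Zhou and David, 1999), ROLD-EPR (Dong et al., 2007), adaptive switching median (ASWM) (Akkoul et al., 2010), robust outlyingness ratio nonlocal means (ROR-NLM) (Xiong and Yin, 2012), and MLP-edge preserving regularization (MLP-EPR) (Turkmen, 2016), together with the CNN-based (convolutional neural network based) algorithm (Xu et al., 2018a) and the blind convolutional neural network (BCNN) (Chen et al., 2019). The MLP neural network classifier (MLPNNC) detector (Soleimany and Hamghalam, 2017) is also included. The numbers of missed detections, false detections, and total detection errors serve as the accuracy metrics, and the detector is assessed in terms of both detection accuracy and practical performance. The experiments use 10 images common in the literature (Fig. 5) and 50 BSD textured images (Arbeláez et al., 2011). All methods run in the same environment: an Intel(R) Core(TM) i7-3770 CPU @ 3.40 GHz with 16 GB RAM, Windows 10, and MATLAB R2017b.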

Fig. 5 Commonly used images in the literature ((a) Peppers; (b) Monarch; (c) Man; (d) Lena; (e) House; (f) Hill; (g) Couple; (h) Cameraman; (i) Boat; (j) Barbara)

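3.2 Detection accuracy

To verify detection accuracy, RVIN noise at ratios of 20%, 40%, and 60% is added to each original undistorted image, and the numbers of missed detections, false detections, and total detection errors of every detector are counted on each noisy image. Owing to space limits, Table 2 reports only the results on the Lena image (Fig. 5(d)). A high number of missed detections means that much of the noise in the image remains undetected, whereas a false detection causes the denoising stage to process a normal undistorted pixel, which blurs the image. The total number of detection errors is the sum of missed and false detections; a smaller value means a lower detection error rate, a higher detection accuracy, and a better image after denoising. Table 2 shows that the missed and false detections of the proposed detector are fairly balanced and that, in total detection errors, it and the CNN-based algorithm rank in the top two of all algorithms, which provides a sound basis for the subsequent denoising module. Because missed and false detections affect the subsequent denoising module in different ways, the eventual effect of the detection results of the proposed detector and of the CNN-based algorithm on denoising quality is examined further by comparing actual denoising results.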
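The three metrics follow directly from a ground-truth label map and a predicted label map. A minimal Python sketch, assuming both maps are nested lists of 0 (clean) and 1 (noisy) of the same shape; the function name `detection_errors` is hypothetical.

```python
def detection_errors(truth, predicted):
    """Return (missed, false, total) detection counts for two 0/1 label maps."""
    missed = 0
    false_detections = 0
    for truth_row, pred_row in zip(truth, predicted):
        for t, p in zip(truth_row, pred_row):
            if t == 1 and p == 0:
                missed += 1            # a noisy pixel that was not detected
            elif t == 0 and p == 1:
                false_detections += 1  # a clean pixel flagged as noise
    return missed, false_detections, missed + false_detections


truth = [[1, 0], [1, 0]]
predicted = [[1, 1], [0, 0]]
print(detection_errors(truth, predicted))
```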

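Table 2 Comparison of the performance metrics of each noise detector on the Lena image (missed detections, false detections, and total detection errors at RVIN noise ratios of 20%, 40%, and 60%)

Method	Missed (20%)	False (20%)	Total (20%)	Missed (40%)	False (40%)	Total (40%)	Missed (60%)	False (60%)	Total (60%)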
PSMF	16 383	1 181	17 564	33 635	2 005	35 640	55 607	4 565	60 172
ROLD-EPR	6 828	7 403	14 231	11 288	9 885	21 173	12 455	12 778	25 233
ASWM	4 269	7 049	11 318	9 161	7 741	16 902	17 991	8 839	26 830
ROR-NLM	6 336	4 924	11 260	15 558	5 554	21 112	31 701	10 984	42 685
MLP-EPR	10 791	1 441	12 232	17 470	4 274	21 744	20 784	9 505	30 289
CNN-based	5 285	1 170	6 455	9 821	4 215	14 036	14 692	8 830	23 522
BCNN	5 821	6 573	12 394	11 430	7 407	18 837	16 478	8 620	25 098
MLPNNC	10 344	528	10 872	18 699	3 569	22 268	20 375	16 174	36 549
proposed-GIRAF (ours)	2 560	5 498	8 085	11 062	4 848	15 910	15 146	9 650	24 796
Note: bold and underlined entries denote the best and second-best result in each column, respectively.

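3.3 Practical performance

To verify its practical performance, the proposed detector is combined with the generic iteratively reweighted annihilating filter (GIRAF) algorithm (Ongie and Jacob, 2016) to form an RVIN denoising algorithm called proposed-GIRAF, and the ALOHA (annihilating filter-based low-rank Hankel matrix) algorithm (Jin and Ye, 2018) is added to the comparison. ALOHA is an RVIN denoising algorithm based on matrix completion in which noise detection and denoising are not separated, so it is compared only in terms of denoising quality. Fifty texture-rich undistorted images from the BSD set (Arbeláez et al., 2011) form the test set. RVIN noise is added to each image at ratios from 10% to 60% in steps of 10%, every noisy image is restored with each compared algorithm, and the mean peak signal-to-noise ratio (PSNR) of the restored images is recorded at each noise ratio (Table 3). Table 3 shows that proposed-GIRAF attains the best PSNR at every noise ratio and is well above the CNN-based, BCNN-GIRAF, and MLPNNC-GIRAF algorithms (by 0.47 to 1.96 dB over MLPNNC-GIRAF), which indicates that the detection results of the proposed detector are the most effective for the subsequent denoising module.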
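PSNR is computed from the mean squared error between the reference and restored images. The sketch below uses the standard definition for 8-bit images with a peak value of 255; the nested-list image format and the function name `psnr` are assumptions.

```python
import math


def psnr(reference, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    total = 0.0
    count = 0
    for ref_row, res_row in zip(reference, restored):
        for ref, res in zip(ref_row, res_row):
            total += (ref - res) ** 2
            count += 1
    mse = total / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)


reference = [[100, 100], [100, 100]]
restored = [[101, 99], [101, 99]]   # every pixel is off by one gray level
print(round(psnr(reference, restored), 2))
```

Averaging this value over the restored images at one noise ratio gives one entry of a table of mean PSNR values.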

Table 3 Average PSNR (dB) of each denoising algorithm on the 50 noisy images of the textured image set at each noise ratio

Method	10%	20%	30%	40%	50%	60%
PSMF	28.84	26.75	24.66	22.42	20.11	17.88
ROLD-EPR	30.15	27.87	26.49	25.81	24.88	23.85
ASWM	28.25	27.37	26.43	25.33	23.75	21.36
ROR-NLM	26.79	26.57	25.63	24.72	22.97	20.76
MLP-EPR	30.01	27.63	26.31	25.23	24.14	23.19
ALOHA	31.63	29.02	26.92	24.52	22.48	20.41
CNN-based	31.31	28.85	27.22	25.78	24.66	23.68
BCNN-GIRAF	28.20	27.10	26.13	25.19	24.21	22.80
MLPNNC-GIRAF	32.99	30.05	28.44	25.90	23.83	21.98
proposed-GIRAF (ours)	33.89	31.06	28.91	27.25	25.64	23.94
Note: bold entries denote the best result in each column.

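To separate the contributions of the RVIN detection module and the denoising module to the overall result, a further test is run on the 10 commonly used images. Three combinations are used to denoise the test images at each noise ratio: the proposed detector with an inpainting restoration algorithm (proposed-inpainting) (Chan et al., 2017), the ground-truth noise labels with GIRAF (truth-GIRAF) (Ongie and Jacob, 2016), and the proposed detector with GIRAF (proposed-GIRAF). The mean PSNR over the 10 images is given in Table 4. Proposed-GIRAF has a clear advantage over proposed-inpainting only at low noise ratios, where PSNR improves by 1.47 dB at a noise ratio of 10%; the advantage fades as the noise ratio rises. Inpainting and GIRAF are both capable restoration algorithms whose performance differs, but not by much. GIRAF is somewhat better, so it is chosen to restore the detected noisy pixels. On the other hand, the mean PSNR of truth-GIRAF over the 10 images exceeds that of proposed-GIRAF by 6.5 to 7.4 dB at every noise ratio, a marked advantage that arises because the noise labels used by truth-GIRAF are entirely correct. This shows that the more accurate the detector, the greater the gain in denoising quality, and that detection accuracy contributes more to the overall improvement than the restoration algorithm does. Together with Table 2, this indicates that the proposed detector is relatively accurate at every noise ratio and that, paired with GIRAF, it gives the best denoising results among the compared algorithms. In short, the good denoising performance of proposed-GIRAF depends mainly on the detection accuracy of the noise detector, with the subsequent restoration algorithm (GIRAF) playing a secondary role.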

Table 4 Average PSNR (dB) on ten commonly used images at each noise ratio

Algorithm	10%	20%	30%	40%	50%	60%
proposed-inpainting	34.92	32.33	30.10	28.27	26.57	24.82
truth-GIRAF	43.76	40.04	37.53	35.40	33.35	31.27
proposed-GIRAF (ours)	36.39	33.30	30.64	28.66	26.68	24.76
Note: bold entries denote the best result in each column.

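4 Conclusion

This paper makes full use of the local specific spatial relationships between the center pixel and its neighbors. A feature vector formed from the eight-neighbor ROLD statistics and one MOLD edge statistic within the local window improves the ability of traditional LIS features to describe RVIN. An MLP network with strong nonlinear mapping ability is then trained to obtain an RVIN noise detector with higher detection accuracy. Experimental data show that the proposed RVIN noise-aware feature vector of nine LIS features exploits the statistics of the specific spatial distribution of pixels within the local window and describes more finely whether the center pixel is corrupted by RVIN. The detection accuracy of the resulting detector is improved, which provides a sound basis for subsequent switching RVIN denoising.

Compared with classical RVIN noise detectors, the proposed method ranks first in total detection errors and in the top three in both missed and false detections, and it also performs well in execution efficiency. Future work will proceed in two directions: 1) using statistical features with greater descriptive power; 2) building a neural network with stronger mapping ability, so as to further improve the prediction accuracy and execution efficiency of the RVIN detector.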

References

Akkoul S, Ledee R, Leconge R, Harba R. 2010. A new adaptive switching median filter. IEEE Signal Processing Letters, 17(6): 587-590 [DOI:10.1109/LSP.2010.2048646]

Arbeláez P, Maire M, Fowlkes C, Malik J. 2011. Contour detection and hierarchical image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(5): 898-916 [DOI:10.1109/TPAMI.2010.161]

Chan S H, Wang X R, Elgendy O A. 2017. Plug-and-play ADMM for image restoration:fixed-point convergence and applications. IEEE Transactions on Computational Imaging, 3(1): 84-98 [DOI:10.1109/TCI.2016.2629286]

Chen J Q, Zhang G Z, Xu S P, Yu H W. 2019. A blind CNN denoising model for random-valued impulse noise. IEEE Access, 7: 124647-124661 [DOI:10.1109/ACCESS.2019.2938799]

Dong Y Q, Chan R H, Xu S F. 2007. A detection statistic for random-valued impulse noise. IEEE Transactions on Image Processing, 16(4): 1112-1120 [DOI:10.1109/TIP.2006.891348]

Garnett R, Huegerich T, Chui C, He W J. 2005. A universal noise removal algorithm with an impulse detector. IEEE Transactions on Image Processing, 14(11): 1747-1754 [DOI:10.1109/TIP.2005.857261]

Guo Y Q, Jia X P, Paull D. 2018. Effective sequential classifier training for SVM-based multitemporal remote sensing image classification. IEEE Transactions on Image Processing, 27(6): 3036-3048 [DOI:10.1109/TIP.2018.2808767]

Jin K H, Ye J C. 2018. Sparse and low-rank decomposition of a Hankel structured matrix for impulse noise removal. IEEE Transactions on Image Processing, 27(3): 1448-1461 [DOI:10.1109/TIP.2017.2771471]

Kumar S V, Nagaraju C. 2018. Support vector neural network based fuzzy hybrid filter for impulse noise identification and removal from gray-scale image. Journal of King Saud University-Computer and Information Sciences: 1-16 [DOI:10.1016/j.jksuci.2018.05.011]

Li K, Li Y M, Hu X M, Shao F. 2018. A robust and accurate object tracking algorithm based on convolutional neural network. Acta Electronica Sinica, 46(9): 2087-2093 (李康, 李亚敏, 胡学敏, 邵芳. 2018. 基于卷积神经网络的鲁棒高精度目标跟踪算法. 电子学报, 46(9): 2087-2093) [DOI:10.3969/j.issn.0372-2112.2018.09.007]

Liu L C, Chen C L P, Zhou Y C, You X G. 2015. A new weighted mean filter with a two-phase detector for removing impulse noise. Information Sciences, 315: 1-16 [DOI:10.1016/j.ins.2015.03.067]

Nie F P, Yang S, Zhang R, Li X L. 2019. A general framework for auto-weighted feature selection via global redundancy minimization. IEEE Transactions on Image Processing, 28(5): 2428-2438 [DOI:10.1109/TIP.2018.2886761]

Ongie G, Jacob M. 2016. A fast algorithm for structured low-rank matrix recovery with applications to undersampled MRI reconstruction//Proceedings of the 13th IEEE International Symposium on Biomedical Imaging. Prague, Czech Republic: IEEE: 522-525 [DOI:10.1109/ISBI.2016.7493322]

Roy A, Singha J, Devi S S, Laskar R H. 2016. Impulse noise removal using SVM classification based fuzzy filter from gray scale images. Signal Processing, 128: 262-273 [DOI:10.1016/j.sigpro.2016.04.007]

Soleimany and Hamghalam. 2017. A novel random-valued impulse noise detector based on MLP neural network classifier//Proceedings of 2017 Artificial Intelligence and Robotics (IRANOPEN). Qazvin, Iran: IEEE: 165-169 [DOI:10.1109/RIOS.2017.7956461]

Turkmen I. 2016. The ANN based detector to remove random-valued impulse noise in images. Journal of Visual Communication and Image Representation, 34: 28-36 [DOI:10.1016/j.jvcir.2015.10.011]

Xiong B, Yin Z P. 2012. A universal denoising framework with a new impulse detector and nonlocal means. IEEE Transactions on Image Processing, 21(4): 1663-1675 [DOI:10.1109/TIP.2011.2172804]

Xu G Y, Tan J Q. 2014. A universal impulse noise filter with an impulse detector and nonlocal means. Circuits, Systems, and Signal Processing, 33(2): 421-435 [DOI:10.1007/s00034-013-9640-1]

Xu S P, Zhang G Z, Hu L Y, Liu T Y. 2018a. Convolutional neural network-based detector for random-valued impulse noise. Journal of Electronic Imaging, 27(5): #050501 [DOI:10.1117/1.JEI.27.5.050501]

Xu Q, Li Y H, Guo Y J, Wu S, Sbert M. 2018b. Random-valued impulse noise removal using adaptive ranked-ordered impulse detector. Journal of Electronic Imaging, 27(1): #013001 [DOI:10.1117/1.JEI.27.1.013001]

Zhou W, David Z. 1999. Progressive switching median filter for the removal of impulse noise from highly corrupted images. IEEE Transactions on Circuits and Systems Ⅱ:Analog and Digital Signal Processing, 46(1): 78-80 [DOI:10.1109/82.749102]