Published: 2020-12-16
DOI: 10.11834/jig.190682
2020 | Volume 25 | Number 12

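Image processing and coding

Residual-refined single image rain removal with adaptive convolution

Wang Meihua, He Haijun, Li Chao

College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510642

Received: 2019-12-30; Revised: 2020-03-30; Preprint: 2020-04-06

Funding: National Natural Science Foundation of China (61976052); Guangdong Basic and Applied Basic Research Foundation (2019B1515210009)

First author: Wang Meihua, born in 1970, female, associate professor and master's supervisor. Her main research interests are pattern recognition, machine learning, and machine vision. E-mail: wangmeihua@scau.edu.cn;
Li Chao, male, master's student. His main research interests are machine learning and computer vision. E-mail: rockylee@gmail.com.

CLC number: TP391.41

Document code: A

Article ID: 1006-8961(2020)12-2484-10

Abstract

Objective Images captured outdoors on rainy days often suffer from color shifts and blur because rain streaks cover image content. To improve the quality of rainy images, this paper proposes a single image rain removal algorithm based on deep learning with an adaptive selective kernel convolution network. Method To address background misjudgment and residual rain streaks, a rain streak refine factor (RF) learned by the network is added to the existing rain image model, so that the influence of rain streaks on each pixel is described more precisely. A selective kernel network (SK Net) is built to adaptively select the channel-wise information produced by different convolution kernels and to further learn and fuse that information, which improves the expressive power of the network. Finally, the selective kernel convolution network using a residual refine factor (SKRF), which consists of the SK Net, refine factor net, and residual net subnetworks, is built to learn the rain streak map and the residual refine factor (RF) directly, which narrows the mapping range and reduces background misjudgment. Result On the public Rain12 test set, the SKRF network achieves higher accuracy than existing methods, with a peak signal to noise ratio (PSNR) of 34.62 dB and a structural similarity (SSIM) of 0.970 6, which shows that SKRF has a clear advantage in single image rain removal. Conclusion The SKRF algorithm provides an additional residual refine factor for the rain streak map in the rain image model to narrow the learned mapping range, and the adaptive selective kernel convolution network improves the expressive power and compatibility of the rain image model.

Keywords

single image rain removal; deep learning; selective kernel network; refine factor; residual learning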

Single image rain removal based on selective kernel convolution using a residual refine factor

Wang Meihua, He Haijun, Li Chao

College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510642, China

Supported by: National Natural Science Foundation of China (61976052); Basic and Applied Basic Research Fund of Guangdong Province (2019B1515210009)

Abstract

Objective Rain streaks degrade the visual quality of images captured outdoors. Severe weather such as rain, fog, and haze can lower image quality to the point where the images are unusable, and such degraded images can also drastically reduce the performance of vision systems. Because rain is a common meteorological phenomenon, an algorithm that removes rain from a single image is of practical significance. Video-based de-raining methods can draw on pixel information from the same location at different times, so removing rain from a single image is more challenging because less information is available. Traditional de-raining methods mainly focus on rain map modeling and use mathematical optimization to detect and remove rain streaks, but their performance still needs improvement.

Method To address these problems, this paper establishes a convolutional neural network for single image rain removal that is trained on a synthetic dataset. The contributions of this work are as follows. 1) To expand the receptive field of a convolutional neural network that learns abstract feature representations of rain streaks and the ground truth, this work establishes a selective kernel network based on multi-scale convolution with different kernels for feature learning. To fuse and select useful information, an external non-linear weight learning mechanism redistributes the weights of the corresponding channels' feature information from different convolution kernels. This mechanism enables the network to adaptively select feature information from different receptive fields and enhances its expressive power and rain removal capability. 2) The existing rain map model shows some limitations at the training stage. Completing this model with a learnable refine factor that modifies each pixel of the rain streak image can improve the accuracy of the result and prevent background misjudgment. The range of the refine factor is also bounded to reduce the mapping range during network training. 3) At the training stage, existing single image rain removal networks need to learn several kinds of image content, including rain streak removal and background restoration, which increases their burden. Following the idea of residual learning, the proposed network learns the rain streak map directly from the input rainy image. In this way, the mapping interval of the learning process is reduced, the background of the original image is preserved, and loss of detail is prevented. The validity of these arguments is tested with comparison networks built from different modules. Specifically, starting from general convolution, the modules are combined step by step: the SK net, the residual learning mechanism, and the refine factor learning net. The result is a single image rain removal network based on selective kernel convolution using a residual refine factor (SKRF). The residual learning mechanism reduces the mapping interval, and the refine factor enhances the rain streak map to improve rain removal performance.

Result An SKRF network consisting of three subnets (SK net, refine factor net, and residual net) is designed and tested on the public Rain12 test set. It achieves a higher peak signal to noise ratio (PSNR, 34.62 dB) and structural similarity (SSIM, 0.970 6) than the existing methods compared, which shows a clear advantage in removing rain from a single image.

Conclusion We construct a convolutional neural network based on SKRF to remove rain streaks from a single image. A selective kernel convolution network improves the expressive power of the network through a mechanism by which neurons adaptively adjust the size of their receptive field, so rain maps with different characteristics can be learned well and the rain removal effect improves. The residual learning mechanism reduces the mapping interval of the learning process and retains more details of the original image. In the modified rain map model, an additional refine factor is provided for the rain streak map, which further reduces the mapping interval and reduces background misjudgment. The network removes the majority of visible rain streaks while preserving the background. In future work, we plan to extend this network to a wider range of image restoration tasks.

Key words

single image rain removal; deep learning; selective kernel network(SK Net); refine factor (RF); residual learning

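0 Introduction

The performance of computer vision depends on image quality. Images captured outdoors are often affected by weather such as rain, snow, and fog. Rain is one of the most common weather conditions, so image rain removal is an important branch of computer vision research, with application value in technologies such as autonomous driving and video surveillance.

Traditional single image rain removal methods have achieved some success. Kang et al. (2011) used a bilateral filter to decompose the rainy image into a high-frequency part and a low-frequency part. Rain streaks were removed from the high-frequency part by sparse coding and dictionary learning, and the result was combined with the low-frequency information to obtain the de-rained image. Although the method is instructive, its results are rather blurry. Luo et al. (2015) used sparse coding to separate the rain streaks from the background in a rainy image and designed a non-linear composite rain streak model. The estimated background is fairly clear, but because rain streak information is hard to obtain, residual rain streaks remain a problem. Li et al. (2016) used Gaussian mixture models and computed the distribution of rain streaks with different orientations and shapes to detect and remove them. This method gave the best results among such models, but it depends on how the model is built and is computationally expensive, and part of the detail in the estimated background is lost, so the background is not sharp enough.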

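Deep learning is widely used in computer vision and has achieved great success in image recognition (Simonyan and Zisserman, 2014), object detection (Ren et al., 2017a, b), video processing (Kang et al., 2012), and other areas. Convolutional neural networks have also played a strong role in image rain removal. Fu et al. (2017a) proposed the DerainNet network based on a deep convolutional neural network framework. A low-pass filter first decomposes the rainy and rain-free images into high-frequency and low-frequency parts; the corresponding high-frequency parts are then used to train DerainNet so that the network learns the mapping between the two high-frequency images, and the output is combined with the low-frequency part of the original image to obtain the de-rained image. The method performed well in experiments, but the background is not fully restored. In addition, inspired by deep residual learning, Fu et al. (2017b) proposed DetailNet, which uses a negative residual mapping to narrow the mapping range of the network, and demonstrated the rain removal ability of the framework experimentally. Yang et al. (2017) detected the rain streaks in the rainy image and removed rain according to the rain streak density; they designed a multi-task deep neural network that detects and removes rain streaks recurrently, and experiments showed that the method is very effective for heavy rain. Zhang and Patel (2018) proposed a multi-stream dense network that automatically determines rain density information: a label is estimated from the rain streak density and then used to guide rain removal. The method works well on labeled rainy image datasets, but the labels it assigns to real rainy images are not accurate. Borrowing the idea of the feature pyramid network, Fu et al. (2019) proposed LPNet to address the complex structure and large parameter count that make mainstream de-raining networks hard to deploy in practice. The network is a stack of several subnetworks that process the input rainy image at multiple scales, which simplifies learning, reduces the number of parameters, and makes the method much more practical. Zhang et al. (2019) were the first to propose a method based on the generative adversarial network (GAN) (Goodfellow et al., 2014) for single image rain removal, called ID-CGAN (image deraining-conditional generative adversarial network). Its generator uses a densely connected network with an added constraint function so that the generated de-rained image matches the rain-free image, and its discriminator uses a multi-scale feature structure. Experiments showed that ID-CGAN performs well on real rainy images but poorly on images captured in heavy rain. Wang et al. (2020) introduced the SE (squeeze-and-excitation network) module (Hu et al., 2018) into a rain removal network and proposed ROISEN (reusing original input squeeze-and-excitation network). The SE module extracts detail information from the original image and improves rain removal quality. Experiments confirmed the effectiveness of the network, but rain streaks that resemble background objects are not removed completely.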

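Traditional rain removal algorithms are mostly based on mathematical modeling and optimization, so their running speed is hard to guarantee and their results leave room for improvement. Deep-learning-based algorithms have achieved some success but still have two limitations: 1) some network models are relatively complex, for example adding layers and branches to increase expressive power, which raises the computational requirements; 2) image decomposition narrows the mapping range to be learned but adds extra steps.

For single image rain removal, this paper seeks a better network representation within a limited network capacity and builds an end-to-end rain removal network. The main contributions are: 1) a selective kernel network is built that provides a learning mechanism across the feature channels of convolution kernels of different sizes, so that neurons can adaptively adjust their receptive field size, which strengthens the expressive power of the network and improves rain removal; 2) the simple rain image model is improved by adding a learnable refine factor that learns how strongly each pixel is affected by rain, which prevents background misjudgment; 3) residual learning is adopted to narrow the mapping range of the learning process while preserving the background details of the original image.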

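1 Rain image model

Deep-learning-based rain removal models have achieved initial success. Most of them treat a rainy image as the sum of a background layer and a rain layer (Kang et al., 2012; Li et al., 2016), that is,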

$\boldsymbol{I}=\boldsymbol{B}+\boldsymbol{R} $

(1)

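where $\mathit{\boldsymbol{I}}$ is the input rainy image, $\mathit{\boldsymbol{B}}$ is the desired rain-free output, and $\mathit{\boldsymbol{R}}$ is the rain streak map. However, images captured in practice contain not only distinct raindrops and rain streaks but also fine drizzle and mist formed by scattered rainwater, and light is refracted by the rain, which changes the brightness of pixels around rainy regions. This simple model is therefore limited. On the basis of the model above, this paper provides an additional variable for the influence of rain streaks and improves it into a rain image model with a refine factor, that is,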

$\boldsymbol{I}=\boldsymbol{B}+(1+\alpha) \cdot \boldsymbol{R} $

(2)

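where $\mathit{\boldsymbol{B}}$ is the rain-free image, $\mathit{\boldsymbol{R}}$ is the rain streak result before refinement, and $1+\alpha$ is the per-pixel refine factor applied to $\mathit{\boldsymbol{R}}$. Because the network ultimately learns the difference between the rainy and rain-free images, the additional factor must be no smaller than 1 so that the rain streaks are amplified; for a given difference this keeps the values of $\mathit{\boldsymbol{R}}$ smaller and narrows the mapping range during learning. Writing the factor as $1+\alpha$ satisfies this condition and makes $\alpha$ easier to train. $1+\alpha$ and $\mathit{\boldsymbol{R}}$ are multiplied element-wise to obtain the final rain streak map. During training, $\alpha$ and $\mathit{\boldsymbol{R}}$ are learned together and jointly determine the rain removal result.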
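The refined model in Eq. (2) can be illustrated with a small numerical sketch. The 2 × 2 single-channel values and the helper names `compose_rainy` and `streaks_for_residual` below are hypothetical and chosen only for illustration; the sketch shows that, for a fixed difference between the rainy and rain-free images, a factor of at least 1 leaves a rain streak value no larger than that difference.

```python
def compose_rainy(background, streaks, alpha):
    """Eq. (2): I = B + (1 + alpha) * R, applied element-wise to 2-D lists."""
    return [[b + (1.0 + a) * r for b, r, a in zip(brow, rrow, arow)]
            for brow, rrow, arow in zip(background, streaks, alpha)]


def streaks_for_residual(residual, alpha):
    """Rain streak value R that reproduces a given residual I - B for the factor 1 + alpha."""
    return [[d / (1.0 + a) for d, a in zip(drow, arow)]
            for drow, arow in zip(residual, alpha)]


# Hypothetical 2x2 single-channel example with pixel values in [0, 1].
B = [[0.20, 0.40], [0.60, 0.10]]
R = [[0.10, 0.00], [0.05, 0.20]]
alpha = [[0.5, 0.0], [1.0, 0.25]]

I = compose_rainy(B, R, alpha)
residual = [[i - b for i, b in zip(irow, brow)] for irow, brow in zip(I, B)]
R_back = streaks_for_residual(residual, alpha)
print(I)
print(R_back)
```

With `alpha` fixed at zero the sketch reduces to the simple additive model of Eq. (1).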

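2 Adaptive selective kernel convolution network

Convolutional neural networks are an active research area. He et al. (2016a) proposed ResNet, whose residual connections address the degradation problem of deep neural networks. Szegedy et al. (2016) used multiple convolution kernels in GoogLeNet to extract image information at different scales. Huang et al. (2017) proposed DenseNet, in which the feature maps within a sub-module are all connected so that they are fully exploited. Yu et al. (2017) used dilated convolution and Howard et al. (2017) used depthwise separable convolution to reduce the number of network parameters.

The selective kernel network (SK Net) uses a multi-kernel non-linear approach to adaptively adjust the channel information associated with each receptive field size (Li et al., 2019), which improves the expressive power of the network. It consists of three parts, split, fuse, and select, as shown in Fig. 1 (Li et al., 2019).

Fig. 1 Structure of selective kernel convolution network

The split operator uses convolution kernels of different sizes to produce two feature extraction paths. For a feature $\boldsymbol{X} \in {{\bf{R}}}^{H^{\prime} \times W^{\prime} \times C^{\prime}}$, the two convolution branches produce the feature maps $\overline{\boldsymbol{U}} \in {\bf{R}}^{H \times W \times C}$ and $\widetilde{\boldsymbol{U}} \in {\bf{R}}^{H \times W \times C}$, that is,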

$\begin{aligned} \overline{\boldsymbol{U}} &=\delta\left(\beta\left(\mathrm{Conv}_{3 \times 3}(\boldsymbol{X})\right)\right) \\ \widetilde{\boldsymbol{U}} &=\delta\left(\beta\left(\mathrm{Conv}_{5 \times 5}(\boldsymbol{X})\right)\right) \end{aligned} $

(3)

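where $\mathrm{Conv}_{i \times i}(\cdot)$ denotes a convolution with kernel size $i$, $\beta$ denotes batch normalization (Ioffe and Szegedy, 2015), $\delta$ denotes the ReLU activation function (Nair and Hinton, 2010), and ${\bf{R}}$ denotes the set of real numbers, so ${\bf{R}}^{H \times W \times C}$ is the space of real-valued arrays of that shape.

The fuse operator adds the feature maps of the two branches element-wise to obtain a new feature map $\mathit{\boldsymbol{U}}$, that is,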

$\boldsymbol{U}=\overline{\boldsymbol{U}}+\widetilde{\boldsymbol{U}} $

(4)

Global average pooling of the feature map $\mathit{\boldsymbol{U}}∈ {\bf{R}} ^{H×W×C}$ yields the vector $\mathit{\boldsymbol{m}}∈ {\bf{R}} ^{C}$. Its element $m_{k}$ is obtained from $\mathit{\boldsymbol{U}}_{k}$, the feature map of the $k$-th channel of $\mathit{\boldsymbol{U}}$, by the global average pooling (gp) operation $ψ_{{\rm{gp}}}(·)$:

$m_{k}=\psi_{\mathrm{gp}}\left(\boldsymbol{U}_{k}\right)=\frac{1}{H \times W} \sum\limits_{i=1}^{H} \sum\limits_{j=1}^{W} U_{k}(i, j) $

(5)

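where $k∈[0, 1, …, C-1]$, $H$ and $W$ are the height and width of the feature map, and $i, j$ are the coordinates of an element in the feature map.

Adaptive selection first passes $\mathit{\boldsymbol{m}}$ through a fully connected (fc) layer $ψ_\rm{fc}(·)$ to obtain $\mathit{\boldsymbol{n}}$: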

$\boldsymbol{n}=\psi_{\mathrm{fc}}(\boldsymbol{m})=\delta(\beta(\boldsymbol{w} \boldsymbol{m})) $

(6)

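where $\mathit{\boldsymbol{n}}∈ {\bf{R}} ^{d×1}$ is the fused feature vector, $\mathit{\boldsymbol{w}}∈ {\bf{R}} ^{d×C}$ is the weight matrix, and $d$ is the output dimension. $β$ denotes batch normalization (Ioffe and Szegedy, 2015) and $δ$ denotes the ReLU activation function (Nair and Hinton, 2010).

The select operator uses the aggregated feature information from kernels of different sizes to compute the weights for the corresponding channels of $\overline{\boldsymbol{U}}$ and $\tilde{\boldsymbol{U}}$:

$a_{k}=\frac{\mathrm{e}^{\boldsymbol{A}_{k} \boldsymbol{n}}}{\mathrm{e}^{\boldsymbol{A}_{k} \boldsymbol{n}}+\mathrm{e}^{\boldsymbol{B}_{k} \boldsymbol{n}}}, \quad b_{k}=\frac{\mathrm{e}^{\boldsymbol{B}_{k} \boldsymbol{n}}}{\mathrm{e}^{\boldsymbol{A}_{k} \boldsymbol{n}}+\mathrm{e}^{\boldsymbol{B}_{k} \boldsymbol{n}}} $

(7)

where $k∈[0, 1, …, C-1]$ and $\mathit{\boldsymbol{A}}, \mathit{\boldsymbol{B}}∈ {\bf{R}} ^{C×d}$ are the weight matrices learned for the corresponding feature maps. The two weights of the same channel are normalized with softmax to give the weight vectors $\mathit{\boldsymbol{a}}$ and $\mathit{\boldsymbol{b}}$. $\mathit{\boldsymbol{A}}_{k}$ is the $k$-th row of the weight matrix $\mathit{\boldsymbol{A}}$, and $a_{k}$ is the $k$-th element of $\mathit{\boldsymbol{a}}∈ {\bf{R}} ^{C×1}$, which acts on the $k$-th channel of $\overline{\boldsymbol{U}}$. Similarly, $\mathit{\boldsymbol{B}}_{k}$ is the $k$-th row of the weight matrix $\mathit{\boldsymbol{B}}$, and $b_{k}$ is the $k$-th element of $\mathit{\boldsymbol{b}}∈ {\bf{R}} ^{C×1}$, which acts on the $k$-th channel of $\tilde{\boldsymbol{U}}$. Finally, $a_{k}$ and $b_{k}$ select the information of each channel of the feature maps produced by the corresponding convolution kernels: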

$\boldsymbol{Y}_{k}=a_{k} \cdot \overline{\boldsymbol{U}}_{k}+b_{k} \cdot \widetilde{\boldsymbol{U}}_{k} $

(8)

where $k∈[0, 1, …, C-1]$, $\mathit{\boldsymbol{Y}}∈ {\bf{R}} ^{H×W×C}$ is the final output feature map, and $\mathit{\boldsymbol{Y}}_{k}$ is its $k$-th channel. $a_{k}$ and $b_{k}$ satisfy

$a_{k}+b_{k}=1 $

(9)
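The fuse and select steps of Eqs. (4) to (9) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the two branch outputs are taken as given nested lists of shape [C][H][W], batch normalization in Eq. (6) is omitted, and the function names and the tiny weight matrices are hypothetical.

```python
import math


def global_average_pool(U):
    """Eq. (5): mean over all spatial positions of each channel; U is [C][H][W]."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in U]


def fully_connected_relu(w, m):
    """Eq. (6) without batch normalization (an assumption): n = ReLU(w m), w is [d][C]."""
    return [max(0.0, sum(wi * mi for wi, mi in zip(row, m))) for row in w]


def selection_weights(A, B, n):
    """Eq. (7): two-way softmax per channel; A and B are [C][d]."""
    a, b = [], []
    for A_k, B_k in zip(A, B):
        score_a = sum(x * y for x, y in zip(A_k, n))
        score_b = sum(x * y for x, y in zip(B_k, n))
        shift = max(score_a, score_b)  # subtracting the max keeps exp() stable
        ea, eb = math.exp(score_a - shift), math.exp(score_b - shift)
        a.append(ea / (ea + eb))
        b.append(eb / (ea + eb))
    return a, b


def sk_fuse_select(U_bar, U_tilde, w, A, B):
    """Fuse (Eq. 4), squeeze (Eqs. 5-6), and select (Eqs. 7-8) for two branches."""
    U = [[[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(c1, c2)]
         for c1, c2 in zip(U_bar, U_tilde)]
    m = global_average_pool(U)
    n = fully_connected_relu(w, m)
    a, b = selection_weights(A, B, n)
    Y = [[[ak * x + bk * y for x, y in zip(r1, r2)] for r1, r2 in zip(c1, c2)]
         for ak, bk, c1, c2 in zip(a, b, U_bar, U_tilde)]
    return Y, a, b


# Hypothetical example with C = 2 channels, 2x2 maps, and d = 1.
U_bar = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
U_tilde = [[[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]]]
w = [[0.5, 0.5]]
A = [[0.0], [1.0]]
B = [[0.0], [0.0]]
Y, a, b = sk_fuse_select(U_bar, U_tilde, w, A, B)
print(a, b)
```

In the example the first channel receives equal weights from both branches, while the second channel leans toward the first branch because its row of `A` scores higher than its row of `B`.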

Algorithm 1 Adaptive selective kernel convolution

Input: feature map $\mathit{\boldsymbol{X}}$

Procedure:

$\overline{\boldsymbol{U}}=\delta(\beta(\mathrm{Conv}_{3 \times 3}(\boldsymbol{X})))$, $\tilde{\boldsymbol{U}}=\delta(\beta(\mathrm{Conv}_{5 \times 5}(\boldsymbol{X})))$

$\boldsymbol{U}=\overline{\boldsymbol{U}}+\tilde{\boldsymbol{U}}$

for $k = 0, 1, \ldots, C-1$ do

$m_{k}=\psi_{\mathrm{gp}}(\boldsymbol{U}_{k})=\frac{1}{H \times W} \sum\limits_{i=1}^{H} \sum\limits_{j=1}^{W} U_{k}(i, j)$

end for

$\boldsymbol{n}=\psi_{\mathrm{fc}}(\boldsymbol{m})=\delta(\beta(\boldsymbol{w} \boldsymbol{m}))$

for $k = 0, 1, \ldots, C-1$ do

$a_{k}=\frac{\mathrm{e}^{\boldsymbol{A}_{k} \boldsymbol{n}}}{\mathrm{e}^{\boldsymbol{A}_{k} \boldsymbol{n}}+\mathrm{e}^{\boldsymbol{B}_{k} \boldsymbol{n}}}$, $b_{k}=\frac{\mathrm{e}^{\boldsymbol{B}_{k} \boldsymbol{n}}}{\mathrm{e}^{\boldsymbol{A}_{k} \boldsymbol{n}}+\mathrm{e}^{\boldsymbol{B}_{k} \boldsymbol{n}}}$

$\boldsymbol{Y}_{k}=a_{k} \cdot \overline{\boldsymbol{U}}_{k}+b_{k} \cdot \tilde{\boldsymbol{U}}_{k}$, $a_{k}+b_{k}=1$

end for

Output: feature map $\mathit{\boldsymbol{Y}}$

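3 Method

3.1 Residual mapping connection

The final goal of the network is to learn the mapping from a rainy image to a rain-free image. Because rainy pixels have higher values than background pixels, Fu et al. (2017a) split the image into a high-frequency part and a low-frequency part. This narrows the mapping range during training, but it adds an extra step, increases the complexity of the algorithm, and does not preserve image details well. Borrowing the idea of residual networks (He et al., 2016a, b), the proposed network learns the rain streak features directly from the original image.

Learning the rain streaks from the original input image introduces no extra operation. It also spares the network from fitting rain-free regions during training, which lowers the training difficulty and better preserves the details of the original image.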

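3.2 Network structure

To train the rain streak map and the refine factor at the same time, the selective kernel convolution network using a residual refine factor (SKRF) is designed as shown in Fig. 2. The input is a rainy RGB image, from which a convolution layer first extracts features. The network then splits into two branches. One is the rain streak detection network, which uses the adaptive selective kernel convolution network to learn features with different receptive fields. The other is the refine factor network, which uses ordinary convolutions; 1 is added to every value of its output feature map to give the final refine factor. The final residual is determined jointly by the rain streak map and the refine factor.

Fig. 2 Network structure of SKRF

3.3 Network parameter settings

The parameter settings of the network comprise those of the rain streak detection network and those of the refine factor network.

The parameter settings of the rain streak detection network are listed in Table 1. The first layer is the feature extraction layer; it uses a large 9 × 9 convolution kernel (Dong et al., 2016) and outputs 128 feature maps to capture sufficiently rich information from the input image. Both branches use a 1 × 1 convolution to reduce the dimension of the feature maps, which lowers the computational cost and adds non-linearity. Experiments showed that a dimension of 48 gives the best trade-off between the number of parameters and the convergence of the loss function during training.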

Table 1 Rain streak network structure settings

Layer	Convolution	Output dimension	Activation	Batch normalization
1	9×9	128	ReLU	true
2	1×1	48	ReLU	true
3	SK	48	ReLU	true
4	SK	48	ReLU	true
5	3×3	3	-	false
Note: "-" indicates that no activation function is used.

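Two layers of the rain streak detection network use the adaptive selective kernel convolution network. In the selective kernel network, the two convolutions are an ordinary 3 × 3 convolution and a 3 × 3 dilated convolution with a dilation rate of 2 (Yu et al., 2017), which enlarges the receptive field without increasing the number of kernel parameters. Both convolutions are followed by ReLU activation and batch normalization. In this experiment the fused vector of the selective kernel convolution is $\mathit{\boldsymbol{n}}∈ {\bf{R}}^{d×1}$ with $d$=16.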
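The relation between the dilated 3 × 3 branch and the 5 × 5 kernel in Eq. (3) can be checked with a short sketch. It relies on the standard formula for the extent of a dilated kernel, k + (k - 1)(d - 1); the function names are hypothetical, biases are ignored, and the 1-D convolution is only a toy illustration of how dilation spaces the taps.

```python
def effective_kernel_size(kernel_size, dilation):
    """Spatial extent covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def kernel_weights(kernel_size, in_channels, out_channels):
    """Number of weights in a square 2-D convolution kernel, bias excluded."""
    return kernel_size * kernel_size * in_channels * out_channels


def dilated_conv1d(signal, kernel, dilation):
    """Valid (no padding) 1-D cross-correlation with taps spaced `dilation` apart."""
    span = (len(kernel) - 1) * dilation
    return [sum(kernel[t] * signal[i + t * dilation] for t in range(len(kernel)))
            for i in range(len(signal) - span)]


print(effective_kernel_size(3, 2))                       # 5
print(kernel_weights(3, 48, 48), kernel_weights(5, 48, 48))
print(dilated_conv1d([1, 2, 3, 4, 5, 6], [1, 1, 1], 2))  # [9, 12]
```

A 3 × 3 kernel with dilation 2 therefore covers the same 5 × 5 window as the second branch in Eq. (3) while keeping the weight count of a 3 × 3 kernel.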

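The parameter settings of the refine factor network are listed in Table 2. It uses ordinary convolutions with 48 feature maps. The first four layers of the network all use batch normalization (Ioffe and Szegedy, 2015) and the ReLU activation function (Nair and Hinton, 2010). The fifth layer is the output layer of both the rain streak network and the refine factor network. The refine factor output layer uses a Sigmoid function, which outputs values in [0, 1]; adding the constant 1 gives a final refine factor in the range [1, 2]. The rain streak output follows the usual practice and uses no activation function. The rain streak map and the refine factor are multiplied to obtain the residual, and the final result is the input image minus this residual, as given in Eq. (10).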

Table 2 Refine network structure settings

Layer	Convolution	Output dimension	Activation	Batch normalization
1	9×9	128	ReLU	true
2	1×1	48	ReLU	true
3	3×3	48	ReLU	true
4	3×3	48	ReLU	true
5	3×3	3	Sigmoid	false

$\boldsymbol{B}=\boldsymbol{I}-(1+\alpha) \cdot \boldsymbol{R} $

(10)
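The output stage described by Eq. (10) can be sketched as follows. This is an illustrative sketch only: the inputs are flat lists of per-pixel values, the names `refine_factor` and `derain` are hypothetical, and the pre-activation values of the refine branch are made up for the example.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def refine_factor(pre_activation):
    """1 + alpha with alpha = sigmoid(pre_activation), so the factor lies between 1 and 2."""
    return 1.0 + sigmoid(pre_activation)


def derain(rainy, streaks, refine_pre_activations):
    """Eq. (10): B = I - (1 + alpha) * R, element-wise on flat lists."""
    return [i - refine_factor(z) * r
            for i, r, z in zip(rainy, streaks, refine_pre_activations)]


# Hypothetical pixel values in [0, 1] and refine-branch pre-activations.
rainy = [0.8, 0.5, 0.3]
streaks = [0.2, 0.0, 0.1]
pre_activations = [0.0, 3.0, -50.0]
print(derain(rainy, streaks, pre_activations))
```

Where the rain streak value is zero the input pixel passes through unchanged, which is how the residual formulation preserves background detail.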

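3.4 Datasets and training settings

The data are divided into a training set and a test set. During training, the convolutional network needs rainy images and their corresponding rain-free images. In practice it is very difficult to capture a rain-free image and a rainy image of the same scene, because even if the camera position is kept exactly the same, conditions such as brightness differ between shots. Most existing deep-learning-based rain removal algorithms therefore train on synthetic rainy images (Fu et al., 2017a): rain streaks of different shapes and sizes are added to rain-free images in Photoshop, and the synthesis must stay close to real conditions.

The dataset collected in this paper contains 300 image pairs. Because the images differ in size, and to keep the data within a batch diverse, each batch is built by reading 4 images from a shuffled queue and then randomly selecting 32 regions of 33 × 33 pixels from each of them, which yields a training batch of 128 patches. In total there are 40 000 patch pairs. The RGB channels take values in [0, 255] and are normalized to [0, 1] during data processing to ease training. The loss function is the mean squared error and the batch size is 128. The learning rate decays exponentially with an initial value of 0.01, a decay rate of 0.9, and a decay step of 10 k; training runs for 500 k iterations with an adaptive learning-rate optimizer.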
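The learning-rate schedule and the patch sampling described above can be sketched as follows. The text does not say whether the decay is applied continuously or in discrete steps, so the `staircase` switch is an assumption, as are the function names and the 321 × 481 image size used in the example.

```python
import random


def exponential_decay(step, initial_lr=0.01, decay_rate=0.9,
                      decay_steps=10_000, staircase=False):
    """lr = initial_lr * decay_rate ** (step / decay_steps).

    With staircase=True the exponent is floored, so the rate drops once
    every `decay_steps` iterations (assumption: the source does not specify).
    """
    exponent = step // decay_steps if staircase else step / decay_steps
    return initial_lr * decay_rate ** exponent


def sample_patch_corners(height, width, patch=33, count=32, rng=random):
    """Top-left corners of `count` random patch x patch crops inside the image."""
    return [(rng.randint(0, height - patch), rng.randint(0, width - patch))
            for _ in range(count)]


# One batch: 4 images x 32 patches = 128 patches of 33 x 33 pixels.
rng = random.Random(0)
batch_corners = [sample_patch_corners(321, 481, rng=rng) for _ in range(4)]
print(sum(len(corners) for corners in batch_corners))  # 128
print(exponential_decay(0), exponential_decay(10_000), exponential_decay(500_000))
```

Under these settings the learning rate falls by roughly two orders of magnitude over the 500 k iterations.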

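4 Experimental results and analysis

4.1 Comparison of different modules

To demonstrate the effectiveness of each module, comparison networks containing different modules are set up. With the number of parameters kept roughly equal, the selective kernel module, residual learning, and the refine factor are each added to a conventional convolutional network.

The comparison networks are listed in Table 3. PCNN (plain convolutional neural network) uses ordinary convolutions, SK (selective kernel) uses selective kernel convolution, and SKR (selective kernel and residual) uses the selective kernel network with a residual connection. In the experiments, these three networks, which do not use the refine factor, are given more feature maps so that their parameter counts stay roughly equal. SKRF uses the selective kernel network together with the residual mapping connection and the refine factor.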

Table 3 Comparison of network structure settings

Method	Selective kernel	Residual	Refine factor
PCNN	No	No	No
SK	Yes	No	No
SKR	Yes	Yes	No
SKRF (ours)	Yes	Yes	Yes

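In the test stage, the public synthetic dataset Rain12 (Li et al., 2016) and a self-made synthetic test set of 10 images (https://pan.baidu.com/s/1Iw5f6jtTbIkNt-EptVNB0 g), referred to as the local dataset, are used. Peak signal to noise ratio (PSNR) (Huynh-Thu and Ghanbari, 2008) and structural similarity (SSIM) (Wang et al., 2004) serve as evaluation metrics. Both scores are positively correlated with rain removal quality: the higher the value, the better the result.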
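As a reminder of how the first metric is computed, PSNR follows the standard definition 10·log10(peak² / MSE). The sketch below assumes images flattened to lists with values in [0, 1], so the peak value is 1; the function names and sample values are hypothetical, and SSIM is omitted because it requires local windowed statistics.

```python
import math


def mean_squared_error(reference, estimate):
    """Mean of squared differences between two equally sized flat lists."""
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)


def psnr(reference, estimate, peak=1.0):
    """Peak signal to noise ratio in dB; infinite for identical inputs."""
    err = mean_squared_error(reference, estimate)
    if err == 0.0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)


# Hypothetical ground truth and de-rained output, four pixels each.
clean = [0.5, 0.5, 0.5, 0.5]
derained = [0.6, 0.4, 0.5, 0.5]
print(round(psnr(clean, derained), 2))
```

For 8-bit images stored in [0, 255] the same function applies with `peak=255.0`.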

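Table 4 shows the test results of the different methods on Rain12 and the local dataset. The SK module and the refine factor both improve the network to different degrees. Because the rain streak characteristics of the datasets differ, networks with the SK module perform better on the local dataset, where the rain streaks are relatively long and the fusion of selective kernels works better. In Rain12 the rain streaks are brighter, and the network with residual mapping scores slightly higher. With the refine factor, PSNR and SSIM reach their highest values on both datasets, which demonstrates that the refine factor improves the algorithm.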

Table 4 Test results of algorithms on Rain12 and our dataset

Method	Rain12 PSNR/dB	Rain12 SSIM	Local PSNR/dB	Local SSIM
PCNN	32.68	0.966 1	28.04	0.944 2
SK	33.74	0.963 2	30.10	0.960 1
SKR	34.10	0.960 2	29.96	0.960 9
SKRF (ours)	34.62	0.970 6	30.17	0.961 4
Note: the best result in each column (bold in the original) is in the SKRF row.

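4.2 Algorithm verification and comparison

To verify the effectiveness of the SKRF network, it is compared with two traditional algorithms, DSC (discriminative sparse coding) (Luo et al., 2015) and LP (layer priors) (Li et al., 2016), and two deep-learning-based algorithms, DT (DetailNet) (Fu et al., 2017b) and DR (deraining convolution neural network, DRCNN) (Wang et al., 2018). The test results on Rain12 and the local dataset are listed in Table 5. The proposed method achieves the best values on both datasets, which demonstrates its advantage.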

Table 5 Test comparison on Rain12 dataset and ours

Method	Rain12 PSNR/dB	Rain12 SSIM	Local PSNR/dB	Local SSIM
DSC	28.95	0.883 6	26.69	0.889 1
LP	32.21	0.939 2	24.22	0.883 4
DR	31.36	0.961 6	29.94	0.950 4
DT	34.17	0.968 2	27.61	0.932 5
SKRF (ours)	34.62	0.970 6	30.17	0.961 4
Note: the best result in each column (bold in the original) is in the SKRF row.

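4.3 Comparison of rain removal results on synthetic images

The quality of a de-rained image depends on how well rain streaks are removed and the background is preserved. To verify the effectiveness of the proposed algorithm, it is tested on both a synthetic dataset and real images.

Fig. 3 shows results on some rainy images from the public synthetic dataset Rain12. The traditional DSC and LP algorithms do not remove rain well, and LP also over-smooths the image. DRCNN still leaves some rain streaks. DetailNet and the proposed SKRF both give very good results with almost no visible residual rain streaks, but at the oars in the figure DetailNet leaves some details unprocessed and shows some noise. Overall, the proposed SKRF network performs better on Rain12.

Fig. 3 Rain removal results on synthetic image ((a) rainy image; (b) DSC; (c) LP; (d) DRCNN; (e) DetailNet; (f) SKRF(ours))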

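4.4 Comparison of results on real rainy images

To demonstrate the generalization ability of the proposed algorithm, commonly used real rainy images are tested. Because real rainy images have no ground-truth images, performance can only be assessed visually; some results are shown in Fig. 4. The results of the traditional DSC and LP algorithms show obvious residual rain streaks, and LP again over-smooths the image. DRCNN and DetailNet remove rain somewhat better but still leave fine rain streaks; the DRCNN result is darker, and DetailNet loses some details. For example, in Fig. 4(e) DetailNet mistakes the wrinkles on the person's arm for rain streaks, whereas SKRF preserves them well and shows a clear advantage in both rain streak removal and background preservation.

Fig. 4 Rain removal results on real image ((a) rainy image; (b) DSC; (c) LP; (d) DRCNN; (e) DetailNet; (f) SKRF(ours))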

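5 Conclusion

This paper builds an end-to-end neural network for single image rain removal. A selective kernel network is built that adaptively selects the channel-wise feature information of convolution kernels of different sizes. A refine factor is added to the basic rain image model to express more precisely how each pixel is affected by rain streaks. Residual learning is also adopted so that the rain streak map is learned directly from the rainy image, which narrows the mapping range and preserves the details of the original image.

The network is trained and tested on synthetic datasets, and real rainy images are used to verify the algorithm. The comparison experiments show that the selective kernel network's mechanism of adaptively adjusting the receptive field size improves single image rain removal. Introducing the residual learning network and the refine factor network corrects background misjudgment and thus gives better rain removal. Compared with the other algorithms tested, the proposed SKRF algorithm scores higher on the two objective metrics PSNR and SSIM and clearly improves the rain removal results on real rainy images.

However, deep-learning-based single image algorithms share a limitation: they need datasets containing rainy images and the corresponding rain-free images for training, and the feature coverage of existing datasets is limited. Although the dataset used in this paper contains as many rain streak characteristics as possible, its images are still synthetic and differ somewhat from images captured in real rain. Real rainy images with perfectly matching rain-free counterparts are hard to collect, and how to build such paired data under real rainy conditions remains a difficult problem. In addition, practical deployment, such as real-time rain removal for road surveillance, places high demands on efficiency. These issues are important directions for future research.

References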

Dong C, Loy C C, He K M, Tang X O. 2016. Image super-resolution using deep convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(2): 295-307 [DOI:10.1109/TPAMI.2015.2439281]

Fu X Y, Huang J B, Ding X H, Liao Y H, Paisley J. 2017a. Clearing the skies:a deep network architecture for single-image rain removal. IEEE Transactions on Image Processing, 26(6): 2944-2956 [DOI:10.1109/TIP.2017.2691802]

Fu X Y, Huang J B, Zeng D L, Huang Y and Paisley J. 2017b. Removing rain from single images via a deep detail network//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 3855-3863

Fu X Y, Liang B R, Huang Y, Ding X H, Paisley J. 2019. Lightweight pyramid networks for image deraining. IEEE Transactions on Neural Networks and Learning Systems, 31(6): 1794-1807 [DOI:10.1109/TNNLS.2019.2926481]

Goodfellow I J, Pouget-Abadie J and Mirza M. 2014. Generative adversarial networks//Advances in Neural Information Processing Systems, 3: 2672-2680[DOI: 10.13140/RG.2.2.31946.62401]

He K M, Zhang X Y, Ren S Q and Sun J. 2016a. Deep residual learning for image recognition//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 770-778[DOI: 10.1109/CVPR.2016.90]

He K M, Zhang X Y, Ren S Q and Sun J. 2016b. Identity mappings in deep residual networks//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, The Netherlands: Springer: 630-645[DOI: 10.1007/978-3-319-46493-0_38]

Howard A G, Zhu M L, Chen B, Kalenichenko D, Wang W J, Weyand T, Andreetto M and Adam H. 2017. Mobilenets: efficient convolutional neural networks for mobile vision applications[EB/OL]. https://arxiv.org/pdf/1704.04861.pdf

Hu J, Shen L, Albanie S, Sun G and Wu E. 2018. Squeeze-and-excitation networks//Proceedings of 2018 IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 7132-7141[DOI: 10.1109/CVPR.2018.00745]

Huang G, Liu Z, van der Maaten L and Weinberger K Q. 2017. Densely connected convolutional networks//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 2261-2269[DOI: 10.1109/CVPR.2017.243]

Huynh Thu Q, Ghanbari M. 2008. Scope of validity of PSNR in image/video quality assessment. Electronics Letters, 44(13): 800-801 [DOI:10.1049/el:20080522]

Ioffe S and Szegedy C. 2015. Batch normalization: accelerating deep network training by reducing internal covariate shift//Proceedings of the 32nd International Conference on International Conference on Machine Learning. Lille, France: ACM: 448-456

Kang L W, Lin C W, Fu Y H. 2012. Automatic single-image-based rain streaks removal via image decomposition. IEEE Transactions on Image Processing, 21(4): 1742-1755 [DOI:10.1109/TIP.2011.2179057]

Kang L W, Lin C W and Fu Y H. 2011. Automatic single-image-based rain streaks removal via image decomposition. IEEE Transactions on Image Processing, 21(4): 1742-1755 [DOI:10.1109/TIP.2011.2179057]

Li X, Wang W H, Hu X L and Yang J. 2019. Selective kernel networks//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 510-519[DOI: 10.1109/CVPR.2019.00060]

Li Y, Tan R T, Guo X J, Li J B and Brown M S. 2016. Rain streak removal using layer priors//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 2736-2744[DOI: 10.1109/CVPR.2016.299]

Lin T Y, Dollár P, Girshick R, He K M, Hariharan B and Belongie S. 2017. Feature pyramid networks for object detection//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 2117-2125[DOI: 10.1109/CVPR.2017.106]

Luo Y, Xu Y and Ji H. 2015. Removing rain from a single image via discriminative sparse coding//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE: 3397-3405[DOI: 10.1109/ICCV.2015.388]

Nair V and Hinton G E. 2010. Rectified linear units improve restricted Boltzmann machines//Proceedings of the 27th International Conference on International Conference on Machine Learning. Haifa, Israel: Omnipress: 807-814

Ren S Q, He K M, Girshick R, Sun J. 2017a. Faster R-CNN:towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(6): 1137-1149 [DOI:10.1109/TPAMI.2016.2577031]

Ren W H, Tian J D, Han Z, Chan A and Tang Y D. 2017b. Video desnowing and deraining based on matrix decomposition//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 2838-3847[DOI: 10.1109/CVPR.2017.303]

Simonyan K and Zisserman A. 2014. Very deep convolutional networks for large-scale image recognition[EB/OL]. https://arxiv.org/pdf/1409.1556.pdf

Szegedy C, Vanhoucke V, Ioffe S, Shlens J and Wojna Z. 2016. Rethinking the inception architecture for computer vision//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 2818-2826[DOI: 10.1109/CVPR.2016.308]

Wang M H, Mai J M, Cai R C, Liang Y, Wan H. 2018. Single image deraining using deep convolutional networks. Multimedia Tools and Applications, 77(19): 25905-25918 [DOI:10.1007/s11042-018-5825-8]

Wang M, Chen L, Liang Y, Hao Y, He H and Li C. 2020. Single image rain removal with reusing original input squeeze-and-excitation network. IET Image Processing: 1467-1474[DOI: 10.1049/iet-ipr.2019.0716]

Wang Z, Bovik A C, Sheikh H R, Simoncelli E P. 2004. Image quality assessment:from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4): 600-612 [DOI:10.1109/TIP.2003.819861]

Yang W H, Tan R T, Feng J S, Liu J Y, Guo Z M and Yan S C. 2017. Deep joint rain detection and removal from a single image//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 1685-1694[DOI: 10.1109/CVPR.2017.183]

Yu F, Koltun V and Funkhouser T. 2017. Dilated residual networks//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 636-644[DOI: 10.1109/CVPR.2017.75]

Zhang H and Patel V M. 2018. Density-aware single image de-raining using a multi-stream dense network//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, UT, USA: IEEE: 1685-1694[DOI: 10.1109/CVPR.2018.00079]

Zhang H, Sindagi V and Patel V M. 2019. Image de-raining using a conditional generative adversarial network. IEEE Transactions on Circuits and Systems for Video Technology: #99[DOI: 10.1109/TCSVT.2019.2920407]