Convolutional neural network model for SAR image target recognition

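Received: 2018-03-13; revised: 2018-06-20. First author: Lin Zhilong, born in 1994, male, master's student in the Department of Unmanned Aerial Vehicle Engineering, Army Engineering University; main research interest: SAR image target recognition. E-mail: 601386022@qq.com. Wang Changlong, male, professor and doctoral supervisor; main research interest: UAV information processing and transmission technology. E-mail: wang-oec@126.com. Hu Yongjiang, male, associate professor and master's supervisor; main research interests: computer vision and UAV image information processing. E-mail: 460500180@qq.com. Zhang Yan, male, Ph.D.; main research interests: computer vision and UAV information processing. E-mail: hillwind@126.com. CLC number: TN957.52; document code: A; article ID: 1006-8961(2018)11-1733-09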

Convolution neural network model for SAR image target recognition
Lin Zhilong, Wang Changlong, Hu Yongjiang, Zhang Yan
Department of Unmanned Aerial Vehicle, Army Engineering University, Shijiazhuang 050003, China

# Abstract

Objective Synthetic aperture radar (SAR) is an important means of earth observation because of its all-weather, day-and-night, and penetrating imaging capabilities, and it is used extensively in battlefield detection and intelligence acquisition. SAR is a coherent electromagnetic imaging system, so a SAR image is highly variable and contains strong speckle noise, both of which make target recognition difficult. Manual interpretation of the large volume of SAR image data is difficult given the diversity of SAR image acquisition methods, and automatic target recognition can markedly improve how efficiently these data are used. Current SAR target recognition algorithms have two main problems. First, the low-level features used for recognition, such as edges, corners, contours, and texture, are not sufficiently representative. Second, traditional methods depend on an effective filtering algorithm, but filtering is time-consuming. This study presents a convolutional neural network model that addresses the time-consuming filtering process and the low recognition accuracy of SAR target recognition.

Method First, the network structure of the feature extraction part was designed specifically for the characteristics of SAR images, which differ from optical images in three respects: a SAR image reflects the radar echo intensity of the target, so it is a grayscale image and carries less feature information than an optical image; speckle noise is unavoidable; and the target occupies few pixels because of the limited resolution of the SAR image. Given these characteristics, a convolutional neural network for SAR target recognition must use small convolution kernels and an appropriate number of convolutional layers. The feature extraction part of the proposed model consists of four convolutional layers, four nonlinear layers, and two pooling layers. Second, an L2 norm was added to the cost function to improve the noise immunity and generalization of the model, and a theoretical derivation shows how the L2 norm achieves this. Third, Dropout was used to reduce the computational complexity of the network and to improve generalization. Dropout is a regularization technique that reduces overfitting by preventing complex co-adaptations on the training data, and it is an efficient way to perform model averaging with neural networks. Finally, the influence of filtering on the convergence speed and accuracy of the network was investigated.

Result Experimental data were taken from the United States Moving and Stationary Target Acquisition and Recognition (MSTAR) database. In the 10-class recognition experiment, the overall recognition rate (including variants) rose from 93.76% to 98.10% with the improved convolutional neural network. The improved feature extraction structure extracts more effective target features and thus raises the accuracy of the model. Recognition accuracy for target variants also improved considerably, which indicates that L2 regularization and Dropout enhance generalization. Comparative experiments were set up to show the effect of each structural improvement and optimization. Accuracy fell from 98.10% to 97.06% when the first layer used one 9×9 convolution kernel instead of two cascaded 5×5 kernels. Accuracy rose from 94.91% to 96.19% with L2 regularization, which indicates that L2 regularization effectively improves recognition accuracy. Dropout widened the fluctuation range of the recognition rate and raised the highest accuracy reached. Noise suppression experiments analyzed how three filtering methods, namely Lee, bilateral, and Gamma MAP (maximum a posteriori) filtering, affect the training process and the results of the model. They verified that the feature extraction process of the convolutional neural network itself suppresses SAR speckle noise, so the time spent on filtering can be saved. Filtering consumes additional time, does not speed up the convergence of training, and lowers recognition accuracy because it may remove features that are useful for recognition, such as target texture.

Conclusion The proposed convolutional neural network model improves the accuracy and generalization of the network, does not require a time-consuming filtering step, and is an effective method for SAR image target recognition.

# Key words

synthetic aperture radar (SAR); automatic target recognition (ATR); convolutional neural network (CNN); regularization; Dropout

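# 1 SAR image target recognition algorithm pipeline

1) Data input. The input images come from the Moving and Stationary Target Acquisition and Recognition (MSTAR) database, jointly funded by the US Defense Advanced Research Projects Agency and the Air Force Research Laboratory. The database consists of SAR image chips of 10 classes of ground military targets; each class contains several hundred radar image chips acquired from different viewing angles. Each image is 128×128 pixels.

2) Feature extraction. Convolutional layers, downsampling layers, and nonlinear layers extract features from the input SAR target chips. Each convolutional layer is followed by a ReLU nonlinear layer, which mitigates the vanishing gradient problem and thereby speeds up network training.

3) Classification based on features. The input to the Softmax classifier is the set of 120 feature maps, each 1×1 pixel, produced by the feature extraction part. The hidden layer has 120 neurons and uses the sigmoid function as its nonlinearity. The number of output neurons equals the number of target classes. The cross-entropy cost function evaluates the classification results, and batch gradient descent (BGD) with backpropagation iteratively adjusts the network parameters to improve recognition accuracy.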
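The classification stage in step 3 can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: only the layer sizes (120 inputs, 120 sigmoid hidden units, 10 softmax outputs) and the cross-entropy cost come from the text, while the random weights, the random feature vector, and the function names `classify` and `cross_entropy` are assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    z = z - np.max(z)          # subtract the maximum for numerical stability
    e = np.exp(z)
    return e / e.sum()

def classify(features, W1, b1, W2, b2):
    """Hidden sigmoid layer followed by a softmax output layer."""
    hidden = sigmoid(W1 @ features + b1)
    return softmax(W2 @ hidden + b2)

def cross_entropy(probs, label):
    """Cross-entropy cost for one sample with an integer class label."""
    return -np.log(probs[label])

# Hypothetical inputs: random weights and a random 120-element feature vector.
rng = np.random.default_rng(0)
features = rng.standard_normal(120)           # 120 feature maps of 1x1 pixel
W1, b1 = rng.standard_normal((120, 120)) * 0.1, np.zeros(120)
W2, b2 = rng.standard_normal((10, 120)) * 0.1, np.zeros(10)

probs = classify(features, W1, b1, W2, b2)    # one probability per target class
loss = cross_entropy(probs, label=3)          # label 3 is an arbitrary example
```

The softmax output sums to one, so the cost is simply the negative log of the probability assigned to the true class.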

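# 2.1 Structure of the feature extraction network

SAR images differ greatly from optical images, so the network structure must be designed around the characteristics of SAR images. First, a SAR image reflects the radar echo intensity of the target and is a grayscale image, so it carries less target feature information than an optical image. Second, speckle noise is unavoidable in SAR images. Third, the target occupies few pixels because of the limited resolution of SAR images.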

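Table 1 Parameter counts and multiplication counts for different convolution kernel sizes

| | 9×9 kernel | Two cascaded 5×5 kernels |
|---|---|---|
| Number of parameters | 81 | 50 |
| Number of multiplications | $81(s-8)^2$ | $25(s-4)^2 + 25(s-8)^2$ |

Note: $s$ is the size of the input image.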
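The entries of Table 1 can be checked with a few lines of arithmetic. The sketch below assumes a single input channel, stride 1, no padding, and no bias terms, which is what the table's expressions imply; the function names are hypothetical, and $s = 128$ is taken from the MSTAR chip size stated earlier.

```python
def single_kernel_cost(s, k=9):
    """Parameters and multiplications for one k x k kernel on an s x s image."""
    params = k * k
    out = s - k + 1                      # valid convolution output size
    return params, params * out ** 2

def cascaded_kernel_cost(s, k=5):
    """Parameters and multiplications for two cascaded k x k kernels."""
    params = 2 * k * k
    out1 = s - k + 1                     # output size after the first kernel
    out2 = out1 - k + 1                  # output size after the second kernel
    return params, k * k * out1 ** 2 + k * k * out2 ** 2

s = 128
print(single_kernel_cost(s))             # (81, 1166400)
print(cascaded_kernel_cost(s))           # (50, 744400)
```

Both options cover the same 9×9 receptive field, but under these assumptions the cascaded form needs fewer parameters and fewer multiplications.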

# 3.1 Regularization term

 $L = {L_0} + \frac{\lambda }{{2m}}\sum\limits_i^n {w_i^2}$ (1)
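As a numerical illustration of Eq. (1), the following sketch adds the L2 penalty to an unregularized cost. It assumes that $m$ is the number of training samples and that the sum runs over every weight in the network; the function name, the toy weights, and the values of $\lambda$ and $m$ are hypothetical.

```python
import numpy as np

def l2_regularized_cost(base_cost, weights, lam, m):
    """Eq. (1): L = L0 + lam / (2 m) * sum of squared weights."""
    penalty = lam / (2.0 * m) * sum(np.sum(w ** 2) for w in weights)
    return base_cost + penalty

# Toy weights for two layers; the squared weights sum to 5.5.
weights = [np.array([1.0, -2.0]), np.array([[0.5, 0.5]])]
cost = l2_regularized_cost(base_cost=0.30, weights=weights, lam=0.1, m=32)
print(cost)  # 0.30859375
```

Biases are left out of the penalty here, which matches Eq. (1), where only the weights $w_i$ appear.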

 $y = \sigma (b + \sum\limits_i^n {{w_i}{x_i}} )$ (2)

 $\Delta y = \sigma ({w_i}\Delta {x_i})$ (3)

 $\sum\limits_i^n {{w_i}} = C$ (4)

 $\mathop {{\rm{arg}}\;{\rm{min}}}\limits_w \sum\limits_i^n {w_i^2}$ (5)

 $F\left( {w, \alpha } \right) = \sum\limits_i {w_i^2} + \alpha (\sum\limits_i {{w_i}} - C)$ (6)
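The minimizer of Eq. (5) under the constraint of Eq. (4) follows from setting the partial derivatives of the Lagrangian in Eq. (6) to zero. The lines below are a sketch of that standard step, assuming the sums run over $i = 1, \dots, n$:

```latex
\begin{aligned}
\frac{\partial F}{\partial w_i} &= 2 w_i + \alpha = 0
  \;\Rightarrow\; w_i = -\frac{\alpha}{2}, \qquad i = 1, \dots, n \\
\frac{\partial F}{\partial \alpha} &= \sum_{i=1}^{n} w_i - C = 0
  \;\Rightarrow\; -\frac{n\alpha}{2} = C
  \;\Rightarrow\; \alpha = -\frac{2C}{n} \\
w_i &= \frac{C}{n}, \qquad \sum_{i=1}^{n} w_i^2 = \frac{C^2}{n}
\end{aligned}
```

For a fixed weight sum $C$, the squared norm is smallest when all weights are equal, so the penalty discourages any single large weight. By Eq. (3), a smaller $w_i$ passes less of a perturbation $\Delta x_i$ through to the output.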

 $\mathit{\boldsymbol{z}}_i^{(l + 1)} = w_i^{(l + 1)}{\mathit{\boldsymbol{y}}^{(l)}} + b_i^{(l + 1)}$ (8)

 $\mathit{\boldsymbol{y}}_i^{(l + 1)} = f(\mathit{\boldsymbol{z}}_i^{(l + 1)})$ (9)

 $\mathit{\boldsymbol{r}}_i^{(l)} \sim {\rm{Bernoulli}}\left( p \right)$ (10)

 ${{\mathit{\boldsymbol{\tilde y}}}^{(l)}} = {\mathit{\boldsymbol{r}}^{(l)}} \odot {\mathit{\boldsymbol{y}}^{(l)}}$ (11)

 $\mathit{\boldsymbol{z}}_i^{(l + 1)} = w_i^{(l + 1)}{{\mathit{\boldsymbol{\tilde y}}}^{(l)}} + b_i^{(l + 1)}$ (12)

 $\mathit{\boldsymbol{y}}_i^{(l + 1)} = f(\mathit{\boldsymbol{z}}_i^{(l + 1)})$ (13)
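Eqs. (10) to (13) describe a forward pass through one layer with Dropout, which can be sketched in NumPy as follows. The sketch assumes that $p$ is the probability that a unit is kept ($r_i = 1$), as Eq. (10) implies, and uses ReLU as the activation $f$; the function name, layer sizes, and values are hypothetical, and the rescaling normally applied at test time is not shown.

```python
import numpy as np

def dropout_forward(y_prev, W, b, p, rng, f=lambda z: np.maximum(z, 0.0)):
    """One training-time forward pass through a layer with Dropout."""
    r = rng.binomial(1, p, size=y_prev.shape)   # Eq. (10): r ~ Bernoulli(p)
    y_tilde = r * y_prev                        # Eq. (11): mask the inputs
    z = W @ y_tilde + b                         # Eq. (12): affine transform
    return f(z), r                              # Eq. (13): apply activation

rng = np.random.default_rng(0)
y_prev = np.ones(8)                 # outputs of layer l
W = np.full((4, 8), 0.5)            # weights of layer l + 1
b = np.zeros(4)
y_next, mask = dropout_forward(y_prev, W, b, p=0.5, rng=rng)
```

With all-ones inputs and constant weights of 0.5, every output equals 0.5 times the number of retained units, which makes the effect of the mask easy to see.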

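# 4.1.2 Dataset

Table 2 Composition of the experimental data set

| Target type | 17° (training) | 15° (test) | Total |
|---|---|---|---|
| 2S1 | 299 | 274 | 573 |
| BMP2(9563) | 233 | 195 | 428 |
| BMP2(9566) | - | 196 | 196 |
| BMP2(C21) | - | 196 | 196 |
| BRDM2 | 298 | 274 | 572 |
| BTR60 | 256 | 195 | 451 |
| BTR70 | 233 | 196 | 429 |
| D7 | 299 | 274 | 573 |
| T62 | 299 | 273 | 572 |
| T72(132) | 232 | 196 | 428 |
| T72(812) | - | 195 | 195 |
| T72(S7) | - | 191 | 191 |
| ZIL131 | 299 | 274 | 573 |
| ZSU234 | 299 | 274 | 573 |
| Total | 2747 | 3203 | 5950 |

Note: a hyphen (-) indicates that no data are available.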

# 4.1.3 Convolutional neural network hyperparameter settings

$\mathit{base\_lr} \times (1 + \mathit{gamma} \times \mathit{iter})^{-\mathit{power}}$
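The learning-rate expression above can be evaluated directly. In the sketch below the function name and the values of `base_lr`, `gamma`, and `power` are hypothetical, chosen only to show the shape of the decay; they are not the settings used in the experiments.

```python
def inv_learning_rate(base_lr, gamma, power, iteration):
    """Learning rate at a given iteration: base_lr * (1 + gamma * iter) ** (-power)."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)

# Hypothetical hyperparameters, for illustration only.
base_lr, gamma, power = 0.01, 0.0001, 0.75

for it in (0, 5000, 10000):
    print(it, inv_learning_rate(base_lr, gamma, power, it))
```

The rate starts at `base_lr` and decreases smoothly as the iteration count grows, with `gamma` and `power` controlling how quickly it falls.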

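# 4.1.4 Experimental procedure

For recognition of the 10 classes of SAR targets, the MSTAR images acquired at a depression angle of 17° serve as the training set. BGD updates the network parameters with 32 images per batch. The training set contains 2725 images, so one epoch takes about 86 iterations; training is set to 10,000 iterations, or about 116 epochs. During training, the model is evaluated on the entire training set every 86 iterations.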
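The iteration and epoch counts quoted above follow from the batch size and the training set size. A short check, using only the numbers given in the text (the variable names are illustrative):

```python
import math

train_images = 2725        # training set size stated in the text
batch_size = 32            # images per batch
total_iterations = 10_000  # configured number of iterations

iters_per_epoch = math.ceil(train_images / batch_size)   # 86
epochs = total_iterations / iters_per_epoch              # about 116.3

print(iters_per_epoch, round(epochs, 1))
```

The last batch of each epoch is only partly filled, which is why the count is rounded up to 86.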

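# 4.2.1 Results and analysis of 10-class target recognition

Table 3 MSTAR target identification confusion matrix

| Test target \ Recognition result | 2S1 | BMP2 | BRDM2 | BTR60 | BTR70 | D7 | T62 | T72 | ZIL131 | ZSU234 | Accuracy/% |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 2S1 | 259 | 12 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 94.53 |
| BMP2(9563) | 0 | 194 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 99.48 |
| BMP2(9566) | 0 | 185 | 0 | 0 | 2 | 0 | 0 | 9 | 0 | 0 | 94.39 |
| BMP2(C21) | 0 | 184 | 1 | 2 | 0 | 0 | 0 | 9 | 0 | 0 | 93.88 |
| BRDM2 | 0 | 0 | 270 | 0 | 0 | 0 | 0 | 3 | 0 | 1 | 98.54 |
| BTR60 | 1 | 2 | 3 | 187 | 2 | 0 | 0 | 0 | 0 | 0 | 95.90 |
| BTR70 | 0 | 0 | 0 | 0 | 196 | 0 | 0 | 0 | 0 | 0 | 100 |
| D7 | 2 | 0 | 0 | 0 | 0 | 272 | 0 | 0 | 0 | 0 | 99.27 |
| T62 | 0 | 0 | 1 | 0 | 0 | 0 | 272 | 0 | 0 | 0 | 99.63 |
| T72(132) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 196 | 0 | 0 | 100 |
| T72(812) | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 192 | 0 | 0 | 98.46 |
| T72(S7) | 0 | 2 | 0 | 0 | 1 | 0 | 0 | 188 | 0 | 0 | 98.43 |
| ZIL131 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 273 | 1 | 99.64 |
| ZSU234 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 274 | 100 |
| Overall recognition rate/% | | | | | | | | | | | 98.10 |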
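The overall recognition rate in Table 3 can be reproduced from the confusion matrix counts. In the sketch below the counts are copied from the table; the variable names and the mapping of each variant to its parent class (for example, all BMP2 serial numbers count as correct when recognized as BMP2) are assumptions of the example.

```python
classes = ["2S1", "BMP2", "BRDM2", "BTR60", "BTR70",
           "D7", "T62", "T72", "ZIL131", "ZSU234"]

# (test target, true class, counts per recognized class in the order above)
rows = [
    ("2S1",        "2S1",    [259, 12, 1, 1, 0, 0, 0, 0, 1, 0]),
    ("BMP2(9563)", "BMP2",   [0, 194, 0, 0, 0, 0, 0, 1, 0, 0]),
    ("BMP2(9566)", "BMP2",   [0, 185, 0, 0, 2, 0, 0, 9, 0, 0]),
    ("BMP2(C21)",  "BMP2",   [0, 184, 1, 2, 0, 0, 0, 9, 0, 0]),
    ("BRDM2",      "BRDM2",  [0, 0, 270, 0, 0, 0, 0, 3, 0, 1]),
    ("BTR60",      "BTR60",  [1, 2, 3, 187, 2, 0, 0, 0, 0, 0]),
    ("BTR70",      "BTR70",  [0, 0, 0, 0, 196, 0, 0, 0, 0, 0]),
    ("D7",         "D7",     [2, 0, 0, 0, 0, 272, 0, 0, 0, 0]),
    ("T62",        "T62",    [0, 0, 1, 0, 0, 0, 272, 0, 0, 0]),
    ("T72(132)",   "T72",    [0, 0, 0, 0, 0, 0, 0, 196, 0, 0]),
    ("T72(812)",   "T72",    [0, 3, 0, 0, 0, 0, 0, 192, 0, 0]),
    ("T72(S7)",    "T72",    [0, 2, 0, 0, 1, 0, 0, 188, 0, 0]),
    ("ZIL131",     "ZIL131", [0, 0, 0, 0, 0, 0, 0, 0, 273, 1]),
    ("ZSU234",     "ZSU234", [0, 0, 0, 0, 0, 0, 0, 0, 0, 274]),
]

# Correct predictions are the counts in the column of the true class.
correct = sum(counts[classes.index(cls)] for _, cls, counts in rows)
total = sum(sum(counts) for _, _, counts in rows)
overall = 100.0 * correct / total

print(correct, total, f"{overall:.2f}")  # 3142 3203 98.10
```

The total of 3203 test images matches the 15° column of Table 2.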

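Table 4 Comparison of MSTAR target recognition results

| Test target | Accuracy (including variants)/%: method of Ref. [12] | Accuracy (including variants)/%: proposed method |
|---|---|---|
| 2S1 | 93.07 | 94.54 |
| BMP2(9563) | 98.97 | 99.48 |
| BMP2(9566) | 88.27 | 94.39 |
| BMP2(C21) | 85.71 | 93.88 |
| BRDM2 | 93.80 | 98.54 |
| BTR60 | 97.44 | 95.90 |
| BTR70 | 99.49 | 100 |
| D7 | 93.43 | 99.27 |
| T62 | 94.87 | 99.63 |
| T72(132) | 98.98 | 100 |
| T72(812) | 78.97 | 98.46 |
| T72(S7) | 85.86 | 98.43 |
| ZIL131 | 99.64 | 99.64 |
| ZSU234 | 99.27 | 100 |
| Overall recognition rate/% | 93.76 | 98.10 |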

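1) The proposed model reaches an overall recognition rate of 98.10% on SAR image targets. The improved feature extraction network structure extracts more effective target features, which raises the accuracy of the model.

2) The proposed model is also considerably more accurate on SAR target variants; the optimization of the Softmax classifier strengthens the generalization performance of the model.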

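# 4.2.2 Results and analysis of the optimization comparison experiments

Table 5 Optimization comparison of experimental results

| 9×9 kernel | L2 regularization | Dropout | Accuracy (including variants)/% |
|---|---|---|---|
| √ | √ | √ | 93.63~97.06 |
| | √ | √ | 95.91~98.10 |
| | √ | | 95.94~96.19 |
| | | √ | 89.47~96.88 |
| | | | 91.04~94.91 |

Note: √ indicates that the corresponding method is used.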

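# 4.2.3 Results and analysis of convolutional neural network noise suppression

Table 6 Filter time and accuracy

| Data set | Filtering time/s | Accuracy (excluding variants)/% |
|---|---|---|
| Bilateral filtering | 1638.44 | 96.70~98.51 |
| Gamma MAP | 3080.65 | 96.99~98.68 |
| Lee filtering | 1391.21 | 97.57~99.05 |
| Original images | 0 | 96.95~99.18 |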

