SAR image segmentation integrating contextual encoding and feature fusion
2022, Vol. 27, No. 8, pp. 2527-2536
Received: 2021-02-04; Revised: 2021-04-19; Accepted: 2021-04-26; Published in print: 2022-08-16
DOI: 10.11834/jig.210056
目的 (Objective)
The central task of image segmentation is to find more powerful feature representations, but speckle noise in synthetic aperture radar (SAR) images hinders feature extraction. To strengthen the extraction and the full use of SAR image features, an improved fully convolutional segmentation network is proposed.
方法 (Method)
The network follows an encoder-decoder structure and consists mainly of two parts: a contextual encoding module and a feature fusion module. The contextual encoding module (CEM) enhances feature extraction by capturing local contextual and channel contextual information; the feature fusion module (FFM) extracts global contextual information from the high-level features, embeds it into the low-level features, and then feeds the enhanced low-level features into the decoder network, improving the accuracy of feature-map resolution recovery.
结果 (Result)
On two real SAR images, five segmentation algorithms based on fully convolutional neural networks were used for comparison, and CEM and CEM-FFM were evaluated separately. The results show that the proposed network significantly improves overall accuracy (OA), average accuracy (AA), and the Kappa coefficient over the five state-of-the-art comparison algorithms. The network performs best on OA: CEM achieves OA of 91.082% and 90.903% on the two SAR images, 0.948% and 0.941% higher than the best comparison algorithm, confirming the effectiveness of CEM. CEM-FFM further improves the results by 2.149% and 2.390% over CEM, verifying the effectiveness of FFM.
结论 (Conclusion)
Compared with other methods, the proposed segmentation network extracts image features more powerfully and better fuses the spatial information in low-level features with the semantic information in high-level features, giving the network stronger feature representation ability and more accurate segmentation results.
Objective
Pixel-wise segmentation of synthetic aperture radar (SAR) images is challenging because labeled SAR data are scarce and coherent speckle corrupts contextual information. Existing semantic segmentation algorithms face three problems. First, their ability to capture contextual information is insufficient: some algorithms ignore contextual information entirely or focus only on local spatial context derived from a few pixels, lacking global spatial context. Second, in pursuit of better performance, researchers have concentrated on the spatial dimension while ignoring the relationships between channels. Third, the high-level features extracted from the late layers of a neural network are rich in semantic information but have blurred spatial details, whereas the low-level features from the early layers contain fine pixel-level information but also more noise. Because the two kinds of features are isolated from each other, it is difficult to make full use of them, and the most common fusion strategies, simple concatenation or per-pixel addition, are not efficient.
Method
To address these problems, a segmentation algorithm based on a fully convolutional neural network (CNN) is proposed. The whole network follows an encoder-decoder structure and comprises a contextual encoding module and a feature fusion module, for feature extraction and feature fusion respectively. The contextual encoding module consists of a residual connection, a standard convolution, two dilated convolutions with different rates, and a channel attention mechanism. The residual connection is designed to mitigate network degradation. The standard convolution extracts local features with a 3 × 3 kernel; each convolution is followed by batch normalization and the ReLU nonlinear activation function to resist over-fitting. The dilated convolutions, with dilation rates of 2 and 3, enlarge the receptive field and capture multi-scale features and further local contextual features. The channel attention mechanism learns the importance of each feature channel, enhances useful features according to this importance, suppresses less useful ones, and models the dependencies between channels to obtain channel contextual information. The feature fusion module extracts global context from the high-level features: global average pooling compresses each feature map to a single real number, which has a global receptive field to some extent, and these numbers are then embedded into the low-level features. The enhanced low-level features are passed to the decoder network, which improves the effectiveness of up-sampling. This module greatly enhances the semantic representation of the low-level features without losing their spatial information and improves the effectiveness of the fusion. In the whole network, four contextual encoding modules and two feature fusion modules are stacked.
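The two modules described above might be sketched in PyTorch as follows. This is a minimal sketch inferred from the description only: the class names, the channel widths, the 1 × 1 fusion of the three convolution branches, and the squeeze-and-excitation form of the channel attention are assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention: learn one weight
    per channel, then enhance useful channels and suppress the rest."""

    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        w = x.mean(dim=(2, 3))                      # global average pool -> (N, C)
        w = self.fc(w).unsqueeze(-1).unsqueeze(-1)  # channel weights -> (N, C, 1, 1)
        return x * w                                # re-weight the channels


class CEM(nn.Module):
    """Contextual encoding module: a standard 3x3 convolution plus two
    dilated 3x3 convolutions (rates 2 and 3), channel attention, and a
    residual connection around the whole block."""

    def __init__(self, channels):
        super().__init__()

        def branch(dilation):
            # conv -> batch norm -> ReLU; spatial size is preserved
            return nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=dilation, dilation=dilation),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )

        self.standard = branch(1)
        self.dilated2 = branch(2)
        self.dilated3 = branch(3)
        self.fuse = nn.Conv2d(3 * channels, channels, 1)  # merge the three branches
        self.attention = ChannelAttention(channels)

    def forward(self, x):
        y = torch.cat([self.standard(x), self.dilated2(x), self.dilated3(x)], dim=1)
        return self.attention(self.fuse(y)) + x           # residual connection


class FFM(nn.Module):
    """Feature fusion module: global average pooling squeezes each
    high-level feature map to one number; the numbers re-weight the
    low-level features before they enter the decoder."""

    def __init__(self, low_channels, high_channels):
        super().__init__()
        self.project = nn.Sequential(
            nn.Conv2d(high_channels, low_channels, 1),  # match channel counts
            nn.Sigmoid(),
        )

    def forward(self, low, high):
        g = high.mean(dim=(2, 3), keepdim=True)  # global context -> (N, C_high, 1, 1)
        return low * self.project(g)             # embed it into the low-level features
```

Under these assumptions, `CEM(64)` maps an (N, 64, H, W) tensor to the same shape, and `FFM(64, 256)(low, high)` returns a tensor shaped like `low`, ready for the decoder.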
Result
Seven experimental schemes are compared. In the first scheme, the contextual encoding module (CEM) is used as the encoder block only; in the second, CEM is combined with the feature fusion module (FFM); the remaining five are related methods: SegNet, U-Net, the pyramid scene parsing network (PSPNet), FCN-DK3, and the context-aware encoder network (CAEN). The experiments use two real SAR images with information-rich scenes: Radarsat-2 Flevoland (RS2-Flevoland) and Radarsat-2 San-Francisco-Bay (RS2-SF-Bay). Overall accuracy (OA), average accuracy (AA), and the Kappa coefficient serve as the evaluation criteria. Against the five advanced algorithms above, the CEM algorithm achieves OA of 91.082% and 90.903% on the two real SAR images, respectively, and the CEM-FFM algorithm improves on CEM by a further 2.149% and 2.390%.
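The three evaluation criteria are standard and can all be computed from a confusion matrix; the following small NumPy sketch shows how (the function names are illustrative, not from the paper):

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """cm[i, j] counts pixels of true class i predicted as class j."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm


def overall_accuracy(cm):
    # fraction of all pixels classified correctly
    return np.trace(cm) / cm.sum()


def average_accuracy(cm):
    # mean of the per-class accuracies (recall of each true class)
    return np.mean(np.diag(cm) / cm.sum(axis=1))


def kappa(cm):
    # agreement corrected for chance: (p_o - p_e) / (1 - p_e)
    n = cm.sum()
    p_o = np.trace(cm) / n
    p_e = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / (n * n)
    return (p_o - p_e) / (1 - p_e)
```

For example, with `y_true = [0, 0, 1, 1]` and `y_pred = [0, 0, 1, 0]`, the confusion matrix is `[[2, 0], [1, 1]]`, giving OA = 0.75, AA = 0.75, and Kappa = 0.5.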
Conclusion
A CNN-based semantic segmentation algorithm is designed, composed of two parts: a contextual encoding module and a feature fusion module. The experiments demonstrate the advantages of the proposed method over the related algorithms. The proposed segmentation network has stronger feature extraction ability and integrates low-level and high-level features effectively, which improves the feature representation ability of the network and yields more accurate segmentation results.
Badrinarayanan V, Kendall A, Cipolla R. 2017. SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(12): 2481-2495 [DOI: 10.1109/TPAMI.2016.2644615]
Ding L, Tang H, Bruzzone L. 2019. Improving semantic segmentation of aerial images using patch-based attention [EB/OL]. [2021-01-16].
He K M, Zhang X Y, Ren S Q, Sun J. 2016. Deep residual learning for image recognition//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 770-778
Hu J, Shen L, Sun G. 2018. Squeeze-and-excitation networks//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 7132-7141
Liang W K, Wu Y, Li M, Cao Y C. 2020. High-resolution SAR image classification using context-aware encoder network and hybrid conditional random field model. IEEE Transactions on Geoscience and Remote Sensing, 58(8): 5317-5335 [DOI: 10.1109/TGRS.2019.2963699]
Lin M, Chen Q, Yan S C. 2014. Network in network [EB/OL]. [2021-01-16]. https://arxiv.org/pdf/1312.4400.pdf
Long J, Shelhamer E, Darrell T. 2015. Fully convolutional networks for semantic segmentation//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 3431-3440
Ma M, Liang J H, Guo M, Fan Y, Yin Y L. 2011. SAR image segmentation based on artificial bee colony algorithm. Applied Soft Computing, 11(8): 5205-5214 [DOI: 10.1016/j.asoc.2011.05.039]
Moreira A, Prats-Iraola P, Younis M, Krieger G, Hajnsek I, Papathanassiou K P. 2013. A tutorial on synthetic aperture radar. IEEE Geoscience and Remote Sensing Magazine, 1(1): 6-43 [DOI: 10.1109/MGRS.2013.2248301]
Mullissa A G, Persello C, Tolpekin V. 2018. Fully convolutional networks for multi-temporal SAR image classification//IGARSS 2018-2018 IEEE International Geoscience and Remote Sensing Symposium. Valencia, Spain: IEEE: 6635-6638
Ronneberger O, Fischer P, Brox T. 2015. U-Net: convolutional networks for biomedical image segmentation//Proceedings of the 18th International Conference on Medical Image Computing and Computer Assisted Intervention. Munich, Germany: Springer: 234-241
Soh L K, Tsatsoulis C. 1999. Texture analysis of SAR sea ice imagery using gray level co-occurrence matrices. IEEE Transactions on Geoscience and Remote Sensing, 37(2): 780-795 [DOI: 10.1109/36.752194]
Song W Y, Li M, Zhang P, Wu Y, Jia L, An L. 2017. Unsupervised PolSAR image classification and segmentation using Dirichlet process mixture model and Markov random fields with similarity measure. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 10(8): 3556-3568 [DOI: 10.1109/JSTARS.2017.2684301]
Wang F, Wu Y, Li M, Zhang P, Zhang Q J. 2017. Adaptive hybrid conditional random field model for SAR image segmentation. IEEE Transactions on Geoscience and Remote Sensing, 55(1): 537-550 [DOI: 10.1109/TGRS.2016.2611060]
Xu K W, Yang X Z, Ai J Q, Zhang A J. 2019. Research on SAR image classification based on point feature similarity and convolutional neural network. Geography and Geo-information Science, 35(3): 28-36 [DOI: 10.3969/j.issn.1672-0504.2019.03.005]
Yu F, Koltun V. 2016. Multi-scale context aggregation by dilated convolutions [EB/OL]. [2021-01-16]. https://arxiv.org/pdf/1511.07122.pdf
Zhai P B, Yang H, Song T T, Yu K, Ma L X, Huang X S. 2020. Two-path semantic segmentation algorithm combining attention mechanism. Journal of Image and Graphics, 25(8): 1627-1636 [DOI: 10.11834/jig.190533]
Zhang N, Li J, Li Y R, Du Y. 2019. Global attention pyramid network for semantic segmentation//Proceedings of 2019 Chinese Control Conference (CCC). Guangzhou, China: IEEE: 8728-8732
Zhang Z M, Wang H P, Xu F, Jin Y Q. 2017. Complex-valued convolutional neural network and its application in polarimetric SAR image classification. IEEE Transactions on Geoscience and Remote Sensing, 55(12): 7177-7188 [DOI: 10.1109/TGRS.2017.2743222]
Zhao H S, Shi J P, Qi X J, Wang X G, Jia J Y. 2017. Pyramid scene parsing network//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Hawaii, USA: IEEE: 6230-6239
Zhou Y, Wang H P, Xu F, Jin Y Q. 2016. Polarimetric SAR image classification using deep convolutional neural networks. IEEE Geoscience and Remote Sensing Letters, 13(12): 1935-1939 [DOI: 10.1109/LGRS.2016.2618840]