Aircraft recognition in remote sensing images based on samples generated by CGAN
Vol. 26, Issue 3, Pages 663-673 (2021)
Received: 8 January 2020
Revised: 9 June 2020
Accepted: 15 June 2020
Published: 16 March 2021
DOI: 10.11834/jig.200001
Objective
Deep-learning-based aircraft recognition methods have made great progress in remote sensing image interpretation, but their generalization ability depends on large-scale datasets. The conditional generative adversarial network (CGAN) can produce realistic synthetic samples to enlarge real datasets, but its capacity to model complex remote sensing scenes is limited and the generated samples are of low quality. To address these problems, an aircraft recognition framework incorporating CGAN-based sample generation is proposed.
Method
The CGAN is improved in two ways. A perceptual loss is used to strengthen the generator's ability to model remote sensing images, and a masked structural similarity (SSIM) loss (masked-SSIM loss) is proposed to improve the image quality of the aircraft regions in the generated samples; combined with the aircraft masks, this loss acts only on the aircraft regions without affecting the background. A recognition model based on a residual network is combined with the improved generative model to form the aircraft recognition framework. During training, generated samples replace real satellite images, reducing the demand for real satellite data.
Result
Recognition models trained on generated samples and on real samples are both evaluated on real samples; the accuracy of the former is 0.33% lower than that of the latter. For the generative model, adding the perceptual loss raises the peak signal-to-noise ratio (PSNR) of the generated samples by 0.79 dB and the SSIM by 0.094; further adding the masked SSIM loss raises the PSNR by 0.09 dB and the SSIM by 0.252.
Conclusion
The proposed sample-generation-based aircraft recognition framework produces samples of higher quality, which can replace real samples in training the recognition model, effectively alleviating the sample shortage in aircraft recognition tasks.
Objective
Aircraft type recognition is a fundamental problem in remote sensing image interpretation, which aims to identify the type of an aircraft in an image. Aircraft type recognition algorithms have been widely studied and continually improved. Traditional recognition algorithms are efficient, but their accuracy is limited by small model capacity and poor robustness. Deep-learning-based methods have been widely adopted because of their robustness and generalization ability, especially in object recognition tasks. In remote sensing scenes, objects are sparsely distributed; hence, the available samples are few. In addition, labeling is time consuming, which leaves only a modest number of labeled samples. Deep-learning-based models generally rely on a large amount of labeled data because of the enormous number of weights to be learned. Consequently, these models suffer from data scarcity that falls short of the demand for large-scale datasets, especially in remote sensing scenes. Generative adversarial networks (GANs) can produce realistic synthetic data and enlarge real datasets. However, these models usually take random noise as input; therefore, they cannot control the position, angle, size, or category of the objects in the synthetic images. The conditional GAN (CGAN) was proposed to generate synthetic images with designated content in a controlled manner: it takes pixel-wise labeled images as input and outputs generated images that satisfy the constraints imposed by the corresponding inputs. However, these generative adversarial models have mostly been studied on natural scenes and are ill-suited to remote sensing imagery, whose scenes are complex and whose resolution is low. Hence, GANs perform poorly when adopted to generate remote sensing images. To alleviate the lack of real samples and deal with the problems above, an aircraft recognition framework for remote sensing images based on sample generation is proposed, which consists of an improved CGAN and a recognition model.
Method
In this framework, the masks of the real aircraft images are labeled pixel by pixel and serve as the conditions of the CGAN, which is trained on pairs of real aircraft images and their corresponding masks. In this manner, the location, scale, and type of the aircraft in the synthetic images can be controlled. A perceptual loss is introduced to promote the ability of the CGAN to model remote sensing scenes: it is measured as the L2 distance between the features of the real and synthetic images extracted by the VGG-16 (Visual Geometry Group 16-layer) network, as sketched below.
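A minimal PyTorch sketch of such a perceptual loss follows. The abstract does not state which VGG-16 layer supplies the features; truncating at relu2_2 and freezing the weights are our assumptions.

import torch.nn as nn
from torchvision import models

class PerceptualLoss(nn.Module):
    """L2 distance between VGG-16 features of real and synthetic images."""
    def __init__(self, num_layers=9):
        super().__init__()
        # layers up to relu2_2 (index 8); the exact layer is an assumption
        vgg = models.vgg16(pretrained=True).features[:num_layers]
        for p in vgg.parameters():
            p.requires_grad = False  # VGG-16 acts as a frozen feature extractor
        self.vgg = vgg.eval()
        self.mse = nn.MSELoss()

    def forward(self, fake, real):
        # fake, real: (N, 3, H, W) tensors in the VGG input range
        return self.mse(self.vgg(fake), self.vgg(real))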
A masked structural similarity (SSIM) loss is proposed, which forces the CGAN to focus on the masked region and improves the quality of the aircraft region in the synthetic images. SSIM measures image quality in terms of structure and texture; the masked SSIM loss is the pixel-wise product of the mask and the SSIM map, summed over the image (see the sketch below). The full objective of the CGAN is then the combination of the perceptual loss, the masked SSIM loss, and the original CGAN adversarial loss.
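The masked SSIM term might look as follows. This sketch replaces the usual Gaussian window of Wang et al. (2004) with a uniform one for brevity, and the weights that combine this term with the perceptual and adversarial losses are not given in the abstract.

import torch
import torch.nn.functional as F

def masked_ssim_loss(fake, real, mask, window=11, c1=0.01 ** 2, c2=0.03 ** 2):
    """1 - mean SSIM over the aircraft (mask == 1) region only.

    fake, real: (N, C, H, W) images scaled to [0, 1]; mask: (N, 1, H, W) in {0, 1}.
    """
    pad = window // 2
    # uniform averaging window, a simplification of the usual Gaussian window
    w = torch.ones(1, fake.size(1), window, window,
                   device=fake.device) / (window * window * fake.size(1))
    mu_x = F.conv2d(fake, w, padding=pad)
    mu_y = F.conv2d(real, w, padding=pad)
    var_x = F.conv2d(fake * fake, w, padding=pad) - mu_x ** 2
    var_y = F.conv2d(real * real, w, padding=pad) - mu_y ** 2
    cov = F.conv2d(fake * real, w, padding=pad) - mu_x * mu_y
    ssim_map = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
    # restrict the similarity measure to the masked aircraft region
    return 1.0 - (ssim_map * mask).sum() / mask.sum().clamp(min=1.0)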
which outputs the type and recognition score of an aircraft. In this paper
the recognition model trained on synthetic images is compared with the model trained on real images. The remote sensing images from QuickBird are cropped to build the real dataset
in which 800 images for each type are used for training and 1 000 images are used for testing. After data augmentation
the training dataset consists of 40 000 images
and the synthetic dataset consists of synthetic images generated by the generation module with flipped
rotated
and scaled masks. The generators are selected from different training stages to generate 2 000 synthetic images per type and determine the best end time in the training procedure. The chosen generator is used to produce different numbers of images for 10 aircraft types and find an acceptable number of synthetic images. These synthetic images serve as the training set for the recognition model
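The flip/rotate/scale transforms applied to the input masks before they are fed to the trained generator might be expressed as below; the concrete parameter ranges are illustrative, not taken from the paper.

from torchvision import transforms

mask_augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),         # flipped masks
    transforms.RandomRotation(degrees=180),         # rotated masks
    transforms.RandomAffine(degrees=0, scale=(0.8, 1.2)),  # scaled masks
])

# synthetic_image = generator(to_tensor(mask_augment(mask)))  # one sample per transformed mask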
All experiments are carried out on a single NVIDIA K80 GPU using the PyTorch framework, and the Adam optimizer is used to train the CGAN and ResNet-50 for 100 epochs, along the lines of the sketch below.
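A minimal sketch of the recognition-model setup named above (ResNet-50 with a 10-way head, Adam, 100 epochs); the learning rate and the stand-in data loader are placeholders, not values from the paper.

import torch
import torch.nn as nn
from torchvision import models

# ResNet-50 with a 10-way head for the 10 aircraft types
model = models.resnet50(pretrained=False)
model.fc = nn.Linear(model.fc.in_features, 10)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # learning rate assumed
criterion = nn.CrossEntropyLoss()

# stand-in loader; in practice this iterates (image, label) pairs of synthetic samples
train_loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(torch.randn(8, 3, 224, 224),
                                   torch.randint(0, 10, (8,))),
    batch_size=4)

for epoch in range(100):                  # 100 epochs, as in the paper
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()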
Result
The quality of the synthetic images produced by generators trained with and without the proposed loss functions is compared on the training dataset. The quantitative evaluation metrics are the peak signal-to-noise ratio (PSNR) and SSIM; a sketch of how a real/synthetic pair is scored follows below. Results show that PSNR and SSIM increase by 0.88 dB and 0.346, respectively, with our method. In addition, the recognition accuracy increases with the number of training epochs of the generator and with the number of synthetic images. Finally, the accuracy of the recognition model trained on the synthetic dataset is only 0.33% lower than that of the model trained on the real dataset.
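The two metrics can be computed with scikit-image's reference implementations, as in this sketch; the pairing of real and synthetic images of the same scene is assumed.

from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def score_pair(real, fake):
    """real, fake: uint8 H x W x 3 arrays of the same scene."""
    psnr = peak_signal_noise_ratio(real, fake, data_range=255)
    ssim = structural_similarity(real, fake, channel_axis=-1, data_range=255)
    return psnr, ssim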
Conclusion
An aircraft recognition framework for remote sensing images based on sample generation is proposed. The experimental results show that our method effectively improves the ability of the CGAN to model remote sensing scenes and alleviates the shortage of data.
Fu K, Dai W, Zhang Y, Wang Z R, Yan M L and Sun X. 2019. MultiCAM: multiple class activation mapping for aircraft recognition in remote sensing images. Remote Sensing, 11(5): #544[DOI:10.3390/rs11050544]
Goodfellow I J, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A and Bengio Y. 2014. Generative adversarial nets//Proceedings of the 27th International Conference on Neural Information Processing Systems. Cambridge, USA: MIT Press: 2672-2680
He K M, Zhang X Y, Ren S Q and Sun J. 2016. Deep residual learning for image recognition//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 770-778[DOI:10.1109/CVPR.2016.90]
Isola P, Zhu J Y, Zhou T J and Efros A A. 2017. Image-to-image translation with conditional adversarial networks//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 5967-5976[DOI:10.1109/CVPR.2017.632]
Johnson J, Alahi A and Li F F. 2016. Perceptual losses for real-time style transfer and super-resolution//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, The Netherlands: Springer: 694-711[DOI:10.1007/978-3-319-46475-6_43]
Kingma D P and Ba J. 2017. Adam: a method for stochastic optimization[EB/OL]. [2019-12-08]. https://arxiv.org/pdf/1412.6980.pdf
Mirza M and Osindero S. 2014. Conditional generative adversarial nets[EB/OL]. [2019-12-08]. https://arxiv.org/pdf/1411.1784.pdf
Odena A, Dumoulin V and Olah C. 2016. Deconvolution and checkerboard artifacts[EB/OL]. [2019-12-08]. http://distill.pub/2016/deconv-checkerboard
Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z M, Gimelshein N, Antiga L, Desmaison A, Köpf A, Yang E, DeVito Z, Raison M, Tejani A, Chilamkurthy S, Steiner B, Fang L, Bai J J and Chintala S. 2019. PyTorch: an imperative style, high-performance deep learning library//Proceedings of the 33rd Conference on Neural Information Processing Systems. Vancouver, Canada: Curran Associates: 8026-8037
Roopa K, Rama Murthy T V and Cyril Prasanna Raj P. 2017. Neural network classifier for fighter aircraft model recognition. Journal of Intelligent Systems, 27(3): 447-463[DOI:10.1515/jisys-2016-0087]
Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S A, Huang Z H, Karpathy A, Khosla A, Bernstein M, Berg A C and Li F F. 2015. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3): 211-252[DOI:10.1007/s11263-015-0816-y]
Simonyan K and Zisserman A. 2015. Very deep convolutional networks for large-scale image recognition[EB/OL]. [2019-12-08]. https://arxiv.org/pdf/1409.1556.pdf
Song J, Gao S H, Zhu Y Q and Ma C Y. 2019. A survey of remote sensing image classification based on CNNs. Big Earth Data, 3(3): 232-254[DOI:10.1080/20964471.2019.1657720]
Wang H Z, Gong Y C, Wang Y, Wang L F and Pan C H. 2017. DeepPlane: a unified deep model for aircraft detection and recognition in remote sensing images. Journal of Applied Remote Sensing, 11(4): #042606[DOI:10.1117/1.JRS.11.042606]
Wang T C, Liu M Y, Zhu J Y, Tao A, Kautz J and Catanzaro B. 2018. High-resolution image synthesis and semantic manipulation with conditional GANs//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 8798-8807[DOI:10.1109/CVPR.2018.00917]
Wang Z, Bovik A C, Sheikh H R and Simoncelli E P. 2004. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4): 600-612[DOI:10.1109/TIP.2003.819861]
Zhang Y H, Sun H, Zuo J W, Wang H Q, Xu G L and Sun X. 2018. Aircraft type recognition in remote sensing images based on feature learning with conditional generative adversarial networks. Remote Sensing, 10(7): #1123[DOI:10.3390/rs10071123]
Zhao A, Fu K, Wang S Y, Zuo J W, Zhang Y H, Hu Y F and Wang H Q. 2017. Aircraft recognition based on landmark detection in remote sensing images. IEEE Geoscience and Remote Sensing Letters, 14(8): 1413-1417[DOI:10.1109/LGRS.2017.2715858]
Zuo J W, Xu G L, Fu K, Sun X and Sun H. 2018. Aircraft type recognition based on segmentation with deep convolutional neural networks. IEEE Geoscience and Remote Sensing Letters, 15(2): 282-286[DOI:10.1109/LGRS.2017.2786232]