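Zhang Lei, Chen Wen, Wang Yuehuan (National Key Laboratory of Science and Technology on Multi-spectral Information Processing, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074)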
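Objective Fine-grained detection and recognition of ship targets in remote sensing images has high practical value in applications such as port and sea-area surveillance and intelligence gathering. However, different types of ships in remote sensing images have similar overall color, shape, and texture features, so these features lack discriminative power and fine-grained ship recognition is difficult. To address this problem, an end-to-end fine-grained ship detection and recognition method based on key sub-region features is proposed. Method To obtain features better suited to fine-grained recognition, a multi-level feature fusion recognition network is proposed that extracts features from the candidate target regions produced by the detection network at two levels: the whole target and local sub-regions. The information from all sub-regions of a candidate target is then combined to compute the discriminant significance of each sub-region, and the key sub-regions that contain discriminative components are mined. Finally, the sub-region features are adaptively fused with the overall features according to the discriminant significance, forming features with stronger representational power for fine-grained ship recognition. The whole detection and recognition network is designed end to end, so feature extraction for all candidate targets requires only a single pass through the backbone network, which improves computational efficiency. Result On the L3 task of the public HRSC2016 (high resolution ship collection) dataset, which has fine-grained category labels, the proposed method achieves an average precision of 77.3%, which is 6.3% higher than without the multi-level feature fusion recognition network; on the self-built FGSAID (fine-grained ships in aerial images dataset) dataset, which contains 45 types of ships, the proposed method achieves an average precision of 71.5%. Conclusion The proposed method effectively mines and fuses the features of sub-regions that contain discriminative components, addresses the difficulty of fine-grained recognition caused by the insufficient discriminative power of overall target features, and is clearly more accurate than existing ship detection and recognition algorithms for remote sensing images.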
Key sub-region feature fusion network for fine-grained ship detection and recognition in remote sensing images
Zhang Lei, Chen Wen, Wang Yuehuan (National Key Laboratory of Science and Technology on Multi-spectral Information Processing, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China)
Objective The ocean has great economic and military value. As human society develops, ocean activities have a growing impact on national development. The sea is an important carrier of marine activities. Thus, the recognition and monitoring of ship targets in key sea areas through remote sensing images are crucial to national defense and economic development. Fine-grained ship detection and recognition in high-resolution remote sensing images refers to identifying the specific type of each ship on top of ship detection. A precise and detailed classification is valuable in practical applications such as sea surveillance and intelligence gathering. Instead of coarse-grained categories, such as warships and merchant ships, specific ship types, such as Arleigh Burke-class destroyer, Nimitz-class aircraft carrier, container ship, and car carrier, are required. However, the overall color, shape, and texture of different types of ship targets are similar. Ships of different types but similar uses have similar structures. Moreover, the coating color of military ships is monotonous. These characteristics complicate the classification of these targets. Existing ship detectors are designed to focus on locating targets, and the classification branch of these detectors is relatively simple. They use only the features of whole targets for classification, which significantly degrades their performance on datasets with fine-grained labels. Existing ship classification methods, which mainly classify targets on pre-cropped image patches, are separated from the detection process. This approach is unsatisfactory for practical applications for two reasons: 1) The whole backbone of these neural-network-based methods must be executed on every proposal to extract features. Remote sensing images of harbors usually contain several ships; thus, the computation cost increases sharply. 2) The detection and classification networks
are optimized separately. Even if the parameters of each network are optimized to their best, the whole process cannot reach the optimal solution because the locations of the proposals obtained by detection methods differ from those of the pre-cropped image patches. We utilize prior knowledge of ships and propose the key sub-region feature fusion network (KSFFN), which fuses the features of discriminative sub-regions with the overall feature and combines detection and fine-grained recognition into one framework. Method KSFFN uses ResNet-50 as the backbone network to extract features and constructs a proposal locating network by combining Faster R-CNN with the region of interest (ROI) Transformer to obtain proposal locations. All proposals are then ranked according to their target probability, and those with low probability are filtered out. Next, the multi-level feature fusion recognition network (MLFFRN) is proposed to extract features from the proposals generated by the proposal locating network and to classify them. First, each proposal is divided into several sub-regions along the axis, and the overall features and sub-region features are extracted from different levels of the feature pyramid. Then, the self-supervision mechanism of the navigator-teacher-scrutinizer network (NTS-Net) is used to find the key sub-regions that may contain important parts contributing to fine-grained recognition. Owing to limited image quality and the characteristics of the targets, not all targets have a highly discriminative sub-region, and the self-supervision mechanism in NTS-Net cannot reflect this. Therefore, the information from all sub-regions in the proposal is used to calculate the discriminant significance of each sub-region, which reflects the influence of that sub-region on target recognition. Based on the discriminant significance, the weight of each sub-region is calculated, and the key sub-region features are fused with the overall features according to these weights. The combined feature
is used to obtain the final classification result, thereby improving the accuracy of fine-grained recognition of ship targets. Result The L3 task of the public high resolution ship collection 2016 (HRSC2016) dataset and the self-built fine-grained ships in aerial images dataset (FGSAID) are used to evaluate the model. The HRSC2016 dataset contains 1 061 images with 2 886 ships divided into 19 types. The FGSAID dataset contains 1 690 images with 5 410 ships divided into 45 types. The average precision (AP) is used as the evaluation metric, and the intersection over union (IoU) threshold is set to 0.5 to determine whether a prediction box matches the ground truth. On the HRSC2016 dataset L3 task, the proposed method achieves an AP of 77.3%, and MLFFRN improves the AP by 6.3%. On the FGSAID dataset, our method achieves an AP of 71.5%. A series of ablation experiments is conducted on the HRSC2016 dataset L3 task to show the effectiveness of the different parts of the proposed method. In addition, the proposed method is compared with state-of-the-art deep-learning-based ship detection frameworks on the two datasets. The experimental results show that our model outperforms all other methods on both datasets. Compared with the single-shot alignment network (S2ANet), the proposed method increases the AP by 7.8% and 8.9% on HRSC2016 and FGSAID, respectively. In particular, the AP of the proposed method increases by 16.7%, 11.1%, and 1.1% for aircraft carriers/amphibious assault ships, other warships, and merchant ships, respectively, in the FGSAID dataset. Conclusion In this study, the end-to-end fine-grained ship detection and recognition network KSFFN is proposed. It extracts the overall features and sub-region features of the proposals and fuses them according to the discriminant significance. The proposed method combines detection and fine-grained recognition into one framework, thereby greatly improving the processing speed while maintaining strong recognition performance. Thus, KSFFN has great application value. The proposed method has a more
powerful classification framework and can achieve more accurate results than existing detection methods. The experimental results show that our method outperforms several state-of-the-art deep-learning-based ship detection frameworks, demonstrating the effectiveness of KSFFN.
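The fusion step described in the Method part can be illustrated with a minimal sketch. It is not the formulation used in KSFFN: the abstract does not give the formula for the discriminant significance or the weights, so the sketch assumes that each sub-region already has a scalar significance score, that the weights are a softmax over the scores of all sub-regions in the proposal, and that the fused feature is the overall feature plus the weighted sum of sub-region features. The names `softmax`, `fuse_features`, and all toy values are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_features(
    overall: Sequence[float],
    sub_regions: Sequence[Sequence[float]],
    significance: Sequence[float],
) -> List[float]:
    """Add the significance-weighted sum of sub-region features to the overall feature.

    Assumption (not from the paper): the weights are a softmax over the
    per-sub-region scores, so each weight depends on the scores of all
    sub-regions in the same proposal.
    """
    if len(sub_regions) != len(significance):
        raise ValueError("one significance score per sub-region is required")
    weights = softmax(significance)
    fused = list(overall)
    for weight, feature in zip(weights, sub_regions):
        if len(feature) != len(fused):
            raise ValueError("sub-region and overall features must share a dimension")
        for i, value in enumerate(feature):
            fused[i] += weight * value
    return fused


# Toy example: a 2-D overall feature and three sub-regions of one proposal.
overall_feature = [1.0, 0.0]
sub_region_features = [[0.0, 1.0], [0.0, 2.0], [0.0, 3.0]]

# Equal scores: no sub-region stands out, so each one gets weight 1/3.
uniform = fuse_features(overall_feature, sub_region_features, [0.0, 0.0, 0.0])

# One dominant score: the fused feature is pulled toward the third sub-region.
peaked = fuse_features(overall_feature, sub_region_features, [0.0, 0.0, 10.0])
print(uniform, peaked)
```

With equal scores the sub-regions contribute their plain average, while a single dominant score lets one key sub-region drive the fused feature. In a real network the scores and features would be tensors produced by learned layers rather than hand-written lists.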