Consistent registration of remote sensing images in parametric synthesized spatial transformation network

Chen Ying; Zhang Qi; Li Wenju; Shi Yanjiao; Chen Lei

Published: 2021-12-18
DOI: 10.11834/jig.200587
2021 | Volume 26 | Number 12

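Chen Ying, Zhang Qi, Li Wenju, Shi Yanjiao, Chen Lei (College of Computer Science and Information Engineering, Shanghai Institute of Technology, Shanghai 201418, China)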

Abstract (Chinese edition)

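Objective Remote sensing image registration is the process of matching and overlaying multiple sets of images. It plays an important role in ground-object detection, aerial image classification, and satellite image fusion. Existing approaches fall into traditional methods and deep-learning-based methods. Traditional remote sensing registration algorithms require substantial manual effort and run slowly. Deep-learning-based algorithms reduce the manual cost and improve the model's capacity for adaptive learning, but their registration accuracy and running time still leave room for improvement. To address these problems, this paper proposes a parametric synthesized spatial transformation network for bidirectional consistent registration of remote sensing images. Methods The spatial transformation model is improved by increasing the depth of the spatial transformation network and synthesizing the parameters inside the network. The improved model serves as the backbone of the feature extraction part, which effectively improves the robustness of the network. In addition, the one-way registration scheme is replaced with a two-way scheme that performs feature matching and feature regression in both directions to guarantee the consistency of the registration direction. The bidirectional parameters obtained by regression are then combined by weighted synthesis to improve the reliability and accuracy of the model. Results The experimental results are compared with two classical traditional methods, SIFT (scale-invariant feature transform) and SURF (speeded up robust features), and with three recent methods proposed in the last three years: CNNGeo (convolutional neural network architecture for geometric matching), CNN-Registration (multi-temporal remote sensing image registration), and RMNet (robust matching network). The registration results show that the proposed method performs well both in qualitative visual quality and on quantitative evaluation metrics. On the Aerial Image Dataset, using the percentage of correct keypoints as the metric, accuracy improves over these five methods by 36.2%, 75.9%, 53.6%, 29.9%, and 1.7%, respectively, and registration time decreases by 9.24 s, 7.16 s, 48.29 s, 1.06 s, and 4.06 s, respectively. Conclusion The proposed registration method is suitable for three types of remote sensing image registration: images that differ in acquisition time (multi-temporal), in viewpoint (multi-view), and in sensor (multi-modal). In all three settings, the algorithm achieves high registration accuracy and efficiency.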
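The accuracy figures above are stated in terms of the percentage of correct keypoints. As a rough illustration of how such a score is commonly computed, the sketch below counts a warped keypoint as correct when it lands within a tolerance of its ground-truth position. The function name `pck`, the tolerance factor `alpha`, and the choice of `ref_length` as the reference length are assumptions made for this example; the abstract does not state the threshold the authors used.

```python
import math


def pck(pred_pts, true_pts, ref_length, alpha=0.05):
    """Fraction of predicted keypoints within alpha * ref_length of the truth.

    pred_pts, true_pts: equal-length lists of (x, y) pixel coordinates.
    ref_length and alpha are illustrative assumptions, not values from the paper.
    """
    if len(pred_pts) != len(true_pts) or not true_pts:
        raise ValueError("need two non-empty keypoint lists of equal length")
    threshold = alpha * ref_length
    correct = sum(
        1
        for (px, py), (tx, ty) in zip(pred_pts, true_pts)
        if math.hypot(px - tx, py - ty) <= threshold
    )
    return correct / len(true_pts)


# Toy example: four keypoints in a 240-pixel-wide image, tolerance 12 pixels.
true_pts = [(10.0, 10.0), (50.0, 80.0), (120.0, 40.0), (200.0, 220.0)]
pred_pts = [(11.0, 10.0), (50.0, 95.0), (121.0, 42.0), (230.0, 220.0)]
score = pck(pred_pts, true_pts, ref_length=240, alpha=0.05)
```

In this toy case two of the four points fall within the 12-pixel tolerance, so the score is 0.5. A larger `alpha` loosens the criterion and raises the score.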

Keywords

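image processing; remote sensing image registration; spatial transformation network (STN); parameter synthesis; bidirectional consistency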

Consistent registration of remote sensing images in parametric synthesized spatial transformation network

Chen Ying, Zhang Qi, Li Wenju, Shi Yanjiao, Chen Lei(College of Computer Science and Information Engineering, Shanghai Institute of Technology, Shanghai 201418, China)

Abstract

Objective Remote sensing image registration is the process of matching and superimposing multiple sets of images. It plays an important role in fields such as the study of climate change, urban change, and crustal movement. Current remote sensing registration methods fall into two broad categories: traditional methods and deep-learning-based methods. Traditional algorithms are labor-intensive, lack adaptive learning, and are time-consuming to run. Deep-learning-based algorithms reduce the labor cost and improve the model's capacity for adaptive learning, but their accuracy and running time still need improvement. This paper proposes a parametric synthesized spatial transformation network for bidirectional consistent registration of remote sensing images. Methods An end-to-end registration method is proposed that consists of feature extraction, feature matching, and parameter regression. First, the feature extraction network is designed on the basis of the spatial transformation network (STN) model: the localization network of the STN is deepened using skip connections. Four groups of fully convolutional modules are added, each composed of four fully convolutional layers. Within each module, every two layers are joined by the same skip-connection structure. To preserve the integrity of the transmitted data, the beginning and the end of each module are also joined by a skip connection. The two sets of parameters regressed by the localization network are then synthesized. After passing through the grid generator and the sampler, the input images are transformed by affine transformation into two saliency images that cover the same region.
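The grid generator and sampler step mentioned above can be illustrated with a small pure-Python sketch of a standard spatial transformer warp: a 2x3 affine matrix maps every output pixel to a source location in normalized coordinates, and bilinear interpolation reads the value there. The function names `affine_grid` and `bilinear_sample`, the [-1, 1] coordinate convention, and zero padding outside the image are assumptions for this example and are not taken from the paper.

```python
import math


def affine_grid(theta, height, width):
    """Grid generator: source location (x, y) for every output pixel.

    theta is a 2x3 affine matrix acting on normalized coordinates in [-1, 1].
    Requires height >= 2 and width >= 2.
    """
    grid = []
    for i in range(height):
        y = -1.0 + 2.0 * i / (height - 1)
        row = []
        for j in range(width):
            x = -1.0 + 2.0 * j / (width - 1)
            xs = theta[0][0] * x + theta[0][1] * y + theta[0][2]
            ys = theta[1][0] * x + theta[1][1] * y + theta[1][2]
            row.append((xs, ys))
        grid.append(row)
    return grid


def bilinear_sample(image, grid):
    """Sampler: bilinear interpolation of image at the grid locations."""
    height, width = len(image), len(image[0])

    def pixel(r, c):
        # Zero padding outside the image (an assumption of this sketch).
        return image[r][c] if 0 <= r < height and 0 <= c < width else 0.0

    output = []
    for row in grid:
        out_row = []
        for xs, ys in row:
            # Convert normalized coordinates back to pixel indices.
            col = (xs + 1.0) * (width - 1) / 2.0
            r = (ys + 1.0) * (height - 1) / 2.0
            c0, r0 = math.floor(col), math.floor(r)
            wc, wr = col - c0, r - r0
            out_row.append(
                (1 - wr) * (1 - wc) * pixel(r0, c0)
                + (1 - wr) * wc * pixel(r0, c0 + 1)
                + wr * (1 - wc) * pixel(r0 + 1, c0)
                + wr * wc * pixel(r0 + 1, c0 + 1)
            )
        output.append(out_row)
    return output


image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
identity = bilinear_sample(image, affine_grid([[1, 0, 0], [0, 1, 0]], 3, 3))
# A translation of 1.0 in normalized x equals one pixel for a 3-pixel-wide image.
shifted = bilinear_sample(image, affine_grid([[1, 0, 1.0], [0, 1, 0]], 3, 3))
```

With the identity matrix the output reproduces the input. With the translation, each output pixel reads the source pixel one column to its right, and the last column falls outside the image and is filled with zeros.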
A fine-tuned residual structure is then used for feature extraction to obtain the feature maps. Next, a feature matching structure is designed to perform bidirectional consistent matching. A matching branch is added so that both the correlation from the source image to the target image and the correlation from the target image to the source image are computed with the Pearson correlation coefficient. The parameter regression network regresses the matching relationships in the two directions and outputs two sets of regression parameters, which keeps the registration consistent. Finally, the grid loss function is iterated under the consistency constraint, and the optimized bidirectional parameters are obtained by weighted synthesis of the regressed parameters. The final registration is completed after sampling. Result The proposed method is compared with two classical methods, scale-invariant feature transform (SIFT) and speeded up robust features (SURF), and with three recent methods from the last three years: convolutional neural network architecture for geometric matching (CNNGeo), multi-temporal remote sensing image registration (CNN-Registration), and robust matching network (RMNet). The registration results show that the proposed method performs well both in qualitative visual quality and on quantitative evaluation metrics. On the Aerial Image Dataset, measured by the percentage of correct keypoints (PCK), accuracy improves over the five methods by 36.2%, 75.9%, 53.6%, 29.9%, and 1.7%, respectively, and registration time falls by 9.24 s, 7.16 s, 48.29 s, 1.06 s, and 4.06 s, respectively. Because PCK alone does not clearly separate CNNGeo, RMNet, and the proposed method, the grid loss and the average grid loss are used for further comparison.
Compared with CNNGeo and RMNet, the proposed method improves the grid loss by 3.48% and 2.66% and the average grid loss by 2.67% and 0.2%, respectively. In the line charts of grid loss and average grid loss, the curves of the proposed method and RMNet fall fastest, and a histogram comparison between the two shows that the proposed method is more accurate. Replacing the feature extraction network of CNNGeo with the improved feature extraction network raises PCK by 4.6% over the original baseline network (CNNGeo). Replacing the matching relationship of CNNGeo with the improved matching relationship raises PCK by 3.9% over the baseline. Further adding weighted synthesis of the bidirectional parameters raises PCK by 14.1% over the baseline. The experimental results show that the proposed method has advantages in both accuracy and running efficiency. Conclusion The registration method is applicable to three types of remote sensing image registration: images that differ in acquisition time (multi-temporal), in viewpoint (multi-viewpoint), and in sensor (multi-modal). In all three settings, the proposed algorithm achieves high registration accuracy and efficiency.
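The bidirectional matching and weighted synthesis described in the Methods part can be pictured with the following minimal sketch. It computes a Pearson correlation map between two sets of feature descriptors in both directions, then combines a source-to-target affine estimate with the inverse of a target-to-source estimate. Inverting the reverse transform before averaging, the equal default weight, and all function names here are assumptions of this example; the abstract does not specify how the authors weight or combine the two parameter sets.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    du = [x - mean_u for x in u]
    dv = [x - mean_v for x in v]
    num = sum(a * b for a, b in zip(du, dv))
    den = math.sqrt(sum(a * a for a in du) * sum(b * b for b in dv))
    return num / den if den > 0 else 0.0


def correlation_map(feats_from, feats_to):
    """Correlation of every descriptor in feats_from with every one in feats_to."""
    return [[pearson(f, g) for g in feats_to] for f in feats_from]


def invert_affine(theta):
    """Invert a 2x3 affine matrix [[a, b, tx], [c, d, ty]]."""
    (a, b, tx), (c, d, ty) = theta
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("affine matrix is not invertible")
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    return [[ia, ib, -(ia * tx + ib * ty)], [ic, id_, -(ic * tx + id_ * ty)]]


def synthesize(theta_st, theta_ts, weight=0.5):
    """Weighted blend of the forward estimate and the inverted backward one.

    The blending rule is an assumption of this sketch, not the paper's formula.
    """
    theta_ts_inv = invert_affine(theta_ts)
    return [
        [weight * p + (1.0 - weight) * q for p, q in zip(row_f, row_b)]
        for row_f, row_b in zip(theta_st, theta_ts_inv)
    ]


# Toy descriptors: two source locations and two target locations.
source_feats = [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]]
target_feats = [[2.0, 4.0, 6.0, 8.0], [1.0, 0.0, 1.0, 0.0]]
corr_st = correlation_map(source_feats, target_feats)  # source -> target
corr_ts = correlation_map(target_feats, source_feats)  # target -> source

# Toy affine estimates that happen to be exact inverses of each other.
theta_st = [[1.0, 0.0, 0.2], [0.0, 1.0, -0.1]]
theta_ts = [[1.0, 0.0, -0.2], [0.0, 1.0, 0.1]]
theta_final = synthesize(theta_st, theta_ts, weight=0.5)
```

When the two directional estimates are perfectly consistent, as in the toy values, the blend returns the forward estimate unchanged. When they disagree, the weight controls how much each direction contributes.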

Keywords

image processing; remote sensing image registration; spatial transformation network (STN); parameter synthesis; bidirectional consistency
