Water body detection in multi-source remote sensing imagery using a weakly supervised deep semantic segmentation network

Li Xinwei, Li Yansheng, Zhang Yongjun (School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079)

Abstract
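Objective The strong performance of deep semantic segmentation networks depends heavily on large-scale, high-quality pixel-level labels. In practice, collecting such pixel-level water labels at scale consumes a great deal of labor and resources. To reduce the annotation workload, this paper proposes creating water labels for remote sensing images from existing public water cover products; however, these products have low spatial resolution and contain a certain amount of error. To address this, a weakly supervised deep learning method is proposed for training deep semantic segmentation networks. Method In the training stage, the original dataset is divided into several non-overlapping sub-datasets, and a deep semantic segmentation network is trained on each. The trained networks then collaboratively update the labels, and the updated labels are used to repeat the process and retrain the networks; after several iterations, well-performing deep semantic segmentation networks are obtained. In the test stage, the multi-source remote sensing images are predicted separately by several deep semantic segmentation networks representing different perspectives, and the final water detection result is produced by voting. Result To verify the effectiveness of the proposed method, a multi-source remote sensing image dataset for water body detection was created from the original multi-source remote sensing image data. Compared with the traditional water index threshold segmentation method and with a deep semantic segmentation network trained directly on the low-quality water labels, the intersection-over-union (IoU) improved by 5.5% and 7.2%, respectively. Conclusion The experimental results show that the proposed method converges and that fusing optical and synthetic aperture radar (SAR) imagery helps improve water body detection performance. When trained with low-resolution, noisy water labels, the resulting multi-perspective model detects water bodies with markedly higher accuracy than the traditional water index threshold segmentation method and the deep semantic segmentation network trained directly on the low-quality water labels.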
Keywords
Weakly supervised deep semantic segmentation network for water body extraction based on multi-source remote sensing imagery

Li Xinwei, Li Yansheng, Zhang Yongjun (School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China)

Abstract
Objective Water body detection from remote sensing imagery has important applications in flood disaster assessment, water resource valuation, and ecological environment protection. Deep semantic segmentation networks have achieved great success in pixel-level remote sensing image classification, so strong water body detection performance can reasonably be expected from them. However, their performance depends heavily on large-scale, high-quality pixel-level labels. To reduce the labeling workload while maintaining reasonable detection accuracy, this paper uses existing open water cover products to create water labels for remote sensing images. These products have low spatial resolution and contain a certain degree of error, and the resulting noisy, low-resolution water labels inevitably affect the training of deep semantic segmentation networks for water body detection. To resolve this difficulty, a weakly supervised deep learning method is adopted to train the deep semantic segmentation network. The proposed optimization method trains the network on the noisy, low-resolution labels and aims for high water detection accuracy at minimal manual annotation cost. Method In the training stage, the original dataset is divided into several non-overlapping sub-datasets, and a deep semantic segmentation network is trained on each sub-dataset. The networks trained on the different sub-datasets then update the labels jointly. Because the non-overlapping sub-datasets generally have different data distributions, the detection performance of the networks trained on them is complementary.
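The division into non-overlapping sub-datasets can be illustrated with a minimal sketch. The abstract does not say how the split is made, so the random shuffle, the function name `split_non_overlapping`, and the choice of three subsets are all assumptions made for illustration.

```python
import random

def split_non_overlapping(sample_ids, num_subsets, seed=0):
    """Deal sample ids into disjoint subsets (hypothetical helper).

    Assumption: a seeded random shuffle; the paper may partition differently.
    """
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    # Striding by num_subsets puts every id into exactly one subset.
    return [ids[k::num_subsets] for k in range(num_subsets)]

# Ten toy sample ids split into three disjoint sub-datasets.
subsets = split_non_overlapping(range(10), num_subsets=3)
```

Each subset would then be used to train its own network, so that no training sample is shared between networks.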
Different networks predict the same region differently, so the multi-perspective deep semantic segmentation networks can update the labels collaboratively. The updated labels are then used to repeat the process and retrain new deep semantic segmentation networks. After each iteration, the network output serves as the new labels. Noisy labels are progressively removed over the iterations, and the extent of correctly labeled water expands. Several well-performing deep semantic segmentation networks are obtained after a few iterations. In the test stage, the multi-source remote sensing images are predicted by several deep semantic segmentation networks representing different perspectives, and the final water detection result is produced by voting. Result Training, validation, and testing datasets of multi-source remote sensing images were built for verification. The multi-source imagery is composed of Sentinel-1 SAR (synthetic aperture radar) images and Sentinel-2 optical images. The training dataset contains 150 000 multi-source remote sensing samples of 256×256 pixels. Its labels were cropped from the public MODIS (moderate-resolution imaging spectroradiometer) water coverage products by geographic extent; these labels have low spatial resolution and contain substantial noise. The validation dataset contains 100 samples and the testing dataset contains 400 samples, all of 256×256 pixels, and their labels were accurately annotated with the aid of domain experts. The training, validation, and testing datasets do not overlap one another, and together they are distributed across the globe.
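The test-stage voting over several networks' predictions can be sketched as a per-pixel majority vote. This is an illustrative assumption: the abstract states only that the results are voted on, not the exact rule, and it does not say whether the same rule drives the collaborative label update. The function name `majority_vote` is hypothetical, and plain nested lists stand in for image arrays.

```python
def majority_vote(masks):
    """Fuse binary masks (1 = water) pixel by pixel.

    Assumption: a pixel is water when strictly more than half of the
    masks mark it as water; ties resolve to non-water.
    """
    fused = []
    for rows in zip(*masks):  # the same row taken from every mask
        fused.append([int(2 * sum(px) > len(masks)) for px in zip(*rows)])
    return fused

# Three toy 2x2 predictions from three hypothetical networks.
pred_a = [[1, 1], [0, 0]]
pred_b = [[1, 0], [0, 1]]
pred_c = [[1, 1], [0, 0]]
fused = majority_vote([pred_a, pred_b, pred_c])  # [[1, 1], [0, 0]]
```

In the toy example, the isolated disagreements of `pred_b` are outvoted by the other two masks.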
Experimental results show that the proposed method converges, with accuracy stabilizing after four iterations. Fusing optical and SAR images improves the accuracy of water body detection. The IoU (intersection over union) increases by 5.5% compared with the traditional water index segmentation method and by 7.2% compared with the deep semantic segmentation network trained directly on the noisy, low-resolution water labels. Conclusion The experimental results show that the proposed method converges quickly and that fusing optical and SAR images improves the detection results. When only noisy, low-resolution water labels are available for training, the water body detection accuracy of the trained multi-perspective model is clearly better than that of the traditional water index segmentation method and of the deep semantic segmentation network trained directly on the noisy, low-resolution water labels. The traditional deep semantic segmentation method is slightly less accurate than the traditional water index method, which indicates that the effectiveness of deep learning depends heavily on the quality of the training labels; the noisy, low-resolution water labels reduce the effectiveness of deep learning. The performance of the proposed method on small rivers and lakes was also analyzed. Accuracy on small rivers and lakes decreases slightly but remains higher than that of the traditional water index method and the deep learning method trained directly on the noisy, low-resolution water labels.
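For reference, the IoU reported above is the number of pixels marked as water in both the prediction and the ground truth, divided by the number marked as water in either. The sketch below is a minimal illustration, not the authors' evaluation code; the function name `water_iou` and the convention of returning 1.0 when neither mask contains water are assumptions.

```python
def water_iou(pred, truth):
    """Intersection over union of the water class for two binary masks."""
    inter = union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks agree perfectly.
    return inter / union if union else 1.0

pred = [[1, 1], [0, 0]]
truth = [[1, 0], [1, 0]]
score = water_iou(pred, truth)  # 1 shared water pixel out of 3 in the union
```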
Keywords
