Multiscale deep features fusion for change detection
2020, Vol. 25, No. 4: 669-678
Received: 2019-06-27; Revised: 2019-10-13; Accepted: 2019-10-20; Published in print: 2020-04-16
DOI: 10.11834/jig.190312
Objective
Image change detection is an important problem in computer vision. Traditional change detection methods are overly sensitive to illumination variations and camera pose differences, which leads to poor detection results in real scenes. Since convolutional neural networks (CNN) can extract deep semantic features from images, we propose a change detection model based on multiscale deep feature fusion, which overcomes detection noise by extracting and fusing high-level semantic features of the images.
Method
VGG (visual geometry group) 16 is used as the basic model of the network, with a siamese structure that extracts deep features of different network layers from the reference image and the query image separately. The deep features of corresponding layers of the two images are concatenated and fed into an encoding layer; through the encoding layers, high-level and low-level network features are progressively fused at multiple scales, fully combining high-level semantics with low-level textures to detect accurate change regions. A convolutional layer operates on the features of each encoding layer to produce a prediction at the corresponding scale, and the predictions of different scales are fused to obtain a further refined detection result.
Result
We compare with four detection methods: SC_SOBS (SC-self-organizing background subtraction), SuBSENSE (self-balanced sensitivity segmenter), FGCD (fine-grained change detection), and the fully convolutional network (FCN). Compared with FCN, the second-best model, our method improves the comprehensive metric F1 and the precision by 12.2% and 24.4%, respectively, on the VL_CMU_CD (visual localization of Carnegie Mellon University for change detection) dataset; by 2.1% and 17.7% on the PCD (panoramic change detection) dataset; and by 8.5% and 5.8% on the CDnet (change detection net) dataset.
Conclusion
The proposed change detection method based on multiscale deep feature fusion exploits the features of different layers of a convolutional neural network, effectively overcomes illumination and camera pose differences, and obtains robust change detection results on all the datasets.
Objective
Change detection aims to detect the differences between images of the same scene captured at different observation times and is an important research problem in computer vision. However, traditional change detection methods, which use handcrafted features and heuristic models, suffer from lighting variations and camera pose differences, resulting in poor detection results. Recent deep learning-based convolutional neural networks (CNN) have achieved huge success on several computer vision problems, such as image classification, semantic segmentation, and saliency detection; the main reason for this success is the abstraction ability of CNN. To overcome the adverse effects of lighting variations and camera pose differences, we can employ deep learning-based CNNs for change detection. Unlike semantic segmentation, change detection takes as input an image pair from two observation times. Thus, a key research problem is how to design an effective CNN architecture that can fully explore the intrinsic changes between the image pair. To generate robust change detection results, we propose in this study a multiscale deep feature fusion-based change detection (MDFCD) network.
Method
The proposed MDFCD network has two feature-extracting sub-network streams that share weight parameters; each sub-network is responsible for learning to extract semantic features from the corresponding RGB image. We use VGG (visual geometry group) 16 as the backbone of MDFCD and remove its fully connected layers to preserve the spatial resolution of the features of the last convolutional layer. We adopt the features of convolutional blocks Conv3, Conv4, and Conv5 of VGG16 as our multiscale deep features because they capture low-level, middle-level, and high-level features, respectively. An Enc (encoding) module is then proposed to fuse the deep features of the same convolutional block from the two observation times: the features are concatenated ("concat" operation) and fed into Enc to generate change-detection-adaptive features at the corresponding level. The encoded features from the deeper layer are upsampled by a factor of two in height and width, concatenated with the deep features of the preceding convolutional block, and passed through Enc again to learn adaptive features. By progressively incorporating the features from Conv5 to Conv3, we obtain a deep fusion of CNN features at multiple scales. To generate robust change detection, we add a convolutional layer with 2×3×3 convolutional filters to each scale's encoding module to produce a change prediction at that scale; the predictions of all scales are then combined to produce the final change detection result. Note that bicubic upsampling is used to resize the change detection map at each scale to the size of the input image.
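The multiscale fusion described above can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the Enc module and the 2×3×3 prediction layer are stood in for by fixed random 1×1 projections, and nearest-neighbour upsampling replaces bicubic; only the data flow (concatenate the two streams, fuse from Conv5 to Conv3, predict per scale, merge the predictions) follows the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def enc(x, out_ch):
    # Stand-in for the Enc module: a fixed random 1x1 projection plus ReLU.
    # The real module uses learned convolutions.
    w = rng.standard_normal((out_ch, x.shape[0])) / np.sqrt(x.shape[0])
    return np.maximum(np.tensordot(w, x, axes=1), 0.0)

def upsample2(x):
    # Nearest-neighbour x2 upsampling of a (C, H, W) map (the paper uses bicubic).
    return x.repeat(2, axis=1).repeat(2, axis=2)

def predict(x):
    # Stand-in for the 2x3x3 prediction convolution: project to 2 channels.
    w = rng.standard_normal((2, x.shape[0])) / np.sqrt(x.shape[0])
    return np.tensordot(w, x, axes=1)

def mdfcd_fuse(feats_ref, feats_qry):
    """Fuse per-block (C, H, W) features, listed deepest first (Conv5 -> Conv3)."""
    fused, preds = None, []
    for f_r, f_q in zip(feats_ref, feats_qry):
        x = np.concatenate([f_r, f_q], axis=0)                 # join the two streams
        if fused is not None:                                  # merge the deeper scale
            x = np.concatenate([upsample2(fused), x], axis=0)
        fused = enc(x, 64)
        preds.append(predict(fused))
    h, w = preds[-1].shape[1:]                                 # finest resolution
    out = np.zeros((2, h, w))
    for p in preds:                                            # merge all scale predictions
        while p.shape[1] < h:
            p = upsample2(p)
        out += p
    return out / len(preds)
```

With hypothetical Conv5/Conv4/Conv3 features of spatial sizes 8×8, 16×16, and 32×32, the fused prediction has shape (2, 32, 32), one channel per class (changed vs. unchanged).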
Result
We compare our method on three benchmark datasets, namely, VL_CMU_CD (visual localization of Carnegie Mellon University for change detection), PCD (panoramic change detection), and CDnet (change detection net), against state-of-the-art change detection methods, that is, FGCD (fine-grained change detection), SC_SOBS (SC-self-organizing background subtraction), SuBSENSE (self-balanced sensitivity segmenter), and FCN (fully convolutional network). We employ the F1-measure, recall, precision, specificity, FPR (false positive rate), FNR (false negative rate), and PWC (percentage of wrong classification) to evaluate the compared change detection methods. The experiments show that MDFCD outperforms all compared methods; among them, the deep learning-based method FCN performs best. On VL_CMU_CD, the F1-measure and precision of MDFCD achieve 12.2% and 24.4% relative improvements, respectively, over the second-placed method FCN. On PCD, the F1-measure and precision of MDFCD obtain 2.1% and 17.7% relative improvements over FCN, respectively. On CDnet, compared with FCN, our F1-measure and precision achieve 8.5% and 5.8% relative improvements, respectively. The experiments also show that MDFCD can detect fine-grained changes, such as telegraph poles, and is better than FCN at distinguishing real changes from false changes caused by lighting variations and camera pose differences.
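The evaluation measures listed above follow their standard per-pixel definitions from a confusion matrix; as a reference sketch (the function and variable names are ours, not from the paper):

```python
def change_metrics(tp, fp, fn, tn):
    """Standard change detection measures from per-pixel confusion counts."""
    recall = tp / (tp + fn)                        # true positive rate
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)
    fpr = fp / (fp + tn)                           # false positive rate
    fnr = fn / (fn + tp)                           # false negative rate
    pwc = 100.0 * (fp + fn) / (tp + fp + fn + tn)  # percentage of wrong classification
    f1 = 2.0 * precision * recall / (precision + recall)
    return {"recall": recall, "precision": precision, "specificity": specificity,
            "FPR": fpr, "FNR": fnr, "PWC": pwc, "F1": f1}
```

For example, with tp=80, fp=20, fn=20, tn=880, precision, recall, and F1 are all 0.8 and PWC is 4.0%.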
Conclusion
We study how to effectively exploit deep convolutional neural networks for the change detection problem. The MDFCD network is proposed to alleviate the adverse effects introduced by lighting variations and camera pose differences. The proposed method adopts a siamese network with VGG16 as the backbone, in which each path is responsible for extracting deep features from the reference or query image. We also propose an encoding module that fuses multiscale deep convolutional features and learns change-detection-adaptive features, integrating the semantic features of high layers with the texture features of low layers. With this fusion strategy, the proposed method generates more robust change detection results than the compared methods: the high-level semantic features effectively suppress false changes caused by lighting and seasonal variations, while the low-level texture features help the method obtain accurate changes at object boundaries. Compared with the deep learning method FCN, whose input is the concatenation of the reference and query images, our strategy of extracting features from each image separately yields more representative features for change detection. However, as a general problem of deep learning-based methods, a large volume of training images is required to train CNNs. Another problem is that present change detection methods pay considerable attention to region changes but not to object-level changes. In our future work, we plan to study weakly supervised and unsupervised change detection to avoid using pixel-level labeled training images, and to incorporate object detection into change detection to generate object-level changes.
Alcantarilla P F, Stent S, Ros G, Arroyo R and Gherardi R. 2018. Street-view change detection with deconvolutional networks. Autonomous Robots, 42(7):1301-1322[DOI:10.1007/s10514-018-9734-5]
Badino H, Huber D and Kanade T. 2011. Visual topometric localization//2011 IEEE Intelligent Vehicles Symposium (IV). Baden-Baden, Germany: IEEE: 794-799[DOI:10.1109/IVS.2011.5940504]
Bromley J, Guyon I, LeCun Y, Säckinger E and Shah R. 1993. Signature verification using a "Siamese" time delay neural network//Proceedings of the 6th International Conference on Neural Information Processing Systems. Denver, Colorado: Morgan Kaufmann Publishers Inc: 737-744
Bruzzone L, Cossu R and Vernazza G. 2004. Detection of landcover transitions by combining multidate classifiers. Pattern Recognition Letters, 25(13):1491-1500[DOI:10.1016/j.patrec.2004.06.002]
Deng J S, Wang K, Deng Y H and Qi G J. 2008. PCA-based landuse change detection and analysis using multitemporal and multisensor satellite data. International Journal of Remote Sensing, 29(16):4823-4838[DOI:10.1080/01431160801950162]
Feng W, Tian F P, Zhang Q, Zhang N, Wan L and Sun J Z. 2015. Fine-grained change detection of misaligned scenes with varied illuminations//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE: 1260-1268[DOI:10.1109/ICCV.2015.149]
Fujita A, Sakurada K, Imaizumi T, Ito R, Hikosaka S and Nakamura R. 2017. Damage detection from aerial images via convolutional neural networks//Proceedings of the 15th IAPR International Conference on Machine Vision Applications (MVA). Nagoya, Japan: IEEE: 5-8[DOI:10.23919/MVA.2017.7986759]
Ghosh S, Mishra N S and Ghosh A. 2009. Unsupervised change detection of remotely sensed images using fuzzy clustering//Proceedings of the 7th International Conference on Advances in Pattern Recognition. Kolkata, India: IEEE: 385-388[DOI:10.1109/ICAPR.2009.82]
Gong M G, Niu X D, Zhang P Z and Li Z T. 2017. Generative adversarial networks for change detection in multispectral imagery. IEEE Geoscience and Remote Sensing Letters, 14(12):2310-2314[DOI:10.1109/LGRS.2017.2762694]
Huang R, Feng W, Wang Z Z, Fan M Y, Wan L and Sun J Z. 2017. Learning to detect fine-grained change under variant imaging conditions//Proceedings of 2017 IEEE International Conference on Computer Vision Workshops. Venice, Italy: IEEE: 2916-2924[DOI:10.1109/ICCVW.2017.344]
Khan S H, He X M, Porikli F and Bennamoun M. 2017. Forest change detection in incomplete satellite images with deep neural networks. IEEE Transactions on Geoscience and Remote Sensing, 55(9):5407-5423[DOI:10.1109/TGRS.2017.2707528]
Krizhevsky A, Sutskever I and Hinton G E. 2012. ImageNet classification with deep convolutional neural networks//Proceedings of the 25th International Conference on Neural Information Processing Systems. Lake Tahoe, Nevada: Curran Associates Inc: 1097-1105
Li X and Yeh A G O. 1998. Principal component analysis of stacked multi-temporal images for the monitoring of rapid urban expansion in the Pearl River Delta. International Journal of Remote Sensing, 19(8):1501-1518[DOI:10.1080/014311698215315]
Liang D, Kaneko S I, Sun H and Kang B. 2017. Adaptive local spatial modeling for online change detection under abrupt dynamic background//Proceedings of 2017 IEEE International Conference on Image Processing (ICIP). Beijing, China: IEEE: 2020-2024[DOI:10.1109/ICIP.2017.8296636]
Liu H P, Li J M, Hu X L and Sun F C. 2013. Recent progress in detection and recognition of the traffic signs in dynamic scenes. Journal of Image and Graphics, 18(5):493-503[DOI:10.11834/jig.20130502]
Lyu H and Lu H. 2016. Learning a transferable change detection method by recurrent neural network//Proceedings of 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS). Beijing, China: IEEE: 5157-5160[DOI:10.1109/IGARSS.2016.7730344]
Maddalena L and Petrosino A. 2012. The SOBS algorithm: what are the limits?//Proceedings of 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops. Providence, RI, USA: IEEE: 21-26[DOI:10.1109/CVPRW.2012.6238922]
Malila W A. 1980. Change vector analysis: an approach for detecting forest changes with Landsat//Remotely Sensed Data Symposium. West Lafayette: Purdue University: 326-335
Mou L C, Bruzzone L and Zhu X X. 2019. Learning spectral-spatial-temporal features via a recurrent convolutional neural network for change detection in multispectral imagery. IEEE Transactions on Geoscience and Remote Sensing, 57(2):924-935[DOI:10.1109/TGRS.2018.2863224]
Nair V and Hinton G E. 2010. Rectified linear units improve restricted Boltzmann machines//Proceedings of the 27th International Conference on International Conference on Machine Learning. Haifa, Israel: Omnipress: 807-814
Sakurada K and Okatani T. 2015. Change detection from a street image pair using CNN features and superpixel segmentation//Proceedings of the British Machine Vision Conference (BMVC). Swansea, UK: BMVC: 61.1-61.12[DOI:10.5244/C.29.61]
Shelhamer E, Long J and Darrell T. 2014. Fully convolutional networks for semantic segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(4):640-651[DOI:10.1109/TPAMI.2016.2572683]
Simonyan K and Zisserman A. 2014. Very deep convolutional networks for large-scale image recognition[EB/OL].[2019-06-12]. https://arxiv.org/pdf/1409.1556.pdf
St-Charles P L, Bilodeau G A and Bergevin R. 2015. SuBSENSE:a universal change detection method with local adaptive sensitivity. IEEE Transactions on Image Processing, 24(1):359-373[DOI:10.1109/TIP.2014.2378053]
Szegedy C, Ioffe S, Vanhoucke V and Alemi A A. 2017. Inception-v4, inception-ResNet and the impact of residual connections on learning//Proceedings of the 31st AAAI Conference on Artificial Intelligence. San Francisco, California, USA: AAAI: 4278-4284
Taneja A, Ballan L and Pollefeys M. 2011. Image based detection of geometric changes in urban environments//Proceedings of 2011 International Conference on Computer Vision. Barcelona, Spain: IEEE: 2336-2343[DOI:10.1109/ICCV.2011.6126515]
Wang Y, Jodoin P M, Porikli F, Konrad J, Benezeth Y and Ishwar P. 2014. CDnet 2014: an expanded change detection benchmark dataset//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops. Columbus, OH, USA: IEEE: 393-400[DOI:10.1109/CVPRW.2014.126]
Wu M, An X J and He H G. 2007. On vision-based lane departure detection approach. Journal of Image and Graphics, 12(1):110-115[DOI:10.3969/j.issn.1006-8961.2007.01.019]
Zagoruyko S and Komodakis N. 2015. Learning to compare image patches via convolutional neural networks//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, Massachusetts, USA: IEEE: 4353-4361[DOI:10.1109/CVPR.2015.7299064]