Pedestrian detection in complex thermal infrared surveillance scene
2018, Vol. 23, No. 12, pp. 1829-1837
Received: 2018-05-10
Revised: 2018-07-07
Published in print: 2018-12-16
DOI: 10.11834/jig.180299
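Objective
Pedestrian detection in complex thermal infrared surveillance scenes is an important research topic in computer vision and a fundamental task in practical applications such as public security, disaster relief, and smart cities. Most existing thermal infrared pedestrian detection algorithms rely on the assumption that the gray values of human targets are higher than those of the surrounding scene, so their detection rate drops when the ambient temperature rises and the gray values in the thermal infrared image are inverted. To improve the robustness of pedestrian detection systems across scenes and raise the pedestrian detection rate, a fully convolutional network pedestrian detection algorithm based on frequency-domain saliency detection is proposed for thermal infrared surveillance scenes.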
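Method
The algorithm first applies frequency-domain saliency detection to the thermal infrared image to generate a saliency map that fully covers the pedestrian targets. The saliency map is then combined with the original thermal infrared image to generate a region-of-interest map, and a fully convolutional network is built that takes this map as input and outputs a pedestrian target probability map. Finally, the thermal infrared pedestrian detection system is trained end to end, and the probability map output by the network is used to detect pedestrian targets.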
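Result
The algorithm is validated on the OSU thermal infrared pedestrian database from OTCBVS, the infrared video dataset established by The Ohio State University, and is compared with five relatively mature existing algorithms. The experimental results show that the proposed algorithm accurately detects pedestrian targets in a variety of scenes. Measured by MR-FP (miss rate versus false positive rate), its average miss rate of 7% is lower than that of the other algorithms; it therefore achieves a higher detection rate and is more robust to gray value inversion in thermal infrared images.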
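Conclusion
This paper proposes a fully convolutional network pedestrian detection algorithm based on frequency-domain saliency detection for thermal infrared surveillance scenes. While allowing the detection algorithm to be trained end to end, it improves robustness to a variety of complex scenes and raises the pedestrian detection rate, thereby improving pedestrian detection performance in thermal infrared surveillance systems.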
Objective
Pedestrian detection in complex thermal infrared surveillance scenes is an important research topic in computer vision and a crucial task in practical applications such as public security management, disaster relief, and intelligent surveillance. Existing thermal infrared pedestrian detection algorithms generally consist of two steps. The first step generates several regions of interest (ROIs) in the thermal infrared image that are suspected of containing human targets. The second step verifies whether each ROI is a human target. Verification can be performed by a classifier applied to features extracted from the ROIs, or classification and feature extraction can be combined by adopting a deep learning method. However, in their first step most existing algorithms rely heavily on the assumption that the gray value of the human target in the image is higher than that of the environment, which renders them ineffective at high ambient temperatures. As the ambient temperature increases, gray value inversion occurs; that is, the gray value of the environment in the thermal infrared image becomes higher than that of the human target, which reduces the accuracy of pedestrian detection. To address this problem, a fully convolutional network pedestrian detection algorithm based on frequency-domain saliency detection is proposed, which aims to improve the robustness of pedestrian detection systems for thermal infrared surveillance scenes and to achieve better detection accuracy.
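The effect of gray value inversion on a brightness-based first step can be illustrated with a small synthetic example. The sketch below is not taken from the paper: the frame size, gray values, and the `mean + k * std` threshold rule are all hypothetical choices made only to show why a "target is brighter than background" assumption stops working when the polarity flips.

```python
import numpy as np

def brightness_mask(image, k=2.0):
    """Flag pixels brighter than mean + k * std (the 'warm target' assumption)."""
    return image > image.mean() + k * image.std()

def make_scene(background, target):
    """Synthetic 32x32 frame containing one 6x3 'pedestrian' patch (hypothetical values)."""
    frame = np.full((32, 32), background, dtype=float)
    frame[10:16, 14:17] = target
    return frame

# Cool environment: the pedestrian is brighter than the background.
cool_scene = make_scene(background=60.0, target=200.0)
# Hot environment: the polarity is inverted and the pedestrian is darker.
hot_scene = make_scene(background=200.0, target=60.0)

cool_hits = int(brightness_mask(cool_scene).sum())  # all 18 target pixels are flagged
hot_hits = int(brightness_mask(hot_scene).sum())    # no pixel exceeds the threshold
print(cool_hits, hot_hits)
```

In the inverted frame the same rule flags nothing, which is the failure mode that motivates a cue independent of absolute gray value.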
Method
The algorithm first employs frequency-domain saliency detection to generate a saliency map that covers all pedestrian targets in the original thermal infrared image. Unlike existing methods, this detection depends on the saliency of human targets rather than on their gray value. Therefore, the generation of the subsequent ROI map is not limited by the assumption that the gray value of the human target is high, which avoids the detection errors that arise when that assumption fails at high ambient temperatures. In addition, the algorithm generates one full-size saliency map rather than several sub-regions. A fully convolutional network is then constructed, with the ROI map generated from the saliency map and the original thermal infrared image as the network input and the pedestrian target probability map as the network output. The network consists of two parts. The first part, which mainly follows the AlexNet and VGG network structures, can be regarded as the feature extraction module. The second part is the probability generation module, which consists of three deconvolution layers with kernels of two sizes. A sigmoid activation function is used in the last layer to generate the probability map of pedestrian targets, and the remaining layers use the ReLU activation function. The proposed algorithm is trained end to end to obtain the pedestrian probability map and thereby detect pedestrian targets.
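As an illustration of what a frequency-domain saliency step can look like, the sketch below implements the well-known spectral residual approach with NumPy. This is an assumption for illustration only: the abstract does not state which frequency-domain method is used, and the filter sizes, the box smoothing (used here in place of a Gaussian), and the function names are hypothetical.

```python
import numpy as np

def box_filter(a, size=3):
    """Mean filter with edge padding, written with NumPy only."""
    pad = size // 2
    padded = np.pad(a, pad, mode="edge")
    out = np.zeros_like(a, dtype=float)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out / (size * size)

def spectral_residual_saliency(image):
    """Return a full-size saliency map scaled to [0, 1] (spectral residual sketch)."""
    spectrum = np.fft.fft2(image.astype(float))
    log_amplitude = np.log(np.abs(spectrum) + 1e-8)  # small offset avoids log(0)
    phase = np.angle(spectrum)
    # The residual is what remains after removing the locally averaged log spectrum.
    residual = log_amplitude - box_filter(log_amplitude, 3)
    saliency = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    saliency = box_filter(saliency, 5)  # assumption: box smoothing instead of a Gaussian
    saliency -= saliency.min()
    return saliency / (saliency.max() + 1e-12)

# Hypothetical inverted-polarity frame: a dark pedestrian on a bright background.
frame = np.full((32, 32), 200.0)
frame[10:16, 14:17] = 60.0
saliency = spectral_residual_saliency(frame)
print(saliency.shape, float(saliency.min()), float(saliency.max()))
```

The map has the same size as the input frame, which matches the idea of producing one full-size saliency map instead of a set of sub-regions. Because the computation works on the spectrum rather than on absolute gray levels, it does not require the target to be brighter than the background.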
Result
The Ohio State University (OSU) thermal infrared pedestrian database, part of the OTCBVS infrared video dataset that was also established by OSU, is used to verify the algorithm, and the proposed algorithm is compared with five mature existing algorithms. The database contains 10 sequences captured from a single surveillance viewpoint and covers several weather conditions, such as sunny, cloudy, and rainy days, which allows a comprehensive test of pedestrian detection algorithms. In addition to methods that are not based on convolutional neural networks, the performance of a region-based convolutional neural network is plotted. The results show that the proposed algorithm accurately detects pedestrian targets in various environmental conditions, and several sample detection results are shown. With the miss rate-false positive (MR-FP) indicator as the basis for comparison, the proposed algorithm achieves an average miss rate of 7% and outperforms both the existing thermal infrared pedestrian detection methods and the basic deep learning-based object detection methods. It achieves a high detection rate and is more robust to gray value inversion in thermal infrared images. During detection, the proposed algorithm removes non-pedestrian targets and detects most pedestrians in thermal images, especially when the scene is complex, for example when other heat sources (such as street lights) are present or in the daytime.
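For readers unfamiliar with the miss rate and false positive quantities, the following sketch shows one common way to compute them from bounding boxes. It rests on assumptions not stated in the abstract: boxes are `(x1, y1, x2, y2)` tuples, a detection counts as correct when its IoU with an unmatched ground-truth box is at least 0.5, matching is greedy, and false positives are averaged per image. The function names and sample boxes are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def miss_rate_and_fppi(frames, iou_threshold=0.5):
    """frames: list of (ground_truth_boxes, detected_boxes), one pair per image."""
    if not frames:
        return 0.0, 0.0
    missed = total_gt = false_pos = 0
    for gt_boxes, det_boxes in frames:
        unmatched = list(gt_boxes)
        for det in det_boxes:
            # Greedy matching: take the unmatched ground-truth box with the highest IoU.
            best = max(unmatched, key=lambda g: iou(det, g), default=None)
            if best is not None and iou(det, best) >= iou_threshold:
                unmatched.remove(best)
            else:
                false_pos += 1
        missed += len(unmatched)
        total_gt += len(gt_boxes)
    miss_rate = missed / total_gt if total_gt else 0.0
    return miss_rate, false_pos / len(frames)

# Two hypothetical frames: one missed pedestrian in the first, one spurious box in the second.
frames = [
    ([(0, 0, 10, 20), (30, 0, 40, 20)], [(1, 0, 11, 20)]),
    ([(5, 5, 15, 25)], [(5, 5, 15, 25), (50, 50, 60, 70)]),
]
miss_rate, fppi = miss_rate_and_fppi(frames)
print(miss_rate, fppi)
```

Here one of three ground-truth pedestrians is missed and one spurious box appears across two images. Sweeping a detection score threshold and recomputing both values at each setting would trace out a miss rate versus false positive curve.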
Conclusion
A fully convolutional network pedestrian detection algorithm based on frequency-domain saliency detection for thermal infrared surveillance scenes is proposed in this study. In the first step, a saliency detection method that is robust to gray value inversion at high ambient temperatures, such as in hot summer or in the daytime, is employed to generate a full-size ROI map. Subsequently, a fully convolutional network outputs the probability map of pedestrian targets. The proposed algorithm can be trained end to end and avoids generating many sub-regions, which makes it efficient and spares redundant computation and storage. Experimental results show that the proposed method improves the robustness of pedestrian detection systems in various complex scenes, obtains a high pedestrian detection rate, and enhances pedestrian detection in thermal infrared surveillance systems.