Flower image classification with selective convolutional descriptor aggregation
2019, Vol. 24, No. 5, pp. 762-772
Received: 2018-07-04; Revised: 2018-10-16; Published in print: 2019-05-16
DOI: 10.11834/jig.180426
Objective
To address the shortage of annotated flower image samples, the high cost of annotation, and the inability of traditional deep-learning-based fine-grained image classification methods to locate flower target regions accurately, an unsupervised flower image classification method based on selective deep convolutional descriptor aggregation is proposed.
Method
A flower image classification network based on selective deep convolutional descriptor aggregation is constructed. First, flower images are preprocessed with a size normalization method that preserves the aspect ratio, so that all images share the same size while the target is neither deformed nor stripped of detail. Next, the VGG-16 deep convolutional neural network pre-trained on ImageNet is used to learn features from the preprocessed flower images; effective deep convolutional features are selected according to the response value distribution of the feature maps, and multi-layer deep convolutional features are fused. Finally, a softmax classification layer performs the classification.
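As a rough illustration of the feature-learning step, the following sketch extracts the Relu5-2 and Pool5 activations from an ImageNet-pre-trained VGG-16. The abstract does not fix an implementation framework, so PyTorch and torchvision are assumptions here; the layer indices follow torchvision's VGG-16 definition.

```python
import torch
from torchvision import models

# VGG-16 pre-trained on ImageNet; torchvision's copy stands in for
# the pre-trained model used in the paper.
vgg = models.vgg16(pretrained=True).eval()

# In torchvision's VGG-16, features[27] is the ReLU after conv5_2
# (Relu5-2) and features[30] is the last max-pooling layer (Pool5).
extract_relu5_2 = vgg.features[:28]
extract_pool5 = vgg.features[:31]

with torch.no_grad():
    x = torch.randn(1, 3, 224, 224)   # a preprocessed 224x224 image
    f_relu5_2 = extract_relu5_2(x)    # shape: (1, 512, 14, 14)
    f_pool5 = extract_pool5(x)        # shape: (1, 512, 7, 7)
```

Each 224×224 input thus yields a 512×14×14 Relu5-2 map and a 512×7×7 Pool5 map, the two feature maps from which the method selects descriptors.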
Result
Comparative experiments were conducted on the Oxford 102 Flowers dataset against traditional flower image classification methods based on deep learning models. The classification accuracy of the proposed method reaches 85.55%, which is 27.67% higher than that of the deep learning model Xception.
Conclusion
A flower image classification method based on selective convolutional descriptor aggregation is proposed. The method locates the salient region of a flower image in an unsupervised manner, removes the interference of background and noise with the flower target, and improves classification accuracy. It is well suited to flower image classification when annotated samples are scarce.
Objective
Flower image classification is a fine-grained image classification task. Its main challenges are large intra-class differences and high inter-class similarities: different types of flowers are highly similar in morphology, color, and other aspects, whereas flowers of the same category show great diversity in color, shape, and so on. Current methods for flower image classification can be divided into two categories: methods based on handcrafted features and methods based on deep learning. The former usually obtain flower regions by image segmentation and then extract or design features manually; the extracted features are finally combined with a traditional machine learning algorithm to complete classification. These methods rely on the design experience of researchers. By contrast, methods based on deep learning use deep networks to learn flower features automatically. Bounding boxes and part annotations are used to define accurate target positions, and different convolutional neural network models are then fine-tuned to obtain the targets' features. Given that currently available flower image datasets lack annotation information, such as bounding boxes and part annotations, these strongly supervised methods are difficult to apply. Furthermore, tagging bounding boxes and part annotations for many flower images incurs a high cost. To solve these problems, this study proposes an unsupervised flower image classification method on the basis of selective convolutional descriptor aggregation.
Method
A flower image classification network is constructed on the basis of selective deep convolutional descriptor aggregation. The proposed method can be divided into four phases: flower image preprocessing, selection and aggregation of convolutional features in the Pool5 layer, selection and aggregation of convolutional features in the Relu5-2 layer, and multi-layer feature fusion and classification. In the first phase, flower images are preprocessed with a size normalization method that preserves the aspect ratio so that all flower images have the same size; thus, the dimension of each flower feature generated by the deep convolutional neural network is consistent. The input image size is set to 224×224 pixels in this study.
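A minimal sketch of such preprocessing follows, assuming the image is scaled so its longer side reaches 224 pixels and the shorter side is then padded; the exact normalization details (such as the padding color and centering) are not specified in this abstract and are illustrative choices here.

```python
from PIL import Image

def normalize_keep_aspect(path, size=224, fill=(0, 0, 0)):
    """Scale so the longer side equals `size`, then pad the shorter
    side, so the flower is neither deformed nor cropped."""
    img = Image.open(path).convert("RGB")
    scale = size / max(img.size)
    img = img.resize((round(img.width * scale), round(img.height * scale)),
                     Image.BILINEAR)
    canvas = Image.new("RGB", (size, size), fill)
    # Center the resized image on a square canvas.
    canvas.paste(img, ((size - img.width) // 2, (size - img.height) // 2))
    return canvas
```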
In the second phase, features of the preprocessed flower images are learned by VGG-16, a deep convolutional neural network pre-trained on ImageNet. The saliency region is then located according to the high response values in the Pool5 feature map. Some background regions, however, also have high response values. Because a high-response background region is typically smaller in area than the target region, the flood-fill algorithm is used to extract the maximum connected component of the saliency region. On the basis of the location of this saliency region, the deep convolutional features within it are selected and aggregated into a low-dimensional feature of the flower image.
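The second phase might look like the following sketch, in the spirit of selective convolutional descriptor aggregation (Wei et al., 2017): positions whose channel-summed response exceeds the map's mean are treated as salient, connected-component labeling stands in for the flood-fill step, and the selected descriptors are average- and max-pooled. The mean threshold and the pooling choices are assumptions borrowed from that line of work, not details confirmed by this abstract.

```python
import numpy as np
from scipy import ndimage

def select_and_aggregate(fmap):
    """fmap: (C, H, W) Pool5 activations for one image. Returns a
    low-dimensional descriptor built from the largest high-response
    connected region."""
    response = fmap.sum(axis=0)            # (H, W) aggregated response map
    mask = response > response.mean()      # high-response positions
    # Flood-fill step: keep only the largest connected component,
    # which drops small high-response background blobs.
    labels, n = ndimage.label(mask)
    if n > 1:
        sizes = ndimage.sum(mask, labels, range(1, n + 1))
        mask = labels == (1 + np.argmax(sizes))
    selected = fmap[:, mask]               # (C, #selected positions)
    # Aggregate the selected descriptors by average- and max-pooling.
    return np.concatenate([selected.mean(axis=1), selected.max(axis=1)])
```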
In the third phase, deep convolutional features in the Relu5-2 layer are selected and fused to form another low-dimensional flower feature. Multi-layer convolutional features have been shown to help a network learn features and complete classification tasks; thus, the deep convolutional features of both the Pool5 and Relu5-2 layers are used in this study. Similarly, a saliency region map is obtained from the Relu5-2 layer on the basis of its response values. This map locates the flower region more accurately than the saliency map from the Pool5 layer, but it contains numerous noise regions and carries little semantic information. Thus, the saliency region map from the Relu5-2 layer is combined with the maximum connected region map from the Pool5 layer to produce a true saliency region map with little noise. Finally, deep convolutional features from the Relu5-2 layer are selected and aggregated into a low-dimensional flower feature on the basis of the true saliency region map.
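A sketch of how the two maps could be combined is shown below; the nearest-neighbor upsampling of the coarse Pool5 map to the Relu5-2 grid is an illustrative assumption, as the abstract does not state how the resolutions are matched.

```python
import numpy as np

def combine_masks(mask_pool5, mask_relu5_2):
    """Intersect the Pool5 largest-connected-region map (e.g., 7x7)
    with the finer Relu5-2 high-response mask (e.g., 14x14)."""
    rh = mask_relu5_2.shape[0] // mask_pool5.shape[0]
    rw = mask_relu5_2.shape[1] // mask_pool5.shape[1]
    # Nearest-neighbor upsampling of the coarse map to the fine grid.
    upsampled = np.repeat(np.repeat(mask_pool5, rh, axis=0), rw, axis=1)
    # Keep positions salient in both maps: the Relu5-2 mask localizes
    # finely, while the Pool5 region suppresses its noise.
    return mask_relu5_2 & upsampled
```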
In the final phase, the two low-dimensional features above are aggregated to form the final flower feature, which is then fed into the softmax layer for classification.
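The final phase reduces to concatenating the two aggregated features and training a softmax classifier over the 102 classes. The sketch below uses a PyTorch linear layer with cross-entropy loss, which applies the softmax internally during training; the feature dimensions are illustrative (1024 per layer when C = 512 descriptors are average- and max-pooled and concatenated).

```python
import torch
import torch.nn as nn

# Illustrative batch of aggregated features from the two layers.
f_pool5 = torch.randn(32, 1024)      # selected-and-aggregated Pool5 features
f_relu5_2 = torch.randn(32, 1024)    # matching Relu5-2 features
fused = torch.cat([f_pool5, f_relu5_2], dim=1)

# Softmax classification over the 102 Oxford flower classes.
classifier = nn.Linear(fused.shape[1], 102)
labels = torch.randint(0, 102, (32,))
loss = nn.CrossEntropyLoss()(classifier(fused), labels)
loss.backward()

probs = classifier(fused).softmax(dim=1)  # per-class probabilities
```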
Result
To explore the effect of the proposed selective convolutional descriptor aggregation method, we perform the following experiment on Oxford 102 Flowers. The preprocessed flower images are fed into the AlexNet, VGG-16, and Xception models, all pre-trained on ImageNet. Experimental results show that the classification accuracy of the proposed method is superior to that of these models. Experiments are also conducted to compare the proposed method with other current flower image classification methods in the literature. The results indicate that the classification accuracy of this method is higher than that of methods based on handcrafted features and of other methods based on deep learning.
Conclusion
A method for classifying flower images using selective convolutional descriptor aggregation is proposed. Flower image features are learned through transfer learning on the basis of a pre-trained network. Effective deep convolutional features are selected according to the response value distribution in the feature map, multi-layer deep convolutional features are then fused, and finally the softmax layer is used for classification. The advantages of this method are that it locates the conspicuous region of a flower image in an unsupervised manner and selects deep convolutional features only within the located region, excluding invalid parts such as background and noise. Therefore, the accuracy of flower image classification is improved by reducing the disturbing information from these invalid parts.
References
Nilsback M E, Zisserman A. Automated flower classification over a large number of classes[C]//Proceedings of the 6th Indian Conference on Computer Vision, Graphics & Image Processing. Bhubaneswar, India: IEEE, 2008: 722-729. [DOI:10.1109/ICVGIP.2008.47]
Angelova A, Zhu S H. Efficient object detection and segmentation for fine-grained recognition[C]//Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition. Portland, OR, USA: IEEE, 2013: 811-818. [DOI:10.1109/CVPR.2013.110]
Xie X D, Lyu Y P, Cao D L. Saliency detection based flower image foreground segmentation[EB/OL]. 2014-09-18 [2018-06-20]. http://www.paper.edu.cn/releasepaper/content/201409-215.
Xie X D, Lyu Y P, Cao D L. Hierarchical feature fusion based flower image classification[EB/OL]. 2014-10-07 [2018-06-20]. http://www.paper.edu.cn/releasepaper/content/201410-33.
Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[C]//Proceedings of the 25th International Conference on Neural Information Processing Systems. Lake Tahoe, Nevada, United States: ACM, 2012: 1097-1105.
Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[EB/OL]. [2018-06-20]. https://arxiv.org/pdf/1409.1556.pdf.
Chollet F. Xception: deep learning with depthwise separable convolutions[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE, 2017: 1800-1807. [DOI:10.1109/CVPR.2017.195]
Liu Y Y, Tang F, Zhou D W, et al. Flower classification via convolutional neural network[C]//Proceedings of 2016 IEEE International Conference on Functional-Structural Plant Growth Modeling, Simulation, Visualization and Applications. Qingdao: IEEE, 2016: 110-116. [DOI:10.1109/FSPMA.2016.7818296]
Xia X L, Xu C, Nan B. Inception-v3 for flower classification[C]//Proceedings of the 2nd International Conference on Image, Vision and Computing. Chengdu: IEEE, 2017: 783-787. [DOI:10.1109/ICIVC.2017.7984661]
Huang S L, Xu Z, Tao D C, et al. Part-stacked CNN for fine-grained visual categorization[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, Nevada: IEEE, 2016: 1173-1182. [DOI:10.1109/CVPR.2016.132]
Zhang N, Donahue J, Girshick R, et al. Part-based R-CNNs for fine-grained category detection[C]//Proceedings of the 13th European Conference on Computer Vision. Zurich, Switzerland: Springer, 2014: 834-849. [DOI:10.1007/978-3-319-10590-1_54]
Weng Y C, Tian Y, Lu D M, et al. Fine-grained bird classification based on deep region networks[J]. Journal of Image and Graphics, 2017, 22(11): 1521-1531. [DOI:10.11834/jig.170262]
Matan O, Burges C J C, LeCun Y, et al. Multi-digit recognition using a space displacement neural network[C]//Advances in Neural Information Processing Systems. Denver, Colorado, USA: NIPS, 1991: 488-495.
Jia Y Q, Shelhamer E, Donahue J, et al. Caffe: convolutional architecture for fast feature embedding[C]//Proceedings of the 22nd ACM International Conference on Multimedia. Orlando, Florida, USA: ACM, 2014: 675-678. [DOI:10.1145/2647868.2654889]
Girshick R, Donahue J, Darrell T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, Ohio: IEEE, 2014: 580-587. [DOI:10.1109/CVPR.2014.81]
Wei X S, Luo J H, Wu J X, et al. Selective convolutional descriptor aggregation for fine-grained image retrieval[J]. IEEE Transactions on Image Processing, 2017, 26(6): 2868-2881. [DOI:10.1109/TIP.2017.2688133]
Hariharan B, Arbelaez P, Girshick R, et al. Hypercolumns for object segmentation and fine-grained localization[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, Massachusetts: IEEE, 2015: 447-456. [DOI:10.1109/CVPR.2015.7298642]
Lee D, Lin A. Computational complexity of art gallery problems[J]. IEEE Transactions on Information Theory, 1986, 32(2): 276-282. [DOI:10.1109/TIT.1986.1057165]