Visualized all-scale shape representation and recognition
2022, Vol. 27, No. 2, pp. 628-641
Print publication date: 2022-02-16
Accepted: 2021-03-10
DOI: 10.11834/jig.200693
Ruipeng Min, Yifan Li, Yao Huang, Jianyu Yang, Baojiang Zhong. Visualized all-scale shape representation and recognition[J]. Journal of Image and Graphics, 2022, 27(2): 628-641.
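Objective
Representing and recognizing the shape features of visual objects is an important problem in image processing. In practical applications, disturbances such as viewpoint change, deformation, occlusion, and noise lower recognition accuracy, and big-data scenarios require algorithms that learn efficiently. To address these problems, this paper proposes an all-scale visualized shape representation method.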
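Method
Invariant shape features are extracted from the shape contour at every scale of the scale space, yielding the all-scale features of the shape. All of these features are then represented compactly as a single color image, which gives a visualized representation of the shape features. This color image is fed into a two-stream convolutional network model to perform shape classification and retrieval.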
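As a rough illustration of what extracting a feature "at every scale" can look like, the sketch below computes a simplified, hypothetical stand-in for an arc-length-style feature on a sampled contour. It is not the paper's feature definition: the function names `arc_fraction_all_scales` and `transform`, the number of scales, and the choice of radii as fractions of the shape diameter are all assumptions made for this example.

```python
import math

def arc_fraction_all_scales(contour, num_scales=8):
    """Return a num_scales x len(contour) matrix of values in (0, 1].

    Hypothetical proxy feature: entry [k][i] is the fraction of contour
    samples lying within radius r_k of contour point i.
    """
    n = len(contour)
    # Shape size used for normalisation: largest distance between two samples.
    diameter = max(math.dist(p, q) for p in contour for q in contour)
    rows = []
    for k in range(1, num_scales + 1):
        radius = diameter * k / num_scales  # radii grow until they cover the shape
        row = []
        for p in contour:
            inside = sum(1 for q in contour if math.dist(p, q) <= radius)
            row.append(inside / n)
        rows.append(row)
    return rows

def transform(contour, angle, scale):
    """Rotate by `angle` radians and scale uniformly about the origin."""
    c, s = math.cos(angle), math.sin(angle)
    return [(scale * (c * x - s * y), scale * (s * x + c * y)) for x, y in contour]
```

Because the proxy depends only on distances between contour samples divided by the shape's own diameter, applying `transform` to a contour should leave the matrix unchanged up to floating-point error, which mirrors the rotation and scaling invariance the method claims for its features.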
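Result
Qualitative experiments that add disturbances such as rotation, occlusion, and noise to the original shapes verify that the proposed method is invariant to rotation and scaling and robust to articulated transformation, occlusion, and noise. In quantitative experiments on shape classification and shape retrieval with common benchmark datasets, the accuracy of the proposed method exceeds that of the compared algorithms on every dataset. On the MPEG-7 dataset the accuracy reaches 99.57%, whereas the best result among the compared algorithms is 98.84%. On the articulated and projective transformation datasets the method reaches 100% recognition accuracy, whereas the best results among the compared algorithms are 89.75% and 95%, respectively.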
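To put the reported margins on a common footing, the short sketch below derives the gain in percentage points and the relative reduction in error rate from the accuracies quoted above. The dictionary keys and the helper name `margins` are labels chosen for this example, and "relative error reduction" is a derived quantity that the source does not report.

```python
# Accuracies (%) quoted in the Result paragraph: (proposed method, best competitor).
reported = {
    "MPEG-7": (99.57, 98.84),
    "articulated": (100.0, 89.75),
    "projective": (100.0, 95.0),
}

def margins(ours, best_other):
    """Return (gain in percentage points, relative error reduction in %)."""
    gain = ours - best_other
    error_reduction = 100.0 * gain / (100.0 - best_other)
    return round(gain, 2), round(error_reduction, 1)

for name, (ours, other) in reported.items():
    print(name, margins(ours, other))
```

On MPEG-7 the absolute gain is small because the best competing result is already close to saturation, so the relative error reduction is the more informative of the two numbers there.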
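Conclusion
The proposed all-scale visualized shape representation compactly expresses all of the shape information in a single color image. The convolutional model learns both the relations among shape features along the contour points and the relations among shape features across scales. The method maintains high recognition accuracy under viewpoint change, partial occlusion, articulated deformation, and noise. It can be applied to object recognition when image acquisition is heavily disturbed and in infrared or depth images, and it is suited to recognition tasks in big-data scenarios.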
Objective
The feature representation of shape contours plays an important role in shape recognition and retrieval, an important topic in pattern recognition and image processing. As big-data application scenarios multiply, deep learning methods are widely used to process large numbers of images because they learn effectively. To use a deep learning method such as the popular convolutional neural network for image classification, shape features must be represented as an image. It is therefore desirable to represent the shape features of an object contour as an image rather than as a series of feature values. Moreover, because many kinds of cameras and sensors are used to capture images and video, disturbances such as viewpoint variation, scaling, partial occlusion, articulation, projective transformation, and noise are unavoidable. These disturbances degrade image and video quality and, consequently, the accuracy of subsequent object recognition and retrieval. To address these problems, this work proposes a visualized all-scale shape representation and recognition method. The resulting representation of shape features can be learned by widely used deep learning models, which makes it effective for recognition and retrieval in big-data scenarios. The proposed method is also robust to various disturbances and noise.
Method
First, three kinds of invariant shape features, namely the area feature, the arc length feature, and the central distance feature, are extracted from the shape contour. These features are invariant, describe different aspects of the shape in different dimensions, and are normalized to the size of the shape in the image. Because all three can be extracted at different scales with respect to the shape, they are extracted at all scales in the scale space to obtain sufficient shape information and represent the shape fully. Next, all the features in the scale space are compactly represented by a single color image. In this image, the R, G, and B channels represent the three kinds of invariant shape features, and the value of a feature is represented as a color value. In each channel, the $x$-axis of the image corresponds to the sequence of contour points, whereas the $y$-axis corresponds to the scales. Because the shape is now represented by a color image, a convolutional neural network is designed to learn the shape features from it. To learn as much shape information as possible, both the original shape image and the color image representation are used as input to the convolutional model. The model is therefore designed with two convolutional streams, one for the original image and one for the color image. In this way, the deep learning method can effectively learn the shape features to perform shape classification and retrieval.
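The layout of the color image described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `pack_rgb`, the assignment of area, arc length, and central distance to R, G, and B in that order, the assumption that feature values are already normalized to [0, 1], and the 8-bit quantization are all choices made for this example.

```python
def pack_rgb(area, arc, dist):
    """Pack three scale-by-point feature matrices into one 8-bit RGB image.

    Each input is a list of rows: one row per scale (image y-axis), one entry
    per contour point (image x-axis), with values assumed to lie in [0, 1].
    The channel order (area -> R, arc -> G, dist -> B) is an assumption.
    """
    def to_byte(v):
        # Clamp to [0, 1], then quantise so the feature value becomes a colour value.
        return int(round(255 * min(1.0, max(0.0, v))))

    image = []
    for row_a, row_l, row_d in zip(area, arc, dist):
        image.append([(to_byte(a), to_byte(l), to_byte(d))
                      for a, l, d in zip(row_a, row_l, row_d)])
    return image
```

With this layout, a horizontal neighborhood in the image covers adjacent contour points at one scale and a vertical neighborhood covers one contour point at adjacent scales, which is what lets a convolutional filter see both kinds of relation at once.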
Result
The experimental evaluation comprises qualitative and quantitative experiments. The qualitative experiments test the robustness of the proposed method to various disturbances, including rotation, scale variation, partial occlusion, articulated deformation, and noise. In these experiments, each kind of disturbance is added to the shape image, and the resulting color image representation is compared with that of the original shape image. The results validate that the proposed method is invariant to rotation and scaling and robust to articulated deformation, partial occlusion, and noise. The quantitative experiments evaluate shape recognition and retrieval on benchmark datasets. Accuracy is tested on general datasets, including the MPEG-7 dataset and the Animal dataset, and performance under disturbances is evaluated on the articulated shape dataset and the projective shape dataset. Compared with other state-of-the-art methods, our method achieves the highest recognition and retrieval accuracy on all the datasets, which verifies that the proposed shape representation is effective for shape recognition and retrieval. The accuracy of our method is 99.57% on the MPEG-7 dataset; that is, it correctly classifies nearly all the shapes. On the articulated and projective datasets, our method achieves 100% recognition accuracy, which greatly outperforms state-of-the-art methods. These evaluations verify that the proposed method maintains high accuracy in shape recognition and retrieval under different kinds of disturbances.
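The qualitative comparison between the color image of a disturbed shape and that of the original could be quantified along the following lines. The source does not say how the images were compared, so this is only one plausible sketch: the mean absolute channel difference and the search over cyclic shifts of the $x$-axis (to allow for a different contour starting point) are assumptions, as are the function names. Both images are assumed to be non-empty and of equal size.

```python
def image_difference(img_a, img_b):
    """Mean absolute channel difference between two equal-size RGB images, in [0, 1]."""
    total, count = 0, 0
    for row_a, row_b in zip(img_a, img_b):
        for px_a, px_b in zip(row_a, row_b):
            for ca, cb in zip(px_a, px_b):
                total += abs(ca - cb)
                count += 1
    return total / (255 * count)

def min_shift_difference(img_a, img_b):
    """Smallest difference over cyclic shifts of the x-axis (contour start point)."""
    width = len(img_b[0])
    best = 1.0
    for s in range(width):
        # Rotate every row of img_b left by s columns before comparing.
        shifted = [row[s:] + row[:s] for row in img_b]
        best = min(best, image_difference(img_a, shifted))
    return best
```

A value near 0 would indicate that the disturbance barely changed the representation, while a value that grows with the strength of the disturbance would indicate sensitivity to it.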
Conclusion
This paper proposes a visualized all-scale shape representation method for shape recognition and retrieval. Different kinds of invariant shape features are extracted at all scales in the scale space, so that as much shape information as possible is captured. The color image representation compactly encodes the extracted shape features and makes them visible. With this representation, the effectiveness of deep learning can be exploited for feature learning and shape classification. The proposed two-stream convolutional neural network learns the shape features from both the color image representation and the original binary shape image. By learning from the color image, the network learns not only the shape context along the contour, on the $x$-axis of the color image, but also the relations among shape features at different scales, on the $y$-axis. The proposed method is robust to various disturbances and noise and maintains high recognition accuracy under viewpoint variation, nonlinear deformation, partial occlusion, and articulated deformation, so it can be used in complex environments. Because the shape images are binary images, which can be easily obtained from depth maps, the method can also be applied to object recognition and retrieval from infrared and depth images. The classification engine is based on a deep learning model, which also makes the method suitable for recognition tasks in big-data applications.
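One simple way to obtain a binary shape image from a depth map is to keep the pixels whose depth falls inside a chosen range. The sketch below assumes this depth-range thresholding and a 4-neighbor boundary test. The source only states that binary shapes are easy to obtain from depth maps, so the function names and the `near` and `far` parameters are hypothetical.

```python
def depth_to_binary(depth, near, far):
    """Return a 0/1 mask marking pixels whose depth lies in [near, far]."""
    return [[1 if near <= d <= far else 0 for d in row] for row in depth]

def boundary_pixels(mask):
    """Return (row, col) of foreground pixels that touch background or the border."""
    h, w = len(mask), len(mask[0])
    out = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                outside = not (0 <= ny < h and 0 <= nx < w)
                if outside or not mask[ny][nx]:
                    out.append((y, x))
                    break
    return out
```

The boundary pixels returned here are unordered. Tracing them into an ordered contour sequence, which the $x$-axis of the color image requires, would be a separate step.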
Keywords: shape representation; scale space; invariance; shape recognition; object recognition; object retrieval