Visualized all-scale shape representation and recognition

Ruipeng Min; Yifan Li; Yao Huang; Jianyu Yang; Baojiang Zhong

doi:10.11834/jig.200693

Image and Video Analysis


2022, Vol. 27, No. 2, pp. 628-641
Print publication date: 2022-02-16
Accepted: 2021-03-10

Ruipeng Min, Yifan Li, Yao Huang, Jianyu Yang, Baojiang Zhong. Visualized all-scale shape representation and recognition[J]. Journal of Image and Graphics, 2022, 27(2): 628-641. DOI: 10.11834/jig.200693.

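Summary

Objective

Representing and recognizing the shape features of visual objects is an important problem in image analysis. In practical applications, disturbances such as viewpoint change, deformation, occlusion, and noise lower recognition accuracy, and big data scenarios require algorithms that learn efficiently. To address these problems, this paper proposes an all-scale visualized shape representation method.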

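Method

Invariant shape features are extracted from the shape contour at every scale of the scale space, which yields the all-scale features of the shape. All of these features are then compactly represented as a single color image, giving a visualized representation of the shape features. The color image is fed into a two-stream convolutional network model to perform shape classification and retrieval.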
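The idea of an invariant feature computed at every scale can be illustrated with a small sketch. The feature used here is a stand-in chosen for brevity and is not one of the paper's features: at scale `s`, contour point `i` is described by the chord between its neighbours `i-s` and `i+s`, divided by the largest pairwise distance of the shape. The function names, the scale range, and the toy L-shaped contour are all assumptions made for illustration.

```python
import math

def all_scale_chord_feature(contour):
    """Return a (num_scales x num_points) matrix of normalized chord lengths.

    Stand-in feature, not the paper's: at scale s, point i is described by the
    distance between points i-s and i+s on the closed contour, divided by the
    largest pairwise distance of the shape.
    """
    n = len(contour)
    # Largest pairwise distance: dividing by it normalizes away the shape size.
    diameter = max(math.dist(p, q) for p in contour for q in contour)
    # Scales 1 .. (n-1)//2 keep the two neighbours i-s and i+s distinct.
    num_scales = (n - 1) // 2
    rows = []
    for s in range(1, num_scales + 1):
        row = [math.dist(contour[(i - s) % n], contour[(i + s) % n]) / diameter
               for i in range(n)]
        rows.append(row)
    return rows

def transform(contour, angle, scale, shift):
    """Rotate by `angle` radians, scale uniformly, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return [(scale * (c * x - s * y) + shift[0],
             scale * (s * x + c * y) + shift[1]) for x, y in contour]

# Toy closed contour: the 6 corners of an L shape (made-up data).
shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 3), (0, 3)]
original = all_scale_chord_feature(shape)
moved = all_scale_chord_feature(transform(shape, 0.7, 2.5, (10, -3)))

# Largest element-wise difference between the two feature matrices.
max_diff = max(abs(a - b)
               for row_a, row_b in zip(original, moved)
               for a, b in zip(row_a, row_b))
```

Because the feature depends only on ratios of distances between contour points, `max_diff` stays at the level of floating-point rounding after rotation, uniform scaling, and translation. Each row of the matrix corresponds to one scale and each column to one contour point.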

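Results

Qualitative experiments that add disturbances such as rotation, occlusion, and noise to the original shapes verify that the proposed method is invariant to rotation and scaling and robust to articulated transformation, occlusion, and noise. In quantitative shape classification and retrieval experiments on benchmark datasets, the method is more accurate than the compared algorithms on every dataset. On the MPEG-7 dataset it reaches 99.57% accuracy, whereas the best compared algorithm reaches 98.84%. On the articulated and projective transformation datasets it reaches 100% recognition accuracy in both cases, whereas the best results of the compared algorithms are 89.75% and 95%, respectively.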

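Conclusion

The proposed all-scale visualized shape representation compactly expresses all of the shape information in a single color image. The convolutional model learns both the relations among shape features along the contour points and the relations among shape features across scales. The method maintains high recognition accuracy under viewpoint change, partial occlusion, articulated deformation, and noise. It can be applied to object recognition where image acquisition is heavily disturbed and to infrared or depth images, and it is suited to recognition tasks in big data scenarios.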

Abstract

Objective

The feature representation of shape contours plays an important role in shape recognition and retrieval, which are important problems in pattern recognition and image processing. With the growing number of big data application scenarios, deep learning methods are widely used to process large numbers of images because they learn effectively. To use deep learning methods, for example, the popular convolutional neural network for image classification, an image representation of shape features is necessary. Thus, representing the shape features of an object contour as an image, rather than as a series of feature values, is desirable. Moreover, dealing with various disturbance factors, including viewpoint variation, scaling, partial occlusion, articulation, projective transformation, and noise, is unavoidable because different kinds of cameras and sensors are widely used for image and video capture. These disturbances decrease the quality of the images and videos and, consequently, the accuracy of the subsequent object recognition and retrieval tasks. To solve these problems, a visualized all-scale shape representation and recognition method is proposed in this work. In our method, the representation of shape features can be learned by widely used deep learning models, which makes it effective for recognition and retrieval tasks in big data application scenarios. The proposed method is also robust to various disturbances and noise.

Method

First, three kinds of invariant shape features, namely, the area feature, the arc length feature, and the central distance feature, are extracted from the shape contour. The three features are invariants that describe different aspects of the shape in different dimensions, and they are normalized to the size of the shape in the image. Because these three features can be extracted at different scales with respect to the shape, the features at all scales in the scale space are extracted to obtain sufficient shape information and fully represent the shape. After that, all the features in the scale space are compactly represented by a color image. In this image representation, the R, G, and B channels are used to represent the three kinds of invariant shape features. The value of a feature is represented as a color value. In each channel, the $$x$$ axis of the image corresponds to the sequence of contour points, whereas the $$y$$ axis corresponds to the scales. Because the shape is represented by the color image, a convolutional neural network is designed to learn the shape features from it. To learn as much shape information as possible, both the original shape image and the color image representation are used as input to the convolutional model. Thus, the model is designed with two convolutional streams, one for the original image and one for the color image. Therefore, the deep learning method can effectively learn the shape features to perform shape classification and retrieval tasks.
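The packing of three feature maps into one color image can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `pack_rgb`, the assignment of the area, arc length, and central distance features to the R, G, and B channels in that order, the 8-bit quantization, and the toy numbers are all hypothetical. Each feature map is assumed to be already normalized to [0, 1], with one row per scale and one column per contour point.

```python
def pack_rgb(area, arc_length, central_distance):
    """Pack three (num_scales x num_points) feature maps into one RGB image.

    Row index = scale (y axis), column index = contour point (x axis).
    The channel order R=area, G=arc length, B=central distance is an assumption.
    """
    def to_byte(v):
        # Clamp to [0, 1] and quantize to an 8-bit color value.
        return int(round(255 * min(1.0, max(0.0, v))))

    rows = len(area)
    cols = len(area[0])
    for fmap in (area, arc_length, central_distance):
        if len(fmap) != rows or any(len(r) != cols for r in fmap):
            raise ValueError("all feature maps must share one shape")

    return [[(to_byte(area[y][x]),
              to_byte(arc_length[y][x]),
              to_byte(central_distance[y][x]))
             for x in range(cols)]
            for y in range(rows)]

# Toy feature maps: 2 scales x 4 contour points (made-up numbers).
area = [[0.0, 0.25, 0.5, 1.0], [0.1, 0.2, 0.3, 0.4]]
arc = [[1.0, 1.0, 0.0, 0.0], [0.5, 0.5, 0.5, 0.5]]
dist = [[0.2, 0.4, 0.6, 0.8], [0.0, 0.0, 1.0, 1.0]]
image = pack_rgb(area, arc, dist)
```

With this layout, choosing a different starting point on the contour shifts the columns of the image circularly without changing the pixel values, and each row can be read as the description of the whole contour at one scale.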

Result

In the extensive experimental evaluations, qualitative and quantitative experiments are conducted. The qualitative experiments test the robustness of the proposed method to various disturbances, including rotation, scale variation, partial occlusion, articulated deformation, and noise. In these experiments, each kind of disturbance is added to the shape image, and the resulting color image representation is compared with that of the original shape image. The results validate that the proposed method is invariant to rotation and scaling, and robust to articulated deformation, partial occlusion, and noise. Furthermore, quantitative experiments on shape recognition and retrieval are conducted on benchmark datasets. The recognition and retrieval accuracy of the proposed method is tested on general datasets, including the MPEG-7 dataset and the Animal dataset, and its performance under disturbances is evaluated on the articulated shape dataset and the projective shape dataset. The accuracy of our method is compared with that of other state-of-the-art methods. Our method outperforms all compared methods in shape recognition and retrieval accuracy on all the datasets, which verifies that the proposed shape representation is effective for shape recognition and retrieval. The accuracy of our method is 99.57% on the MPEG-7 dataset, that is, our method correctly classifies nearly all the shapes. Moreover, in the experiments on the articulated and projective datasets, our method achieves 100% recognition accuracy, which greatly outperforms state-of-the-art methods. These evaluations verify that the proposed method maintains high accuracy in shape recognition and retrieval tasks under different kinds of disturbances.

Conclusion

In this paper, a visualized all-scale shape representation method is proposed for shape recognition and retrieval. Different kinds of invariant shape features are extracted at all the scales in the scale space, so that as much shape information as possible is captured. The color image representation compactly represents the extracted shape features, and the shape features can be visualized in this color image. Furthermore, with this color image representation, the effectiveness of deep learning methods can be exploited for feature learning and shape classification. The proposed two-stream convolutional neural network can fully learn the shape features from the color image representation and the original binary shape image. Through deep learning from the color image representation, not only is the shape context along the shape contour learned along the $$x$$ axis of the color image, but the relations of shape features among different scales are also learned along the $$y$$ axis. The proposed method is robust to various disturbances and noise, and can maintain high recognition accuracy regardless of the influences of viewpoint variation, nonlinear deformation, partial occlusion, and articulated deformation. Therefore, it can be used in complex environments. It can also be used for object recognition and retrieval from infrared and depth images because the shape images are binary images, which can be easily obtained from depth maps. The classification engine is based on a deep learning model, which is also suitable for recognition tasks in big data applications.

Keywords

shape representation; scale space; invariance; shape recognition; object recognition; object retrieval

