Comparative study of classic region-based shape descriptors
2018, Vol. 23, No. 8, pp. 1242-1253
Received: 2018-01-08
Revised: 2018-03-08
Published in print: 2018-08-16
DOI: 10.11834/jig.180016

Objective
Shape representation and matching are important problems in computer vision and pattern recognition. Among region-based shape representation methods, a number of classic approaches have emerged, including Hu moment invariants (Hu method), the angular radial transform (ART method), the generic Fourier descriptor (GFD method), the histogram of the Radon transform (HRT method), and multi-scale integral invariants (MSII method). Because these methods were proposed over a long time span and previous comparative studies examined only a single dimension, a comprehensive comparative analysis of their overall performance is needed to guide further theoretical research and practical applications.
Method
Three benchmark shape databases are used: a simple geometric shape database, the MPEG-7 shape database, and a vehicle trademark shape database. The classic region-based shape representation methods are compared along three dimensions, namely retrieval score, retrieval stability, and computational complexity, and a weighted comprehensive evaluation model is used to assess the overall performance of each method.
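The exact form of the weighted evaluation model is given in the full paper rather than in this abstract; the Python sketch below, with weights and normalization chosen purely for illustration, only shows how the three dimensions could be combined into a single comprehensive score.

```python
# Illustrative sketch of a weighted comprehensive evaluation of shape descriptors.
# The weights and normalization below are assumptions; the paper's actual formula may differ.

def comprehensive_score(retrieval_score, score_std, complexity_cost,
                        w_score=0.5, w_stability=0.3, w_complexity=0.2):
    """Combine retrieval score, stability, and computational cost into one number.

    retrieval_score : mean retrieval score in [0, 1] (higher is better)
    score_std       : standard deviation of the retrieval scores (lower is better)
    complexity_cost : normalized computational cost in [0, 1] (lower is better)
    """
    stability = 1.0 - min(score_std, 1.0)        # turn deviation into a "higher is better" term
    efficiency = 1.0 - min(complexity_cost, 1.0)  # same for computational cost
    return (w_score * retrieval_score
            + w_stability * stability
            + w_complexity * efficiency)

# Example: a descriptor with high accuracy, low variance, and moderate cost.
print(comprehensive_score(0.92, 0.05, 0.30))
```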
Result
In terms of overall performance, the GFD method achieves the best results, followed by the ART method. Because the HRT method has a high time complexity in the matching stage, its performance degrades when matching against large-scale shape databases. The experimental results of the Hu moment invariants and the MSII method are both unsatisfactory. The comparative study also shows that orthogonally projecting a shape onto orthogonal basis functions is an effective way to extract its visual features, and we further conjecture that projecting an image onto orthogonal basis functions is likewise an effective way to extract image visual features. Therefore, finding ideal orthogonal basis functions is an important direction of future research on shape and even image feature extraction.
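As one concrete instance of such a projection, the angular radial transform (one of the compared methods) expands the shape image $f(\rho,\theta)$, defined over the unit disk in polar coordinates, on a set of orthogonal basis functions; in its standard form the coefficients are

$$F_{nm} = \int_{0}^{2\pi}\!\int_{0}^{1} V_{nm}^{*}(\rho,\theta)\, f(\rho,\theta)\, \rho \,\mathrm{d}\rho \,\mathrm{d}\theta, \qquad V_{nm}(\rho,\theta) = A_m(\theta)\,R_n(\rho),$$

$$A_m(\theta) = \frac{1}{2\pi} e^{jm\theta}, \qquad R_n(\rho) = \begin{cases} 1, & n = 0,\\ 2\cos(\pi n \rho), & n \neq 0, \end{cases}$$

and the magnitudes $|F_{nm}|$ of a small number of low-order coefficients serve as the rotation-invariant feature vector.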
Conclusion
Among the five compared methods, the GFD and ART methods achieve better overall performance than the HRT, Hu moment invariant, and MSII methods, and finding ideal orthogonal basis functions is an important direction for future research on shape representation.
Objective
Shape representation and shape matching are basic tasks in computer vision and pattern recognition. Among the region-based methods, several classic descriptors are already available, including Hu moment invariants (Hu method), the angular radial transform (ART method), the generic Fourier descriptor (GFD method), the histogram of the Radon transform (HRT method), and multi-scale integral invariants (MSII method). Given the long time span over which these methods were proposed and the fact that past comparative studies considered only one factor (i.e., retrieval accuracy), a comprehensive comparative study of these methods is needed to guide engineering applications and future research.
Method
To compare the different aspects of the five shape descriptors, we use three shape databases. The first database, which consists of simple geometric shapes that we modified ourselves, includes ten seed shapes. From each of these ten seed shapes we construct three more shapes through non-rigid deformation, with the amount of deformation increasing from the first to the third, so that 40 basic shapes (the seeds plus their deformed versions) are obtained in total. Finally, for each of these 40 basic shapes we generate another three similar shapes by random scaling, random rotation, and random translation, so the first database contains 160 shapes. For the retrieval test on this database we define a new scoring rule that counts not only the retrieval hits but also the order of the retrieved results; this rule therefore probes the intrinsic representation and retrieval ability of a shape descriptor. The second database used in our comparative experiments is MPEG-7, the standard benchmark for shape description and retrieval, which consists of 70 shape categories with 20 shapes per category that differ by various rigid and non-rigid deformations; the experiments are performed on these 1400 shape images. The test score for the MPEG-7 database is the bullseye score, which counts the number of shapes from the same category (20 shapes in this case) among the 40 best-matching shapes. The third database is a collection of vehicle trademark shapes. We collect 100 common vehicle logos, such as those of Benz, BMW, and Toyota. For each of these 100 logo shapes we construct three additional shapes by random scaling, random rotation, and random translation, and another three shapes using random perspective transforms, which yields 700 vehicle logo shapes in total. The test score for this database is also a bullseye score, which counts the number of shapes from the same category (seven shapes in this case) among the 14 best-matching shapes. In all retrieval experiments on the three databases we compute not only the test scores but also the retrieval stability, measured by the standard deviation of the retrieval scores, and we analyze and verify the computational complexity of the compared descriptors. After obtaining the test scores, retrieval stability, and computational complexity, a weighted formula that combines all three factors is defined to measure the comprehensive performance of each method.
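As an illustration of the bullseye scoring described above, the following Python sketch computes a bullseye retrieval rate from a precomputed distance matrix; the function name and interface are ours, not the paper's.

```python
import numpy as np

def bullseye_score(dist, labels, top_k):
    """Bullseye retrieval rate.

    dist   : (N, N) matrix of pairwise descriptor distances (smaller = more similar)
    labels : length-N array of category labels
    top_k  : number of best matches inspected per query
             (40 for MPEG-7 with 20 shapes per class, 14 for the trademark set with 7)
    """
    labels = np.asarray(labels)
    n = len(labels)
    hits, possible = 0, 0
    for q in range(n):
        ranked = np.argsort(dist[q])[:top_k]          # indices of the top_k nearest shapes
        hits += np.sum(labels[ranked] == labels[q])   # same-category shapes among them
        possible += np.sum(labels == labels[q])       # class size (query counts as a hit too)
    return hits / possible                            # 1.0 means perfect retrieval
```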
Result
In the retrieval experiments on the first, simple geometric shape database, the HRT method achieves the best test score and the lowest standard deviation, followed by the GFD method, which indicates that HRT is less sensitive to noise than the other methods. In the retrieval experiments on the second database, the ART and GFD methods obtain almost the same retrieval scores. In the retrieval experiments on the third database, the GFD, ART, and HRT methods achieve almost the same retrieval score. In all retrieval experiments, the five compared methods are invariant to scale, rotation, and translation, which are the fundamental requirements for a shape descriptor. We analyze and verify the computational complexity of the five methods and find that the Hu method has the lowest complexity in the feature-extraction stage, and that in the feature-matching stage all methods except HRT have low matching complexity. The comprehensive performance of the five methods is measured by a weighted formula; the GFD method performs best, followed by the ART method. The HRT method can degrade when the number of shapes is large because its matching phase has higher complexity than that of the other methods. The performance of the Hu and MSII methods is unsatisfactory in all our experiments. The visual features of a shape can be captured effectively by projecting the shape onto a set of orthogonal basis functions, and we conjecture that the visual features of an image can be captured effectively by the same projection approach.
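To make the projection idea concrete, the sketch below computes a simplified GFD-style feature in the spirit of Zhang and Lu's generic Fourier descriptor: the binary shape image is resampled on a polar grid centred at its centroid, and the normalized magnitudes of a 2D Fourier transform serve as the feature vector. The grid sizes, interpolation, and normalization are our assumptions and are not necessarily the settings used in the paper.

```python
import numpy as np

def gfd_like_descriptor(shape_img, n_radial=32, n_angular=64, keep_r=4, keep_t=9):
    """GFD-style feature: 2D FFT magnitudes of a polar-resampled binary shape image."""
    img = np.asarray(shape_img, dtype=float)
    ys, xs = np.nonzero(img)                       # foreground pixels
    cy, cx = ys.mean(), xs.mean()                  # centroid -> translation invariance
    max_r = np.sqrt(((ys - cy) ** 2 + (xs - cx) ** 2).max())

    # Resample the image on a polar grid (nearest-neighbour sampling for simplicity).
    polar = np.zeros((n_radial, n_angular))
    for i in range(n_radial):
        for j in range(n_angular):
            r = (i / n_radial) * max_r             # radius normalization -> scale invariance
            t = 2 * np.pi * j / n_angular
            y = int(round(cy + r * np.sin(t)))
            x = int(round(cx + r * np.cos(t)))
            if 0 <= y < img.shape[0] and 0 <= x < img.shape[1]:
                polar[i, j] = img[y, x]

    spectrum = np.abs(np.fft.fft2(polar))          # magnitudes -> rotation invariance
    feat = spectrum[:keep_r, :keep_t].flatten()    # keep low-frequency coefficients
    return feat / (feat[0] + 1e-12)                # normalize by the DC term

# Shapes can then be matched by, e.g., the Euclidean or city-block distance between features.
```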
Conclusion
Among all the evaluated region-based methods, the GFD and ART methods perform best, and we suggest that they can be employed in engineering applications. Finding new sets of orthogonal basis functions may be a fruitful research direction for shape, and even image, visual feature extraction, because projecting a shape onto orthogonal basis functions captures its intrinsic visual features.