Plant morphology similarity algorithm based on image features

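Ding Weilong1, Xin Weitao1, Xu Zhifu2, Wu Fuli1, Gao Nan1 (1. College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023; 2. Institute of Agricultural Equipment, Zhejiang Academy of Agricultural Sciences, Hangzhou 310021)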

Abstract
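Objective The similarity between the morphologies of different plants is an important basis for effectively distinguishing plant species or families and genera. Most existing methods for calculating plant morphology similarity consider only geometric similarity, such as plant topology or outer contour, and do not account for factors such as leaf color, the density of canopy leaves, or the looseness of the plant type. Therefore, this paper proposes a method for calculating plant morphology similarity based on the shape and color features of plant images. Method First, the contour and regional features of the image are obtained. The contour features are expressed by the looseness of the plant's branches, which comprises the height-to-width ratio of the plant, the contour quadrilateral, and the height of the first primary lateral branch. The regional features are expressed by leaf density, calculated as the proportion of the bounding rectangle's area that the leaves occupy. Second, the color features of the image are obtained: color histograms based on the HSV and YUV color spaces are used to summarize the color distribution of the image. Finally, information entropy is used to analyze the dispersion of the data, the weight of each component is determined accordingly, and the components are weighted to obtain the overall similarity. Result Experiments on a manually collected data set yield weights of 0.62, 0.17, and 0.21 for looseness, density, and color, respectively. The similarity values calculated on this basis agree with reality and effectively measure the similarity between plants. The proposed algorithm is also applied to image retrieval and compared with five common methods. The precision of the algorithm is above 0.747 7 in all cases. At the same precision level, its recall is also comparatively high relative to the other methods. In particular, when the similarity threshold is greater than 0.8, the precision reaches 0.910 8 or more. In addition, the method is insensitive to scaling of the plant image: the similarity between plants of the same kind remains close to 1. Conclusion The proposed plant morphology similarity algorithm combines shape and color features, and its results agree with human visual perception. Compared with other methods, it distinguishes plant species or families and genera more effectively. The algorithm is mainly suited to images of a single plant against a plain background and can serve as a technical reference for studying the similarity of plant morphology.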
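The color feature can be illustrated with a small sketch. The English abstract states that the color space is divided into seven hue levels, each subdivided into five gray levels, giving a 2D matrix of color proportions. Everything else below is an assumption made for illustration: the function names, the uniform bin edges, the use of the HSV value channel as the gray level, and histogram intersection as the way to compare two matrices. It is not the authors' implementation, and the YUV histogram is omitted.

```python
import colorsys


def color_matrix(pixels, hue_bins=7, gray_bins=5):
    """Return a hue_bins x gray_bins matrix of pixel proportions.

    pixels: iterable of (r, g, b) tuples with components in [0, 1],
    ideally only the plant's foreground pixels.
    Assumption: uniform bins, HSV value used as the gray level.
    """
    counts = [[0] * gray_bins for _ in range(hue_bins)]
    total = 0
    for r, g, b in pixels:
        h, _s, v = colorsys.rgb_to_hsv(r, g, b)
        # h and v lie in [0, 1]; clamp so that 1.0 falls in the last bin.
        hi = min(int(h * hue_bins), hue_bins - 1)
        vi = min(int(v * gray_bins), gray_bins - 1)
        counts[hi][vi] += 1
        total += 1
    if total == 0:
        raise ValueError("no pixels given")
    return [[c / total for c in row] for row in counts]


def histogram_intersection(a, b):
    """Similarity in [0, 1] between two proportion matrices (assumed metric)."""
    return sum(min(x, y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
```

Because each matrix holds proportions rather than raw counts, the comparison does not depend on the number of pixels, which is consistent with the insensitivity to image scaling reported in the abstract.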
Keywords
Plant morphology similarity algorithm based on image features

Ding Weilong1, Xin Weitao1, Xu Zhifu2, Wu Fuli1, Gao Nan1 (1. College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China; 2. Institute of Digital Agriculture, Zhejiang Academy of Agriculture Sciences, Hangzhou 310021, China)

Abstract
Objective Plants are among the most common elements of natural landscapes and an important part of natural scene modeling. Different plants have different morphological characteristics. Consequently, measuring the similarity among plants is a key problem in plant classification, variety identification, and the storage and retrieval of 3D plant models. This work studies the similarity of plant morphology in order to distinguish plant species effectively. Current methods consider only the similarity of geometrical shapes, such as plant topology or edge contour. Topology-based methods compare the structure of plants and the distribution of their organs, while contour-based methods compare the edge contours of plants. However, these methods do not consider intuitive factors such as the color of the leaves, the density of the canopy, and the looseness of the plant type. This limitation reduces accuracy because geometry and color are the main bases for distinguishing plant species. Therefore, this work proposes a method to calculate plant morphology similarity based on image features, specifically the shape and color features of plant images. Method First, the shape features of an image, which include contour and regional features, are obtained. The contour features are expressed by the looseness of the plant shoots, which comprises the aspect ratio of the plant, the boundary quadrilateral, and the height of the lowest primary branch. The aspect ratio is the ratio of the overall height of a plant to its width. The boundary quadrilateral describes the border defined by the outermost points of a plant and constrains the plant's morphological distribution. The height of the lowest primary branch describes where the branching of a plant starts and indicates the position of the canopy. Together, these factors describe the basic peripheral contour of a plant.
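The three contour quantities named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the boundary quadrilateral is assumed to have the topmost, rightmost, bottommost, and leftmost boundary points as its vertices, and the vertical position of the lowest primary branch is assumed to be supplied by some other step.

```python
import math


def aspect_ratio(points):
    """Height-to-width ratio of the plant from its boundary points (x, y)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    width = max(xs) - min(xs)
    if width == 0:
        raise ValueError("boundary has zero width")
    return (max(ys) - min(ys)) / width


def extreme_quadrilateral(points):
    """Assumed construction: top, right, bottom, left extreme points.

    Image coordinates are assumed, so y grows downward.
    """
    top = min(points, key=lambda p: p[1])
    right = max(points, key=lambda p: p[0])
    bottom = max(points, key=lambda p: p[1])
    left = min(points, key=lambda p: p[0])
    return [top, right, bottom, left]


def interior_angles(quad):
    """Interior angles in degrees of a convex quadrilateral.

    Fails with ZeroDivisionError if two consecutive vertices coincide.
    """
    angles = []
    n = len(quad)
    for i in range(n):
        px, py = quad[i - 1]
        cx, cy = quad[i]
        nx, ny = quad[(i + 1) % n]
        ax, ay = px - cx, py - cy
        bx, by = nx - cx, ny - cy
        cos_t = (ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))
        angles.append(math.degrees(math.acos(max(-1.0, min(1.0, cos_t)))))
    return angles


def relative_branch_height(points, branch_y):
    """Height of the lowest primary branch as a fraction of plant height.

    branch_y is the image row of the branch point (hypothetical input).
    """
    ys = [y for _, y in points]
    top, bottom = min(ys), max(ys)
    return (bottom - branch_y) / (bottom - top)
```

All three quantities are ratios or angles, so they stay the same when the image is uniformly scaled.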
The method calculates the height-to-width ratio of the plant and the four interior angles of the boundary quadrilateral from the boundary points of the plant outline. The lowest primary branch of the plant is then located, and the ratio of its height to the height of the entire plant is calculated. The regional features are reflected in the density of the leaves, which is calculated as the proportion of the bounding rectangle's area that the leaves occupy. Second, the color features of the images are obtained using color histograms based on the HSV and YUV color spaces. The color spaces are divided into seven levels according to hue, and each level is subdivided into five levels according to gray level. The proportion of the image that falls in each color section is calculated to construct a 2D color matrix. Finally, an appropriate weight-setting strategy is needed because the features contribute differently to the overall similarity and empirical knowledge about their contributions is lacking. A high degree of dispersion in the data indicates large differences among species. Specifically, the more clearly the values of an attribute discriminate among samples, the greater the influence of that factor on the comprehensive evaluation result and the larger its weight; otherwise, its weight is small. The degree of dispersion of the data reflects the uncertainty in the data, and in information theory entropy is used to measure uncertainty. Therefore, information entropy is used to determine the weight of each feature, and the weighted features are combined to obtain the overall similarity. Result An experiment is conducted on a manually collected data set. The resulting weights of looseness, density, and color are 0.62, 0.17, and 0.21, respectively. For common plant species, the similarity values calculated with these weights are in line with reality, and the method can effectively distinguish plant species and measure the similarity among plants.
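The weighting step can be illustrated with the standard entropy weight method, in which a feature whose values are more dispersed across samples has lower entropy and therefore receives a larger weight. The abstract does not give the authors' exact formula, so the normalization below, the function names, and the weighted sum used to combine the per-feature similarities are assumptions.

```python
import math


def entropy_weights(rows):
    """Entropy weight method (assumed formulation).

    rows: list of samples, each a list of non-negative feature values.
    Returns one weight per feature; the weights sum to 1.
    """
    m = len(rows)
    if m < 2:
        raise ValueError("need at least two samples")
    n = len(rows[0])
    divergences = []
    for j in range(n):
        column = [row[j] for row in rows]
        total = sum(column)
        if min(column) < 0 or total <= 0:
            raise ValueError("feature values must be non-negative, not all zero")
        entropy = 0.0
        for x in column:
            p = x / total
            if p > 0:  # 0 * log(0) is treated as 0
                entropy -= p * math.log(p)
        entropy /= math.log(m)  # normalize to [0, 1]
        # Clamp to guard against tiny negative values from rounding.
        divergences.append(max(0.0, 1.0 - entropy))
    spread = sum(divergences)
    if spread == 0:
        raise ValueError("all features are uniform; weights are undefined")
    return [d / spread for d in divergences]


def overall_similarity(similarities, weights):
    """Weighted sum of per-feature similarities (assumed combination rule)."""
    return sum(s * w for s, w in zip(similarities, weights))
```

With the weights reported in the abstract, `overall_similarity([1.0, 0.5, 0.0], [0.62, 0.17, 0.21])` evaluates to about 0.705: a perfect match in looseness dominates because looseness carries the largest weight.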
The proposed algorithm is also applied to image retrieval and compared with five common methods. The precision of the algorithm is above 0.747 7 in all cases. At the same precision level, its recall is higher than that of the five methods. In particular, when the similarity threshold is larger than 0.8, the precision reaches 0.910 8 or more, and the recall remains higher than that of the other methods at the same precision level. The proposed algorithm is insensitive to scaling of the plant image: the similarity between plants of the same kind remains close to 1. Conclusion This work proposes a plant morphology similarity algorithm that combines shape and color features. Its results are in line with human visual perception. Compared with other methods, the proposed algorithm distinguishes plant species more effectively. The algorithm is mainly applicable to images of a single plant against a plain background and can provide a new idea for studying the similarity of plant morphology.
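The retrieval metrics quoted above can be computed as in the sketch below. It assumes that an image counts as retrieved when its similarity to the query is at least the threshold and that the relevant images are those of the same species as the query. The function name, the data layout, and the sample values in the usage note are hypothetical and are not the paper's data.

```python
def precision_recall(similarities, relevant, threshold):
    """Precision and recall of threshold-based retrieval.

    similarities: dict mapping image id to similarity with the query.
    relevant: set of image ids that truly match the query.
    An empty result set is scored as precision 0.0 by convention.
    """
    retrieved = {k for k, s in similarities.items() if s >= threshold}
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

For example, with similarities `{"a": 0.95, "b": 0.85, "c": 0.82, "d": 0.4}` and relevant set `{"a", "b", "d"}`, raising the threshold from 0.8 to 0.9 raises precision from 2/3 to 1 and lowers recall from 2/3 to 1/3, which is the trade-off behind comparing recall at a fixed precision level.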
Keywords
