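Wang Liran, Tang Yiping, Chen Peng, He Xia, Yuan Gongping (School of Information Engineering, Zhejiang University of Technology, Hangzhou 310023; School of Information Engineering, Zhejiang University of Technology, Hangzhou 310023; Zhejiang Enjoyor Research Institute Co., Ltd., Hangzhou 310000)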
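Objective Because the tongue is similar in color to the surrounding tissue and its contour is blurred, traditional segmentation methods struggle to segment it precisely; a tongue segmentation method based on a two-stage convolutional neural network is therefore proposed. Method First, in the coarse segmentation stage, convolutional and fully connected layers are combined to build the network Rsnet; a region proposal strategy yields tongue candidate boxes, from which the tongue is further identified, thereby locating the tongue and removing a large amount of interfering information. Then, in the fine segmentation stage, convolutional and deconvolutional layers are combined to build the network Fsnet, which classifies every pixel of the coarsely segmented tongue image to achieve fine segmentation. Finally, morphological algorithms are applied to post-process the finely segmented tongue image, further eliminating noise and rough edge points. Result A dataset of 2 764 tongue images was constructed, and five-fold cross-validation was performed on it. The experimental results show that the proposed algorithm achieves satisfactory segmentation results with fast processing. Precision, recall, and F-measure were chosen as the evaluation criteria. Compared with three commonly used traditional segmentation methods, the proposed method improves the F-measure by 0.58, 0.34, and 0.12, respectively, and improves efficiency by at least 6 times; compared with MNC (multi-task network cascades), an algorithm likewise based on deep learning, it improves the F-measure by 0.17 and efficiency by 1.9 times. Conclusion Applying deep-learning-based methods to tongue segmentation helps achieve accurate, robust, and fast segmentation of tongue images. Locating the tongue before segmentation helps further reduce misclassified and missed pixels. The experimental results show that the proposed algorithm effectively improves the accuracy of tongue segmentation and can lay a solid foundation for subsequent automatic recognition and analysis of tongue images.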
Two-phase convolutional neural network design for tongue segmentation
Wang Liran, Tang Yiping, Chen Peng, He Xia, Yuan Gongping (School of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China; School of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China; Zhejiang Enjoyor Research Institute Co., Ltd., Hangzhou 310000, China)
Objective The tongue is difficult to segment accurately because its contour is blurred and its color is similar to that of the surrounding tissue. Current tongue segmentation methods, whether based on texture analysis, edge detection, or threshold segmentation, mostly rely on the color features of the tongue image, i.e., they are pixel-based segmentation methods. Although color features are easy to extract, they express the position of the target poorly. Moreover, the colors of the tongue and the background are similar, so their color features may overlap. Color features alone are therefore insufficient to describe the tongue. To segment the tongue accurately, the deep semantic information of the image should be extracted so that more complete features are obtained. A tongue segmentation method based on a two-stage convolutional neural network is proposed in this paper. The networks are combined in a cascade, with the output of one stage taken as the input of the next. Method First, in the rough segmentation stage, the rough segmentation network (Rsnet) is built from convolutional and fully connected layers. This stage addresses the excessive interference information in the original tongue image. A region proposal strategy is adopted to obtain tongue candidate boxes, and the region of interest is extracted from the similar-looking background, i.e., the tongue is located and a large amount of interference information is removed. The influence of the tissue around the tongue on the segmentation is thereby weakened. Second, in the fine segmentation stage, the fine segmentation network (Fsnet) is built from convolutional and deconvolutional layers. The regions of interest obtained in the previous stage are taken as the input to Fsnet. The Softmax classifier is trained automatically, without manual intervention. With the trained Softmax classifier, each pixel of the image is classified to achieve fine segmentation and obtain a more accurate tongue image. Finally, the finely segmented tongue image is post-processed with morphological algorithms, which further eliminate noise and edge roughness and thus further optimize the segmentation result. In addition, training a deep convolutional neural network requires many samples, but medical images are difficult to collect and label, so large-scale tongue image datasets are difficult to obtain. When a small-scale dataset is used for direct training, the network does not converge easily and overfits readily, so the desired results are difficult to achieve. To avoid overfitting, three aspects are considered during training, namely, the training strategy, the network structure, and the dataset. Result In this study, a database of 2 764 tongue images is constructed, and five-fold cross-validation is performed on it. Experimental results show that the proposed algorithm achieves satisfactory segmentation results with fast processing. Precision, recall, and F-measure are selected as the evaluation criteria. Compared with three common traditional segmentation methods, the proposed method increases the F-measure by 0.58, 0.34, and 0.12, respectively, and the efficiency by at least 6 times. Moreover, compared with the MNC (multi-task network cascades) algorithm, which is also based on deep learning, the proposed method increases the F-measure by 0.17 and the efficiency by 1.9 times. Conclusion Applying a deep-learning-based method to tongue segmentation helps achieve accurate, robust, and rapid segmentation. Locating the tongue before segmentation helps further reduce misclassified and missed pixels. The models are combined in a cascade, which is flexible and allows the model of the tongue localization stage to be easily combined with other methods to assist in segmentation. Experimental results show that the proposed algorithm effectively improves the accuracy of tongue segmentation and establishes a solid foundation for subsequent automatic tongue image recognition and analysis.
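The evaluation criteria reported above (precision, recall, and F-measure) can be illustrated with a minimal sketch for pixel-wise binary masks. The abstract does not state the exact F-measure definition, so this assumes the balanced form F = 2PR / (P + R); the function name `precision_recall_f` and the toy masks are hypothetical and are not taken from the paper.

```python
def precision_recall_f(pred, truth):
    """Pixel-wise precision, recall, and balanced F-measure for binary masks.

    pred and truth are equally sized 2D lists of 0/1, where 1 marks a
    tongue pixel. The balanced F-measure is an assumption; the abstract
    does not say which weighting the authors used.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # tongue pixel correctly labeled
            elif p and not t:
                fp += 1      # background labeled as tongue (misclassified)
            elif t and not p:
                fn += 1      # tongue pixel missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Toy example: 3 true positives, 1 false positive, 1 false negative.
truth = [[0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
pred = [[0, 1, 1, 1],
        [0, 1, 0, 0],
        [0, 0, 0, 0]]

p, r, f = precision_recall_f(pred, truth)
print(p, r, f)
```

In this reading, a lower false-positive count raises precision and a lower false-negative count raises recall, which is how locating the tongue before segmentation would show up in the reported F-measure gains.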