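Objective Because the tongue body is similar in color to the surrounding tissue and its contour is blurred, traditional segmentation methods struggle to segment it accurately. This paper proposes a tongue segmentation method based on a two-stage convolutional neural network. Method First, in the coarse segmentation stage, convolutional layers and fully connected layers are combined to build the network Rsnet; a region proposal strategy yields tongue candidate boxes, from which the tongue is then determined, thereby locating the tongue and removing a large amount of interfering information. Then, in the fine segmentation stage, convolutional layers and deconvolutional layers are combined to build the network Fsnet, which classifies every pixel of the coarsely segmented tongue image to achieve fine segmentation. Finally, morphological algorithms are applied to post-process the finely segmented tongue image, further removing noise and rough edge points. Result A dataset of 2,764 tongue images was constructed, and five-fold cross-validation was performed on it. The experimental results show that the proposed algorithm achieves satisfactory segmentation results with fast processing speed. Precision, recall, and F-measure were chosen as evaluation criteria. Compared with three commonly used traditional segmentation methods, the F-measure improved by 0.58, 0.34, and 0.12, respectively, and efficiency improved by at least 6 times; compared with the MNC algorithm, which is likewise based on deep learning, the F-measure improved by 0.17 and efficiency improved by 1.9 times. Conclusion Applying deep-learning-based methods to tongue segmentation helps achieve accurate, robust, and fast segmentation of tongue images. Locating the tongue before segmentation helps further reduce misclassified and missed pixels. Experiments demonstrate that the proposed algorithm effectively improves tongue segmentation accuracy and lays a solid foundation for subsequent automatic tongue image recognition and analysis.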
Two-phase Convolutional Neural Network Design for Tongue Segmentation
Wang Liran, Tang Yiping, Chen Peng, He Xia, Yuan Gongping (School of Information Engineering, Zhejiang University of Technology; Zhejiang Enjoyor Research Institute Co., Ltd; School of Information Engineering, Zhejiang University of Technology, Hangzhou)
Objective Accurate tongue segmentation is difficult because the tongue's contours are blurred and its color resembles that of the surrounding tissue. Current tongue segmentation methods, whether based on texture analysis, edge detection, or threshold segmentation, mostly rely on extracting color features from the tongue image; that is, they are pixel-based segmentation methods. Although color features are easy to extract, they convey little about the position of the target. Because the tongue and the background are similar in color, their color features may overlap, so color features alone describe the tongue poorly. It is therefore necessary to extract deeper semantic information from the image and obtain more complete features in order to segment the tongue body accurately. This paper proposes a tongue segmentation method based on a two-stage convolutional neural network. The networks are combined in a cascade, with the output of each stage taken as the input of the next. Method First, in the rough segmentation stage, the rough segmentation network (Rsnet) consists of convolutional layers and fully connected layers. To deal with the large amount of interfering information in the original tongue image, a region proposal strategy is adopted to obtain tongue candidate boxes, and the regions of interest are extracted from the similar-looking background. In other words, the tongue is located and a large amount of interfering information is removed, which weakens the influence of the surrounding tissue on the segmentation. Second, in the fine segmentation stage, the fine segmentation network (Fsnet) consists of convolutional layers and deconvolutional layers. The regions of interest obtained in the previous stage are taken as the input of Fsnet.
The Softmax classifier is trained automatically, without manual intervention. The trained Softmax classifier then classifies each pixel of the image to achieve fine segmentation, yielding a more accurate tongue image. Finally, the finely segmented tongue image is post-processed: morphological algorithms are applied to further eliminate noise and rough edges, which further optimizes the segmentation result. In addition, training a deep convolutional neural network depends on a large number of samples. Because medical images are difficult to collect and label, large-scale tongue image datasets are hard to obtain. When a network is trained directly on a small dataset, it does not converge easily and is prone to overfitting, so the desired results are difficult to achieve. To avoid overfitting, this paper considers three aspects of the training process: training strategy, network structure, and dataset. Result A database of 2,764 tongue images is constructed, and five-fold cross-validation is performed on it. Experimental results show that the proposed algorithm achieves satisfactory segmentation results with fast processing speed. Precision, recall, and F-measure are selected as the evaluation criteria. Compared with three common traditional segmentation methods, the proposed method increases the F-measure by 0.58, 0.34, and 0.12, respectively, and improves efficiency by at least 6 times. Compared with the MNC algorithm, which is also based on deep learning, the F-measure is increased by 0.17 and the efficiency by 1.9 times. Conclusion Applying a deep-learning-based method to tongue segmentation helps achieve accurate, robust, and rapid segmentation of the tongue.
Locating the tongue before segmenting it helps further reduce misclassified and missed pixels. Because the models are combined in a cascade, the design is flexible, and the model from the tongue localization stage can easily be combined with other methods to assist segmentation. Experimental results show that the proposed algorithm effectively improves the accuracy of tongue segmentation and lays a solid foundation for subsequent automatic tongue image recognition and analysis.
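
As an illustration of the morphological post-processing and the evaluation criteria mentioned in the abstract, the following is a minimal Python sketch and not the authors' implementation. It assumes binary masks stored as lists of 0/1 rows, a 3x3 square structuring element, and the balanced form of the F-measure, F = 2PR / (P + R); the abstract specifies none of these details, and all function names are hypothetical.

```python
# Illustrative sketch only; not the authors' implementation.
# Assumptions: masks are lists of rows of 0/1, the structuring element is a
# 3x3 square, pixels outside the image count as background, and the F-measure
# is the balanced form 2PR / (P + R).

def _window(mask, r, c):
    """Yield the 3x3 neighbourhood of (r, c); out-of-bounds pixels count as 0."""
    rows, cols = len(mask), len(mask[0])
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            yield mask[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0


def erode(mask):
    """Keep a pixel only if its whole 3x3 neighbourhood is foreground."""
    return [[int(all(_window(mask, r, c))) for c in range(len(mask[0]))]
            for r in range(len(mask))]


def dilate(mask):
    """Set a pixel if any pixel in its 3x3 neighbourhood is foreground."""
    return [[int(any(_window(mask, r, c))) for c in range(len(mask[0]))]
            for r in range(len(mask))]


def opening(mask):
    """Morphological opening (erosion, then dilation): removes small specks."""
    return dilate(erode(mask))


def precision_recall_f(pred, truth):
    """Pixel-wise precision, recall, and F-measure of a predicted mask."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # tongue pixel correctly labeled
            elif p:
                fp += 1          # background labeled as tongue
            elif t:
                fn += 1          # tongue pixel missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure
```

For instance, if a predicted mask equals the ground truth except for one isolated false-positive pixel, `opening` removes that pixel and the precision rises accordingly. Note that an opening with a square element also erodes thin structures, so the element size would need tuning on real tongue masks.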