| Traditional Chinese medicine (TCM) holds that the human body is a unity of qi, form and spirit. TCM doctors identify the disease, deduce the syndrome type, and treat on the basis of syndrome differentiation through the four diagnostic methods of "Observation, Auscultation and Olfaction, Inquiry, and Pulse Feeling". Observation is an important link among the four diagnostic methods, and tongue diagnosis is the core of Observation. The book Yi Jing records that all diseases can be identified from the tongue: the tip of the tongue governs the heart, the middle of the tongue governs the spleen and stomach, the edge of the tongue governs the liver and gallbladder, and the root of the tongue governs the kidney, which explains the close relationship between the tongue's appearance and human health and visceral pathological changes. However, tongue diagnosis depends heavily on the clinical experience and technical level of the doctor, so the diagnosis carries a degree of subjectivity, which can result in misdiagnosis. With the rapid development of the new generation of artificial intelligence technology, creating a new data-based, intelligent mode of TCM diagnosis is of far-reaching significance. In view of this, and based on clinical data from TCM tongue diagnosis, this thesis conducts in-depth research on several key technologies of intelligent TCM tongue diagnosis, including tongue image segmentation, tongue feature extraction and classification, multi-modal data fusion, and the diagnosis of non-alcoholic fatty liver disease. The main contents and innovations cover the following four aspects: 1. Lips, teeth and lighting conditions around the tongue interfere with the image and make tongue edge segmentation unstable. To make the segmented tongue edge more consistent with the physiological curve of the human tongue and to effectively protect the important features of the tongue edge, this thesis proposes an end-to-end tongue segmentation method (TongueIS-UNet) based
on the U-Net framework. In constructing the segmentation model, a backbone segmentation network and an edge-aware network are designed using the ideas of the dense block structure and the spatial attention mechanism, and the output features of the two networks are combined through cascading, convolution and Sigmoid classification. The experimental results show that the TongueIS-UNet model achieves values of 97.89% to 99.67% on image segmentation evaluation indices such as IoU and Dice. Compared with the benchmark model, the segmented tongue edges are smoother and more accurate. In addition, to address the shortage of tongue image samples, a multi-label generative adversarial method for tongue image augmentation is proposed, which augments tongue images under label guidance and alleviates the scarcity of partially labeled tongue image samples. The innovations of the TongueIS-UNet method are as follows: it incorporates a tongue edge perception module based on the spatial attention mechanism, which improves the smoothness of the segmented tongue edges; the segmented edges conform better to the physiological curve of the tongue, effectively protecting important tongue edge information. 2. To address the difficulty of learning the distribution pattern of tongue image features and of extracting those features, we propose a target-task-driven tongue image feature extraction method (TongueIFE-GAN) based on a generative adversarial network. Tongue image features are extracted through image reconstruction, and the reconstructed images are fed back into the discriminator as generated data for semi-supervised training, while a Class Activation Map (CAM) module is incorporated into the discriminator to further optimize the feature processing performance of the encoder and to visualize the model outputs. The TongueIFE-GAN model can continuously optimize its feature extraction capability under the tasks of tongue
segmentation and classification. The experimental results show that, compared with the baseline model, segmentation and classification based on the tongue features extracted by TongueIFE-GAN improve the IoU and Dice indices by 0.71% and 1.22%, respectively, and the classification accuracy by 2.1%. The innovations of the TongueIFE-GAN method are as follows: it uses the adversarial idea to let the model self-optimize and learn the distribution characteristics of tongue images. The high-quality features extracted by the model lay the data foundation for downstream work such as tongue segmentation, classification and visualization. Building a novel feature extraction model on adversarial ideas and incorporating a CAM module, which helps explain how the deep learning model operates, provides a new way of thinking for tongue image feature research. 3. To address the problems that deep neural networks are prone to overfitting, generalize poorly, and are poorly suited to mobile devices in the multi-class classification of tongue features, this study proposes a lightweight deep network multi-classification method (TongueIMC-Light) for tongue features that draws on depthwise separable convolution and residual networks. In constructing the model, a new activation function is designed to reduce the activation loss of information in the one-way propagation of the deep network; at the end of the model, a classification threshold optimizer is built using the Nelder-Mead algorithm and the quadratic weighted kappa function to meet the multi-level classification requirement of tongue features under the target task. The experimental results show that the TongueIMC-Light model is smaller than the baseline model, and its classification accuracy is improved by 2.43%, meeting the application standard for mobile devices. The innovations of the TongueIMC-Light method are as follows: it
combines the advantages of depthwise separable convolution and residual networks, so the resulting model is small and computationally cheap, which makes it suitable for lightweight classification tasks on mobile devices and helps extend the application of the model. The new activation function and classification threshold optimizer proposed in the method enhance the applicability of the model to fine-grained classification of tongue images, and also provide a reference for research on end-of-range classification of lightweight deep networks. 4. To address the difficulty of fusing features from multi-source, multi-modal data in TCM tongue diagnosis and the difficulty of characterizing the mapping between features and disease syndromes, we propose a model for non-alcoholic fatty liver disease (NAFLD) diagnosis (TongueNFLT-MultiMD) based on multi-modal tongue diagnosis data, in order to explore the patterns within and between features and to analyze in depth the mapping between tongue diagnosis data features and disease mechanism. To this end, this thesis introduces a stochastic attention mechanism and adopts a joint architecture model to deeply fuse multi-modal data features, and then develops a Transformer-based model for diagnosing diseases and classifying them. The experimental results show that TongueNFLT-MultiMD outperforms the baseline model by 4.7% and 5.8% for the diagnosis of NAFLD and the classification of its symptoms, respectively. The innovations of the TongueNFLT-MultiMD model are as follows: it makes full use of clinical tongue data, takes disease syndrome samples as the guide and the exploration of disease mechanism as the aim, implements deep fusion of multi-modal data at the feature level, and then constructs a deep mapping learning network that correlates the fused features with disease syndromes, which can better reveal the scientific
relationship between tongue images, other data features and TCM disease syndromes, and effectively improves the diagnostic accuracy of the model. |
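
The IoU and Dice indices used above to evaluate tongue segmentation can be illustrated with a short sketch. This is a minimal example on binary masks stored as nested lists, not the evaluation code of the thesis; the function names `iou` and `dice` and the toy masks are hypothetical and chosen only for illustration.

```python
def _flatten(mask):
    """Flatten a 2-D binary mask (list of rows) into a list of booleans."""
    return [bool(v) for row in mask for v in row]


def iou(pred, target):
    """Intersection over Union: |A ∩ B| / |A ∪ B| for two binary masks."""
    p, t = _flatten(pred), _flatten(target)
    inter = sum(a and b for a, b in zip(p, t))
    union = sum(a or b for a, b in zip(p, t))
    # Assumption: two empty masks are treated as a perfect match.
    return inter / union if union else 1.0


def dice(pred, target):
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    p, t = _flatten(pred), _flatten(target)
    inter = sum(a and b for a, b in zip(p, t))
    total = sum(p) + sum(t)
    # Assumption: two empty masks are treated as a perfect match.
    return 2 * inter / total if total else 1.0


# Toy example (hypothetical data): 1 marks tongue pixels.
target = [[0, 1, 1, 0],
          [0, 1, 1, 0]]
pred = [[0, 1, 1, 1],
        [0, 1, 0, 0]]

print(f"IoU  = {iou(pred, target):.3f}")   # 3 shared pixels / 5 in the union
print(f"Dice = {dice(pred, target):.3f}")  # 2 * 3 shared / (4 + 4) pixels
```

For a single pair of masks the two indices are linked by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU; this is worth keeping in mind when comparing the improvements reported for each index.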