
TCM Disease Classification Model And Application Based On Prompt Learning And Optimization Of Long Tail Distribution

Posted on: 2024-05-15    Degree: Master    Type: Thesis
Country: China    Candidate: Z Y Liu    Full Text: PDF
GTID: 2544307112976589    Subject: Electronic information
Abstract/Summary:
Traditional Chinese medicine (TCM) played an active role in the prevention and control of the COVID-19 epidemic. This thesis studies the classification of TCM diseases. Compared with Bi-LSTM and TextCNN, the BERT model achieves better classification performance; however, in scenarios where samples are scarce and the data distribution is uneven, plain fine-tuning leaves BERT with clear room for improvement. To address sample scarcity and imbalanced data distribution, this thesis proposes the Prompt_Trig_Bert text classification model, which combines manually designed prompt-learning templates with an improved loss function, adapts to the characteristics of TCM disease classification tasks, and effectively improves classification performance. The main research contents and innovations of this thesis are as follows:

(1) To address the lack of samples, linguistic knowledge is learned through a pre-trained language model, and prompt information is supplied during training to guide fine-tuning. Prompt learning is used to fine-tune the pre-trained BERT model, improving its few-shot learning ability and robustness. When constructing prompts, three different manually designed templates are tried: prefix mask (Prefix_MASK), postfix mask (Postfix_MASK), and trigger-word mask (Trigger_MASK). Experiments identify Trigger_MASK as the best manually designed template.

(2) To address the imbalanced (i.e., long-tail) data distribution: a fine-tuned BERT model tends to predict the common categories that have many samples and performs worse on rare categories with few samples. Dice Loss reduces the weight of the common categories in the loss function, shifting its focus toward the rarer categories, so an improved Dice Loss is used to replace cross-entropy (CE) loss.

(3) Because prior knowledge of disease classification is incorporated when constructing the template, the Trigger_MASK template outperforms the other two manually designed templates in experiments. Building on the best template, Trigger_MASK, the improved Dice Loss is incorporated, yielding the Prompt_Trig_Bert text classification model based on the prompt-learning manually designed template (Trigger_MASK) and the improved Dice Loss. Experimental results show that Prompt_Trig_Bert outperforms the fine-tuned BERT model (F1-score increased by 3.1%, Precision by 3%, and Recall by 3.2%) and better predicts the rare categories (gynecology and surgery) under small-sample conditions, which in turn improves overall classification performance.

(4) Finally, the Prompt_Trig_Bert model is used to develop a TCM disease classification system, which is applied in a self-designed TCM intelligent diagnosis and treatment device. During software and hardware development, this work obtained several national patents and won a series of national innovation and entrepreneurship awards, including the National Bronze Award in the "Internet+" Competition (project name: Mulinsen, Innovative Medical Services, Improving the Efficacy of Traditional Chinese Medicine) and the First Prize in the National 3D Digital Innovation Design Competition (project name: community-based self-service TCM intelligent crushing and decoction machine).
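Point (1) compares three hand-designed template shapes that differ only in where the [MASK] token is placed relative to the input. The abstract does not give the exact template wording, so the strings below are illustrative placeholders; `build_prompt` and the `trigger` argument are hypothetical names used only for this sketch.

```python
def build_prompt(symptom_text, template="trigger_mask", trigger="disease"):
    """Wrap an input text in one of three hand-designed prompt templates.

    The template wording is an assumption for illustration; only the
    position of the [MASK] slot (prefix / postfix / next to a trigger
    word) reflects the three variants described in the abstract.
    """
    mask = "[MASK]"
    if template == "prefix_mask":
        # Prefix_MASK: the mask precedes the input text.
        return f"{mask} category: {symptom_text}"
    if template == "postfix_mask":
        # Postfix_MASK: the mask follows the input text.
        return f"{symptom_text} The category is {mask}."
    if template == "trigger_mask":
        # Trigger_MASK: the mask is placed next to a domain trigger word.
        return f"{symptom_text} The patient suffers from the {trigger} {mask}."
    raise ValueError(f"unknown template: {template}")
```

A masked-language-model head then scores candidate label words at the [MASK] position, which is how prompt learning turns classification into the pre-training task itself.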
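Point (2) replaces CE loss with an improved Dice Loss to handle the long-tail distribution. The abstract does not specify the exact improved form, so this is a minimal sketch of a plain soft multiclass Dice loss in NumPy; the function name, the per-class averaging, and the smoothing constant `smooth` are assumptions for illustration.

```python
import numpy as np

def dice_loss(probs, targets_onehot, smooth=1.0):
    """Soft multiclass Dice loss (illustrative sketch).

    probs          : (N, C) softmax outputs of the classifier.
    targets_onehot : (N, C) one-hot gold labels.

    Each class contributes one Dice score to the mean regardless of how
    many samples it has, which is what down-weights the head classes
    relative to the long tail.
    """
    intersection = (probs * targets_onehot).sum(axis=0)            # per-class overlap
    denom = (probs ** 2).sum(axis=0) + (targets_onehot ** 2).sum(axis=0)
    dice_per_class = (2.0 * intersection + smooth) / (denom + smooth)
    return 1.0 - dice_per_class.mean()
```

Because the score is averaged over classes rather than over samples, a rare class such as gynecology weighs as much in the loss as a frequent one, unlike sample-averaged CE loss.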
Keywords/Search Tags: Text classification, BERT model, Prompt learning, Long-tail distribution, Traditional Chinese medicine disease classification system