Design And Implementation Of Disease Analysis System Based On LDA Topic Model

Posted on:2023-09-22

Degree:Master

Type:Thesis

Country:China

Candidate:Y F Zhao

GTID:2544307055459464

Subject:Computer technology

Abstract/Summary:

Healthcare is an industry that serves the entire population. As medical data grows more abundant, the medical industry needs to make full use of medical text data, extract valuable information from it, and apply that information in practice. This thesis uses a topic model to analyze disease text data in depth and builds a disease knowledge base to support disease question answering. This helps patients understand diseases from their own symptoms, assists doctors in making clinical decisions, and provides technical support for analyzing disease trends and for self-diagnosis. The specific research contents are as follows.

(1) To address the fact that different parts of speech differ in importance in disease text data, contribution weights are assigned according to part of speech. First, a word segmentation dictionary of professional medical vocabulary is constructed. Then the disease text data is filtered, segmented into Chinese words, tagged with parts of speech, and stripped of stop words. Finally, each word vector produced by Global Vectors for Word Representation (GloVe) modeling is annotated with the contribution weight of its part of speech, and the disease text vector is computed.

(2) To address the low accuracy of similarity computation in the K-Medoids clustering algorithm, the LG&K-Medoide algorithm is proposed. It combines Latent Dirichlet Allocation (LDA) and GloVe similarity with an improved distance function to obtain topic clusters for departments. First, LDA is used to model the disease text, and the Jensen–Shannon distance is used to compute text similarity. Second, GloVe modeling is used to obtain word vectors, the word vectors are weighted according to the contribution of their parts of speech, and cosine distance is used to compute the weighted GloVe-based text similarity. Finally, K-Medoids clustering is optimized using the combined similarity and the improved distance formula.

(3) To address the reliance of existing disease analysis systems on a single model, a disease analysis system based on the LDA topic model is built. First, requirements analysis and framework design of the disease analysis system are carried out. Second, a disease knowledge base is built that contains the relationships between entities such as diseases, symptoms, departments, drugs, and examination methods. Then visual interfaces are set up for disease symptom analysis, department disease analysis, and disease question answering. Finally, symptom text is extracted from the MySQL database and answers are retrieved from the Neo4j graph database for analysis and display, realizing the functions of disease analysis and disease question answering.

In summary, the disease text clustering algorithm based on the LDA topic model proposed in this thesis achieves higher clustering accuracy on the disease text data set. The disease analysis system built on the LDA topic model helps patients obtain guidance based on their own symptoms at any time, lays a foundation for applying topic models in the field of medical analysis, and provides new ideas for autonomous disease diagnosis.
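The clustering pipeline in items (1) and (2) can be illustrated with a minimal Python sketch. It is not the thesis implementation. The abstract does not give the improved distance formula, so the linear mix controlled by `alpha` is an assumption. All function names, the part-of-speech weights, and the toy topic distributions and vectors are hypothetical; in practice the topic distributions would come from an LDA model and the word vectors from GloVe.

```python
import math

def js_distance(p, q):
    """Jensen-Shannon distance: square root of the base-2 JS divergence."""
    sp, sq = sum(p), sum(q)
    p = [x / sp for x in p]
    q = [x / sq for x in q]
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Terms with a_i == 0 contribute 0 by convention.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return math.sqrt(max(0.0, 0.5 * kl(p, m) + 0.5 * kl(q, m)))

def weighted_text_vector(tagged_tokens, word_vectors, pos_weights, default_weight=1.0):
    """Average the word vectors of a text, weighting each word by its POS tag."""
    dim = len(next(iter(word_vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for word, pos in tagged_tokens:
        if word not in word_vectors:
            continue  # skip out-of-vocabulary words
        w = pos_weights.get(pos, default_weight)
        total = [t + w * x for t, x in zip(total, word_vectors[word])]
        weight_sum += w
    if weight_sum == 0:
        raise ValueError("no in-vocabulary tokens with positive weight")
    return [t / weight_sum for t in total]

def cosine_distance(u, v):
    """One minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def combined_distance_matrix(topic_dists, text_vectors, alpha=0.5):
    """ASSUMPTION: mix the two distances linearly; the thesis formula may differ."""
    n = len(topic_dists)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = (alpha * js_distance(topic_dists[i], topic_dists[j])
                 + (1 - alpha) * cosine_distance(text_vectors[i], text_vectors[j]))
            dist[i][j] = dist[j][i] = d
    return dist

def assign_labels(dist, medoids):
    """Assign every point to the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]

def k_medoids(dist, init_medoids, max_iter=100):
    """Plain alternating K-Medoids on a precomputed distance matrix."""
    medoids = list(init_medoids)
    for _ in range(max_iter):
        labels = assign_labels(dist, medoids)
        new_medoids = []
        for c, old in enumerate(medoids):
            members = [i for i, lab in enumerate(labels) if lab == c]
            # The new medoid minimizes total distance to its cluster members.
            best = min(members, key=lambda i: sum(dist[i][j] for j in members)) if members else old
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign_labels(dist, medoids)

# Toy data (made up): four texts, two topics, two-dimensional text vectors.
topic_dists = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
text_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
dist = combined_distance_matrix(topic_dists, text_vectors, alpha=0.5)
medoids, labels = k_medoids(dist, init_medoids=[0, 1])
print(medoids, labels)
```

Working on a precomputed distance matrix keeps the clustering step independent of how the two similarities are combined, so a different distance formula can be swapped in without touching `k_medoids`.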

Keywords/Search Tags:

Disease analysis system, Text clustering, LDA topic model, Global vectors for word representation, Text similarity

Related items

1	Research On Medical Intelligent Question Answering Algorithm Based On LSTM&Topic-CNN Model
2	Research And Application Of Chinese Medicinal Materials Patent Text Mining Method Based On Topic Model
3	Document Clustering Analysis On Semi-supervised-related Medical Literatures
4	Medlinmedline Biomedical Text Clustering
5	Text Mining Of Attitudes Toward Depression On Chinese Social Media
6	Research On Statistical Methods Of New Crown Epidemic Data Based On Text Analysis
7	Structured Processing Of Medical Ultrasound Text Data Based On Semantic Dependency Analysis
8	Research On Recommendation Of Famous TCM Cases Based On Similarity
9	Study On Distance Measure In Text Clustering Of Lung Cancer
10	Research On Logic Rule Augmented Chinese Medicine Entity Representation Learning And Application