
Research And Implementation Of Text Mining Based On Medical Data

Posted on: 2022-09-11
Degree: Master
Type: Thesis
Country: China
Candidate: Z F Shen
Full Text: PDF
GTID: 2518306785976409
Subject: Computer Software and Application of Computer
Abstract/Summary:
With the rapid advancement of Internet information technology, the volume of text related to clinical medicine available online has surged. Medical literature records a large body of academic results in related research fields and provides valuable references for researchers engaged in medical research. The knowledge recorded in these documents is mostly semi-structured or unstructured, which makes it hard for researchers to quickly find the knowledge they need. Organizing the information by traditional manual methods consumes a great deal of effort. Machine learning methods were later gradually adopted for text mining, but the limited text representation capability of shallow models limited the mining results accordingly. Efficient text mining technology would therefore greatly promote research in the medical field. In recent years, with the rise of deep neural networks and the improvement of computing power, deep learning methods have achieved good results in speech recognition, image processing, and text mining.

This research therefore takes diabetes as its subject. It uses named entity recognition and relation extraction, two text mining techniques, to identify and extract medical entities and their relationships from diabetes medical text, stores the extracted medical information in a graph database, and builds a visual query system, thereby completing the structured transformation of unstructured text data. The resulting diabetes knowledge graph query system can help medical staff and researchers quickly query and understand diabetes-related medical knowledge, which is of great significance to the prevention, diagnosis, and treatment of diabetes, and also provides a technical reference for text mining research in other fields.

The main research contents of this thesis are as follows:

1. Propose the XLNet-BiLSTM-Attention-CRF model for named entity recognition in diabetes-related medical literature. First, the diabetes-related medical literature is analyzed in terms of text structure and language characteristics, the named entity recognition task is treated as a sequence labeling task, and a diabetes medical information corpus is constructed according to the model's training requirements. Second, building on BiLSTM-CRF, a commonly used model for named entity recognition, the pre-trained model XLNet is introduced to vectorize text sentences, which incorporates contextual semantic information more effectively and helps with problems such as polysemy. An Attention mechanism is also introduced so that the model can extract semantic features more fully from the long texts in the training corpus. Experimental comparison shows that the proposed model outperforms other standard models on named entity recognition for diabetes medical text.

2. Propose the XLNet-BiGRU-Attention-TextCNN-Softmax model to mine the relationships between medical entities in diabetes medical text. First, the diabetes text sentence is input into the XLNet model, which uses its internal Transformer-XL module and relative position encoding mechanism to encode the sentence, thereby capturing more comprehensive feature information. Next, the BiGRU model extracts contextual feature information and passes it to the TextCNN module integrated with the Attention mechanism, which extracts features selectively. Finally, to optimize model training and reduce the impact of the uneven distribution of relation category labels, this study uses label smoothing cross-entropy as the loss function for model tuning. Comparative experiments with 4 different models verify that the proposed diabetes relation extraction model achieves better results.

3. Design and implement a diabetes medical knowledge graph query system using the Spring Boot and Vue frameworks. In this study, 15 medical entities and 10 medical relationships were identified and extracted to generate a CSV file, and an appropriate data import method was selected to store the diabetes knowledge in the Neo4j graph database. Because the graph database has strong capabilities in data storage, retrieval, and processing, the Spring Boot and Vue frameworks are used to build a diabetes knowledge graph query system on top of it. The system helps medical workers and researchers quickly and conveniently query diabetes-related medical knowledge and perform visual analysis, which is of great help to the prevention, diagnosis, and treatment of diabetes.
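
The label smoothing cross-entropy loss mentioned for the relation extraction model can be illustrated with a small sketch. The abstract does not give the smoothing formula or its hyperparameters, so the code below assumes the common uniform variant, in which the true class receives probability 1 - epsilon + epsilon / K and every other class receives epsilon / K for K classes. The function name, the `epsilon` default of 0.1, and the example logits are hypothetical and are not taken from the thesis.

```python
import math


def label_smoothing_cross_entropy(logits, target, epsilon=0.1):
    """Cross-entropy of softmax(logits) against a smoothed one-hot target.

    Assumed variant (not confirmed by the thesis): uniform smoothing, where
    the true class gets 1 - epsilon + epsilon / K and each other class gets
    epsilon / K, with K = len(logits).
    """
    k = len(logits)
    # Numerically stable log-softmax: subtract the max before exponentiating.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - log_z for x in logits]
    # Build the smoothed target distribution.
    smooth = [epsilon / k] * k
    smooth[target] += 1.0 - epsilon
    # Cross-entropy between the smoothed target and the predicted distribution.
    return -sum(q * lp for q, lp in zip(smooth, log_probs))


# Hypothetical scores for three relation categories; class 0 is the true label.
example_logits = [5.0, 0.0, 0.0]
plain_loss = label_smoothing_cross_entropy(example_logits, 0, epsilon=0.0)
smoothed_loss = label_smoothing_cross_entropy(example_logits, 0, epsilon=0.1)
```

With `epsilon=0.0` the function reduces to ordinary cross-entropy. With a positive `epsilon`, a very confident correct prediction is penalized slightly more than under plain cross-entropy, which discourages over-confident outputs; this is one plausible reason such a loss helps when relation labels are unevenly distributed.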
Keywords/Search Tags: diabetes medical knowledge graph, named entity recognition, relation extraction, XLNet, Neo4j graph database