Research On TCM Disease Description Classification Based On Text Semantic Partition

Posted on:2019-11-29

Degree:Master

Type:Thesis

Country:China

Candidate:Z Fu

GTID:2394330548977443

Subject:Computer Science and Technology

Abstract/Summary:

A Traditional Chinese Medicine (TCM) expert system can help address several problems: the difficulty of passing on TCM knowledge, the shortage of TCM resources, and the difficulty patients face in obtaining TCM medical help. Intelligent diagnosis is the most basic and crucial step in a TCM expert system and merits further research. In this paper, we recast intelligent diagnosis as a classification problem over disease texts. First, we propose a method for calculating the similarity of disease texts based on block vectors. We divide each disease text into blocks according to the organs it describes and assign different weights to these organs to distinguish primary from secondary symptoms. The cosine of the angle between two block vectors then indicates how many symptoms two disease texts have in common. Next, we present a TCM disease text classification model that combines techniques from natural language processing and data mining. Finally, we run experiments on the disease description texts of nephrotic syndrome patients. These experiments show that the disease text classification model based on block vectors is more accurate than the traditional text classification model.

The main contributions of this paper are as follows:
1) We studied traditional methods of text feature extraction and text similarity calculation and analyzed the advantages and disadvantages of each. We then applied a random forest model and an SVM model, both based on TF-IDF text features, to TCM disease text classification. The F1 scores of the two models are 75.38% and 75.20%, respectively.
2) In view of the characteristics of disease texts, we proposed a text feature extraction method based on block vectors that expresses text semantics more precisely.
3) Based on the proposed feature extraction method, we presented a new method called SBBV (Similarity Based On Block Vector) for calculating text similarity. Experimental comparison with existing similarity calculation methods shows that the proposed method is more accurate.
4) Based on the proposed disease text similarity calculation method, we presented a disease text classification model whose F1 score reaches 90.81%. Finally, we combined non-textual features with textual features and presented a mixed model for disease text classification. The F1 score of the mixed model was nearly 1% higher than that of the model based purely on textual features.
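The block-vector idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual SBBV formula, so everything here is an assumption made for illustration: the organ names, the weight values in `ORGAN_WEIGHTS`, the representation of a disease text as a mapping from organ to symptom list, and the function names `block_vector`, `cosine`, and `block_similarity`. It shows only the general technique of weighting symptoms by organ block and comparing two texts by cosine similarity.

```python
import math
from collections import Counter

# Hypothetical organ weights; the abstract does not state the real values.
# A higher weight marks an organ whose symptoms count as primary.
ORGAN_WEIGHTS = {"kidney": 2.0, "tongue": 1.0, "pulse": 1.0}


def block_vector(blocks, weights=ORGAN_WEIGHTS):
    """Turn {organ: [symptom, ...]} into a sparse weighted vector.

    Each dimension is an (organ, symptom) pair, so the same symptom word
    in two different organ blocks is kept apart.
    """
    vec = Counter()
    for organ, symptoms in blocks.items():
        w = weights.get(organ, 1.0)  # unknown organs get a neutral weight
        for symptom in symptoms:
            vec[(organ, symptom)] += w
    return vec


def cosine(u, v):
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def block_similarity(a, b, weights=ORGAN_WEIGHTS):
    """Similarity of two partitioned disease texts via their block vectors."""
    return cosine(block_vector(a, weights), block_vector(b, weights))


# Toy, made-up disease texts already partitioned by organ.
text_a = {"kidney": ["edema", "proteinuria"], "tongue": ["pale"]}
text_b = {"kidney": ["edema"], "tongue": ["pale"], "pulse": ["weak"]}
text_c = {"pulse": ["rapid"]}

sim_ab = block_similarity(text_a, text_b)
sim_ac = block_similarity(text_a, text_c)
```

In this toy setup, `text_a` and `text_b` share a heavily weighted kidney symptom, so `sim_ab` is well above `sim_ac`, which is zero because the two texts share no (organ, symptom) pair. A classifier could then label a new text by its most similar labelled texts, though the abstract does not say which classifier the thesis uses.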

Keywords/Search Tags:

intelligent diagnosis, disease classification, text partition, Doc2vec

Related items

1	Design And Implementation Of Intelligent Diagnosis Guidance System Based On Deep Learning
2	Research On Text Classification Of TCM Nephropathy Based On Key Semantic Information
3	Design And Implementation Of Intelligent Question Answering System For Medical Field
4	Research And Application Of Intelligent Classification Method Of Diseases In Medical Insurance
5	Research On Association Classification Algorithm And Its Application In Diagnosis Of Coronary Heart Disease
6	Research On Medical Images Classification For Diagnosis And Retrieval-based Text Generation
7	Research Of Intelligent Hepatopathy Auxiliary Diagnosis System Based On Text Semantic Analysis Of Electronic Medical Records
8	Research On ECG Intelligent Diagnosis Technology Based On Multi-label Classification
9	Design And Implementation Of Traditional Chinese And Western Medicine Diagnosis System Based On OCR
10	Research On Medical Intelligent Question Answering Algorithm Based On LSTM&Topic-CNN Model