
Automatic Diagnostician Based On Chinese Text Classification

Posted on: 2004-04-20
Degree: Master
Type: Thesis
Country: China
Candidate: D Wang
Full Text: PDF
GTID: 2168360095460735
Subject: Computer application technology
Abstract/Summary:
This thesis implements an automatic diagnostician based on text classification. Medical records annotate the relationship between disease symptoms and disease categories. A classifier that captures the knowledge linking symptoms to categories can be trained from these annotated texts with machine learning techniques; it then predicts the category of a disease by analyzing the symptoms. The automatic diagnostician is built on this method.

Large-scale text collections are useful for information processing in the medical domain. Using natural language processing to process electronic patient records has recently become a hot spot of research and application in information processing. It helps forecast the distribution of patients and the development trends of various diseases, and it is an efficient way to improve the quality and efficiency of treatment. Processing medical information with natural language processing therefore has both theoretical significance and practical value. To build the automatic diagnostician system, the key problems to solve are the organization of electronic patient records, word segmentation, and the construction of the classifier.

First, we organize patient records for the system, which amounts to gathering training data. The system uses the medical records of cured patients as its original data. An electronic patient record includes the symptoms, the diagnosis, and the course of treatment. The quality of the training data affects how well the automatic diagnostician system works. After preprocessing, the data is stored in a form that is easy to process. This thesis constructs a subsystem to store and manage patient records accurately and effectively.

Second, we implement automatic word segmentation for Chinese text. The word is the smallest processing unit in natural language understanding, so segmentation is the first step of any Chinese natural language processing system and is very important. Only after overcoming this obstacle can a processing system be said to have rudimentary "intelligence". This thesis introduces several segmentation techniques, such as maximum matching, improved maximum matching, and full segmentation. The system uses an integrated method for segmentation and part-of-speech tagging. Experiments show that this method performs better than the others.

Finally, we build a Bayes classifier to learn knowledge from the records; this classifier implements the automatic diagnostician. The method is general and effective, and its performance will improve as the training data grows. Other models, such as the vector space model (VSM), could also be used to improve it. The Bayes classifier is the simplest of these models. The parameters of the model are trained on the data we gathered under the independence assumption, and the most likely category of a new example is selected with Bayes' rule.

This thesis builds an automatic diagnostician based on statistical methods, which has many advantages over traditional rule-based expert systems in the medical field. It addresses the difficult problem of knowledge acquisition: the knowledge is learned from real medical records, so it is objective. Preliminary experiments indicate that doctors can get useful information from the candidate diagnoses the system produces, which helps improve their efficiency. The system is also highly portable and can be extended to other domains. Exploring this further is worthwhile future work.
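The two technical steps named in the abstract, dictionary-based segmentation followed by Bayes classification, can be illustrated with a minimal sketch. It is not the thesis's implementation: plain forward maximum matching stands in for the integrated segmentation and part-of-speech method, add-one (Laplace) smoothing is an assumption, and the dictionary, records, category labels, and every function and class name are hypothetical toy values invented for illustration.

```python
import math
from collections import Counter, defaultdict


def forward_max_match(text, dictionary, max_len=4):
    """Greedy forward maximum matching: at each position, take the
    longest dictionary word; fall back to a single character."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                words.append(piece)
                i += size
                break
    return words


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (an assumption)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for words, label in zip(docs, labels):
            self.word_counts[label].update(words)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, words):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # log prior + sum of log likelihoods (independence assumption)
            score = math.log(n_docs / total_docs)
            for w in words:
                if w in self.vocab:  # skip words never seen in training
                    score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy dictionary and annotated records (not from the thesis).
DICTIONARY = {"发热", "咳嗽", "头痛", "腹痛", "腹泻", "恶心"}
RECORDS = [
    ("发热咳嗽头痛", "respiratory"),
    ("咳嗽发热", "respiratory"),
    ("腹痛腹泻恶心", "digestive"),
    ("恶心腹痛", "digestive"),
]

docs = [forward_max_match(text, DICTIONARY) for text, _ in RECORDS]
labels = [label for _, label in RECORDS]
model = NaiveBayes().fit(docs, labels)

if __name__ == "__main__":
    query = forward_max_match("头痛咳嗽", DICTIONARY)
    print(query, model.predict(query))
```

Working in log space avoids numeric underflow when a record contains many words, and the single-character fallback keeps the segmenter from stalling on text the dictionary does not cover.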
Keywords/Search Tags: automatic diagnostician, text classification, Bayes algorithm, Chinese word segmentation