Research Of TCM Literature Knowledge Discovery Method Based On Conditional Random Field Model

Posted on: 2010-01-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y Fan
GTID: 2178360275473376
Subject: Computer Science and Technology
Abstract/Summary:
With the development of computer and medical technology, the volume of medical data is growing at an exponential rate. Much of this data is recorded as text in the medical literature and stored in databases such as Traditional Chinese Medicine (TCM) and MEDLINE. Integrating these resources and uncovering the knowledge hidden in them is of great importance for explaining the complex phenomena of human life.

Named Entity Recognition (NER) is the first and most important step of literature-based knowledge discovery. After analyzing and elaborating on the methods of literature-based knowledge discovery used in biomedicine, this thesis introduces the concept, methods and models of Named Entity Recognition, with particular attention to two discriminative models: the Conditional Random Field (CRF) and the Maximum Entropy Markov Model (MEMM).

First, given the large amount of tagged corpus available in this field, an experiment on gene entity recognition using the CRF model is carried out, and the result is satisfactory. The experiment shows that CRF performs better than MEMM, so CRF is adopted as the Named Entity Recognition model for knowledge discovery in Traditional Chinese Medicine literature. This lays the foundation for automatically identifying molecular biology named entities when integrating TCM text mining.

Second, because tagged corpora are lacking in the Chinese medicine field, a method that combines Bubble-bootstrapping with CRF is put forward to overcome this constraint. The method is shown to be feasible and effective. It also avoids the disadvantages of non-statistical models and of other statistical models, and has good application prospects.
Keywords/Search Tags: Literature-based Knowledge Discovery, Named Entity Recognition, Conditional Random Field (CRF), Maximum Entropy Markov Model (MEMM), Bubble-bootstrapping
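
To illustrate how a linear-chain CRF assigns entity tags to a sentence, the following is a minimal sketch of Viterbi decoding over BIO-style gene tags. It is not the implementation used in the thesis: the tag set, the `emission_score` and `transition_score` functions, and all weights are hypothetical and set by hand for illustration, whereas a real CRF would learn its weights from a tagged corpus.

```python
import math

# Hypothetical BIO tag set for gene mentions (an assumption, not from the thesis).
TAGS = ["O", "B-GENE", "I-GENE"]


def emission_score(word, tag):
    """Score of assigning `tag` to `word`, using hand-set illustrative weights."""
    # Crude surface cue: contains a digit, or is an all-caps token of length > 1.
    looks_like_gene = any(c.isdigit() for c in word) or (
        word.isupper() and len(word) > 1
    )
    if tag == "B-GENE" and looks_like_gene:
        return 2.0
    if tag == "I-GENE" and looks_like_gene:
        return 1.0
    if tag == "O" and not looks_like_gene:
        return 1.5
    return 0.0


def transition_score(prev_tag, tag):
    """Score of moving from `prev_tag` to `tag`; forbids invalid BIO sequences."""
    if tag == "I-GENE" and prev_tag not in ("B-GENE", "I-GENE"):
        return -math.inf  # I-GENE may only continue an entity
    if prev_tag == "B-GENE" and tag == "I-GENE":
        return 0.5
    return 0.0


def viterbi(words):
    """Return the highest-scoring tag sequence for `words`."""
    if not words:
        return []
    # Treat the position before the sentence as tagged "O".
    best = {t: transition_score("O", t) + emission_score(words[0], t) for t in TAGS}
    back = []  # back[i][t] = best previous tag when word i+1 is tagged t
    for word in words[1:]:
        new_best, pointers = {}, {}
        for t in TAGS:
            prev = max(TAGS, key=lambda p: best[p] + transition_score(p, t))
            new_best[t] = (
                best[prev] + transition_score(prev, t) + emission_score(word, t)
            )
            pointers[t] = prev
        best = new_best
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(TAGS, key=lambda t: best[t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    sentence = ["The", "BRCA1", "gene", "regulates", "p53"]
    print(list(zip(sentence, viterbi(sentence))))
```

Unlike an MEMM, which normalizes scores locally at each position, a CRF scores the whole tag sequence jointly, which is why decoding searches over complete paths rather than choosing each tag greedily.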