A Method Of Knowledge Extraction And Association From Unstructured Texts

Posted on: 2011-04-12
Degree: Master
Type: Thesis
Country: China
Candidate: Y Yang
Full Text: PDF
GTID: 2178360302974606
Subject: Computer applications
Abstract/Summary:
With the rapid development of computer technology and the Internet, the digital library, with its convenience and easy accessibility, is replacing the traditional library as a center of data and knowledge. However, as digital library resources expand, and most of them are unstructured, it is difficult to directly access and use the contents and data of these texts, let alone the knowledge they contain. Automatic knowledge acquisition has therefore become a research hotspot, and knowledge extraction and association techniques based on different principles are highly desirable in the field of text mining.

This paper studies knowledge extraction and association techniques for unstructured texts from Chinese Medicine books. We also design and implement an information management system, called the Chinese Medicine system, based on these techniques; it supports digital libraries in accessing Chinese Medicine literature.

Knowledge extraction in this paper is performed with an approach based on the support vector machine (SVM). First, we design a Chinese Medicine concept model that contains all content bodies. Second, we assign each content body an SVM responsible for recognizing its content. Third, we define the elements of each SVM feature vector according to features of text layout, words, contents, and so on. Finally, we combine the extraction patterns of all content bodies to form a concept extraction pattern. This method addresses the problem of extracting structured information from unstructured texts.

Knowledge association in this paper focuses on mining relations among the medical knowledge points of Chinese Medicine. Direct relations, such as classifications that indicate shared characteristics, are found through the hierarchical structure of directory files. Indirect relations between knowledge points are found by string matching or text similarity comparison, and potential relations between prescriptions are found by text clustering.

The knowledge extraction and association techniques discussed in this paper can extract Chinese Medicine knowledge from unstructured texts and associate the extracted knowledge. The Chinese Medicine system implemented on these techniques provides users with a wide range of information services.
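As an illustration of the extraction step, the sketch below trains one linear SVM for a single content body and uses it to decide whether a text line belongs to that body. It is a minimal sketch under stated assumptions, not the thesis's implementation: the three features, the cue words, the English toy lines, and the function names `features`, `train_linear_svm`, and `predict` are all hypothetical, and the training loop is a plain subgradient descent on the hinge loss rather than a full SVM solver.

```python
import re

# Hypothetical cue words for one content body (e.g. a "composition" field).
CUE_WORDS = ("composition", "ingredients")

def features(line):
    """Map one text line to a small feature vector (layout, word, content cues)."""
    text = line.strip().lower()
    return [
        1.0 if line.startswith((" ", "\t")) else 0.0,     # layout: line is indented
        1.0 if any(w in text for w in CUE_WORDS) else 0.0,  # words: cue word present
        1.0 if re.search(r"\d", text) else 0.0,           # content: contains a number
        1.0,                                              # constant bias input
    ]

def train_linear_svm(xs, ys, lam=0.01, epochs=200, lr=0.1):
    """Full-batch subgradient descent on the L2-regularised hinge loss."""
    w = [0.0] * len(xs[0])
    for _ in range(epochs):
        grad = [lam * wi for wi in w]
        for x, y in zip(xs, ys):
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            if margin < 1:  # only margin violators contribute to the subgradient
                for i, xi in enumerate(x):
                    grad[i] -= y * xi / len(xs)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

def predict(w, line):
    """Return +1 if the line is recognised as this content body, else -1."""
    score = sum(wi * xi for wi, xi in zip(w, features(line)))
    return 1 if score >= 0 else -1

# Toy labelled lines (assumed data): +1 = belongs to the content body.
train = [
    ("  Composition: ginseng 9 g, licorice 6 g", 1),
    ("  Ingredients: ginger 3 g", 1),
    ("Chapter 2 General remarks", -1),
    ("This formula is described in later sections.", -1),
]
weights = train_linear_svm([features(t) for t, _ in train], [y for _, y in train])
```

Under this reading, one such classifier would be trained per content body, and the per-body decisions would then be combined into the concept-level extraction pattern. A real system would more likely rely on an established SVM library and on features tuned to Chinese text.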
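The two ways of finding indirect relations, string matching and text similarity comparison, can be sketched as follows. This is an illustrative sketch only: the abstract does not say which similarity measure is used, so cosine similarity over character bigrams is an assumption here, as are the toy entries, the threshold of 0.5, and the function names `mentions` and `similar_pairs`.

```python
import math
from collections import Counter

def bigrams(text):
    """Character bigram counts; whitespace is dropped before counting."""
    s = "".join(text.split())
    return Counter(s[i:i + 2] for i in range(len(s) - 1))

def cosine(a, b):
    """Cosine similarity between two bigram count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def mentions(entries):
    """String matching: relate A to B when B's name occurs in A's text."""
    return sorted(
        (a, b)
        for a, text in entries.items()
        for b in entries
        if a != b and b.lower() in text.lower()
    )

def similar_pairs(entries, threshold=0.5):
    """Text similarity: relate entries whose descriptions are close enough."""
    names = sorted(entries)
    vecs = {n: bigrams(entries[n].lower()) for n in names}
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if cosine(vecs[a], vecs[b]) >= threshold
    ]

# Toy knowledge points (assumed data): name -> description text.
entries = {
    "ginseng": "herb that tonifies qi and strengthens the spleen",
    "astragalus": "herb that tonifies qi and raises yang",
    "four gentlemen decoction": "prescription containing ginseng and licorice",
}
matched = mentions(entries)
similar = similar_pairs(entries)
```

Character bigrams are used because they need no word segmenter, which is convenient for Chinese text, but any other text representation could be substituted. The same pairwise similarities could also serve as input to a clustering step of the kind the abstract mentions for prescriptions.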
Keywords/Search Tags: Digital Library, Knowledge Extraction, Knowledge Association, Support Vector Machine, Clustering