
The Study On Knowledge Extraction From Text Resources

Posted on: 2011-09-29 | Degree: Master | Type: Thesis
Country: China | Candidate: S Kong | Full Text: PDF
GTID: 2178330332960839 | Subject: Information management and e-government
Abstract/Summary:
With the rapid development of information technology and the Internet, information resources are growing very quickly, and 80% of them are stored as natural language text. The goal of knowledge extraction is to obtain knowledge from this text data and so resolve the contradiction between the flood of information and the lack of knowledge. Natural language processing is the key technology for solving this problem.
First, this thesis presents the background and research status of knowledge extraction from text resources. The research object is unstructured free text and the goal is to extract knowledge from it, which involves natural language processing, text mining, and other related fields. After analyzing and summarizing related knowledge extraction systems in China and abroad, we present the history and development trends of the field. Second, we summarize the key technologies that provide the theoretical basis for this thesis, including natural language processing, Chinese word segmentation, semantic similarity algorithms, and commonly used dictionaries. Third, we propose text knowledge extraction models, covering the definition of the concept of text knowledge, analysis of text structure, conversion of web HTML to plain text, keyword extraction, and topic sentence extraction. Finally, we design and implement a system for knowledge extraction from text resources to validate the text knowledge models.
Overall, there has not been much work on knowledge extraction in China, but related areas such as information extraction, knowledge discovery, and ontology have developed well.
Unlike traditional information extraction based on rules and learning mechanisms, this thesis aims to develop a knowledge extraction system that uses NLP to extract knowledge from the discourse of scientific literature, after word segmentation, POS tagging, syntactic analysis, and semantic analysis. This study offers one technical solution to the contradiction between the flood of information and the lack of knowledge.
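The abstract names three steps of the proposed model: conversion of web HTML to plain text, keyword extraction, and topic sentence extraction. The following is only a minimal sketch of how such steps might fit together, not the thesis's actual method. It assumes English text split on letters (the thesis targets Chinese and relies on word segmentation), uses plain term frequency in place of the semantic similarity algorithms mentioned above, and every name in it (`html_to_text`, `extract_keywords`, `extract_topic_sentence`, `STOPWORDS`) is hypothetical.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical, deliberately tiny stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "from", "for", "on"}


class _TextCollector(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Step 1: strip markup and collapse whitespace."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    return re.sub(r"\s+", " ", " ".join(collector.parts)).strip()


def tokenize(text):
    """Assumption: lowercase alphabetic tokens stand in for segmented words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def extract_keywords(text, k=3):
    """Step 2: rank words by raw term frequency (a stand-in technique)."""
    return [word for word, _ in Counter(tokenize(text)).most_common(k)]


def extract_topic_sentence(text, keywords):
    """Step 3: pick the sentence containing the most keyword occurrences."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if not sentences:
        return ""
    keyword_set = set(keywords)
    return max(sentences, key=lambda s: sum(1 for w in tokenize(s) if w in keyword_set))
```

A usage sketch: pass a page through `html_to_text`, feed the result to `extract_keywords`, and hand both to `extract_topic_sentence`. A real system for Chinese text would replace `tokenize` with a word segmenter and would likely weight terms by more than raw frequency.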
Keywords/Search Tags: Natural Language Processing, Knowledge Extraction, Text Resources