
The Research Of Relation Extraction In Text Mining

Posted on: 2014-02-02
Degree: Master
Type: Thesis
Country: China
Candidate: D Wang
Full Text: PDF
GTID: 2248330398470745
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, with the rapid development of science and technology, the volume of web data has grown quickly. Text data, as one part of web data, has received increasing attention. To meet the challenges posed by massive text collections, and to store, manage, and make full use of them, we need automated tools that can quickly find useful information within a large body of material. This is the problem that information extraction research aims to solve.

Information extraction automatically extracts information from unstructured or semi-structured data and saves it in a structured form, such as a database or an XML document. It usually comprises two closely linked tasks: named entity recognition and entity relation extraction. This thesis studies an entity relation extraction system based on web data, addressing the problem of how to obtain the relationship between two named entities. The main contents are as follows:

1. Based on the characteristics of web data, a scheme is designed for collecting the data used in relation extraction. The scheme makes full use of the characteristics of web data and the capabilities of search engines and, combined with the properties of HTML structure, can obtain a large number of relevant web resources at low cost. It also provides a way to extract text from HTML pages.

2. The mainstream relation extraction methods are studied further, and their advantages and disadvantages are analyzed. Building on this study, a novel relation extraction algorithm is proposed that takes advantage of both sentence structure and word features.

3. A prototype system for data collection and relation extraction is implemented based on 1 and 2. The system uses a B/S (browser/server) architecture and implements the relation extraction algorithm proposed in this thesis. It also provides a module that displays the extraction results in the browser.

Experiments were carried out with this system to verify that the relation extraction algorithm is effective.
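
The data collection step described above includes extracting text from HTML pages. The abstract does not say how the thesis does this, so the following is only a minimal sketch of one common approach, using Python's standard `html.parser` module. The names `TextExtractor`, `extract_text`, and `sample` are hypothetical and chosen for illustration; skipping `script` and `style` content is an assumption about what counts as non-text, not a detail taken from the thesis.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes from an HTML page (illustrative, not the thesis's method)."""

    # Assumption: content inside these tags is never useful text.
    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0   # > 0 while inside a skipped tag
        self._chunks = []      # text fragments in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped tags.
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def text(self):
        return " ".join(self._chunks)


def extract_text(html):
    """Return the text content of an HTML string, fragments joined by spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


# Hypothetical sample page, used only to exercise the sketch.
sample = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><p>Alice works at <b>Acme</b>.</p>"
    "<script>var x = 1;</script></body></html>"
)
print(extract_text(sample))
```

A sketch like this keeps the sentence text that a later relation extraction stage would consume while discarding markup and embedded code. A real collector would probably also need to handle character encodings and filter navigation or advertisement blocks.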
Keywords/Search Tags: Relation Extraction, Named-entity Recognition, Information Extraction