Entity Relationship Extraction And Mining Based On The Massive Data

Posted on:2013-08-17

Degree:Master

Type:Thesis

Country:China

Candidate:H B Bi

Full Text:PDF

GTID:2248330374483300

Subject:Computer application technology

Abstract/Summary:

Information extraction is an important technology in information research. As the Internet grows, extracting the information that users are interested in has become an urgent problem and an important research direction in information mining. Unlike information retrieval, information extraction must recognize the named entities in a text and extract the relationships between them. In addition, the flexibility and complexity of Chinese make it increasingly difficult to recognize Chinese named entities and the relationships between them.

There are currently two main approaches to information extraction: rule-based methods and statistical machine learning methods. Rule-based methods achieve higher accuracy, but writing the rules is difficult and demands considerable expertise from the rule authors, and the resulting algorithms are not very portable. Statistical methods adopt various models and train a classifier on a manually annotated training set, then process new data by computing probabilities to obtain the final results. Because of their higher portability, better performance, and lower cost, statistical methods have become the focus of current research.

As the volume of networked information grows, information extraction from massive data becomes more and more complex. How to extract key information from massive data is the problem studied in this thesis, and computation over massive data is itself challenging work. The main contributions of this thesis are as follows:

● For named entity recognition, we adopt an algorithm based on the maximum entropy model and use the GIS (generalized iterative scaling) algorithm to compute the parameters.
● We propose an algorithm based on semantics and SVM that adds semantic features to entity relationship extraction and uses them to construct the feature vectors, in order to improve the accuracy of the algorithm.
● By analyzing the massive data together with the recognized named entities and named entity relationships, we construct an entity relationship network and apply an optimization algorithm to obtain correct results. Based on the final results, we mine implicit relationships to obtain a more extensive set of entity relationships, which helps in grasping the overall information in the massive data.
● We also study Hadoop, a platform for processing massive data, design an entity relationship extraction and mining system for massive data, and verify the correctness of the algorithms proposed in this thesis.

The proposed entity relationship extraction algorithm based on semantics and SVM improves both the accuracy of the extracted results and the generalization ability. Although the optimization algorithm improves the entity relationship extraction results, the results are still affected by keyword ambiguity, which is one of the main problems to address in future work.
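The abstract does not say how implicit relationships are mined from the entity relationship network, so the following Python sketch shows only one plausible reading: treat extracted relationships as (head, relation, tail) triples, index them as a graph, and infer a new relationship whenever a two-hop path matches a composition rule. The rule table, relation names, entity names, and the functions `build_network` and `mine_implicit` are all hypothetical illustrations, not the thesis's method or data.

```python
from collections import defaultdict

# Hypothetical composition rules (an assumption, not from the thesis):
# (relation A->B, relation B->C) maps to the implied relation A->C.
COMPOSITION_RULES = {
    ("works_for", "subsidiary_of"): "affiliated_with",
    ("located_in", "located_in"): "located_in",
}


def build_network(triples):
    """Index (head, relation, tail) triples as an adjacency map."""
    network = defaultdict(set)
    for head, relation, tail in triples:
        network[head].add((relation, tail))
    return network


def mine_implicit(triples, rules=COMPOSITION_RULES):
    """Return relations implied by two-hop paths that are not already explicit.

    This is a single pass over two-hop paths, not a full transitive closure.
    """
    network = build_network(triples)
    explicit = set(triples)
    implied = set()
    for head, edges in list(network.items()):
        for rel1, mid in edges:
            # .get avoids creating empty entries for entities with no out-edges
            for rel2, tail in network.get(mid, ()):
                new_rel = rules.get((rel1, rel2))
                if new_rel is None or tail == head:
                    continue
                candidate = (head, new_rel, tail)
                if candidate not in explicit:
                    implied.add(candidate)
    return implied


# Made-up example triples, standing in for the output of relation extraction.
triples = [
    ("Zhang Wei", "works_for", "Acme Labs"),
    ("Acme Labs", "subsidiary_of", "Acme Group"),
    ("Acme Labs", "located_in", "Jinan"),
    ("Jinan", "located_in", "Shandong"),
]
implicit = mine_implicit(triples)
for triple in sorted(implicit):
    print(triple)
```

In a Hadoop setting, the same two-hop join could in principle be expressed as a MapReduce job keyed on the intermediate entity, but that mapping is likewise an assumption rather than something the abstract describes.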

Keywords/Search Tags:

Information Extraction, Mass Data, Entity Relationship Network, Implicit Relationship Mining

Related items

1	Research On Quantization Relation Extraction Method Based On Shallow Analysis
2	Research On Entity Relationship Extraction Method Based On Graph Structure
3	Joint Extraction Of Chinese Entity Relationship Based On Bidirectional Semantic Learning Model
4	Research On Web Entity Activity And Entity Relationship Extraction
5	Research On Entity Relation Extraction Of Aluminum-silicon Alloy Based On Text Mining
6	Research On Entity Relationship Extraction Algorithm Based On Deep Learning
7	Joint Extraction Of Named Entity Recognition And Entity Relationship Based On Neural Network
8	Research On Automatic Extraction Of Enterprise Supply Relationship Based On NLP
9	Research On Entity And Relationship Extraction Based On Weakly Supervised
10	Research On Chinese Entity Relationship Extraction Based On Deep Learning