Relation Extraction Based On Deep Convolutional Neural Networks

Posted on: 2018-05-19
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Wang
Full Text: PDF
GTID: 2348330536466312
Subject: Computer technology
Abstract/Summary:
Entity relation extraction has long been an active research topic in natural language processing. Learning to extract the semantic relation between two entities from text plays an important role in information extraction, knowledge base construction, information retrieval, and other fields. Following its rapid development in image and vision tasks, deep learning has in recent years been introduced into natural language processing, where it has also become an active research topic.

Traditional relation extraction methods focus primarily on designing discrete, effective handcrafted features, and their performance depends strongly on the quality of those features. It is hard to predict which features will be most effective, more features are not necessarily better, and judging a feature's effectiveness relies mostly on expert experience. Moreover, the features are often derived from the output of natural language processing tools, which is costly and leads to error propagation. By contrast, relation extraction methods based on deep learning can learn features automatically from the original sentence without natural language processing tools and can make full use of the structural information in the text. Previous research has also shown that convolutional neural networks, with their distinctive network structure, learn such features well.

On this basis, a relation extraction method based on deep convolutional neural networks is proposed. First, the TP-ISP (term proportion–inverse sentence proportion) algorithm, which measures the importance of words at the sentence level, is put forward. The algorithm yields a TP-ISP value for each word in each category; sorting these values ranks the words by importance, and the top-ranked words in each class are selected as keyword features. Second, the keywords of each class are added to the word embeddings and word position embeddings of the original sentence, and together they form the network input. This improves the separation between categories, addresses the weakness of existing deep learning methods that rely on word embeddings alone to learn features, and compensates for the limits of features that the network learns automatically from the original sentence. Finally, chunk-based max pooling is adopted during network training: the highest score in each chunk is selected, and these features are combined as the input to the final classifier, which reduces the information loss of the traditional max-over-time pooling strategy.

In addition, little research has been done on Chinese relation extraction, owing to the lack of Chinese corpora and other reasons. In view of this, and taking the particularities of Chinese corpora into account, a Chinese relation extraction method based on deep convolutional neural networks is proposed using the COAE (Chinese Opinion Analysis Evaluation) 2016 data set. A Chinese word embedding table is pre-trained with the Skip-gram model in word2vec on Chinese wiki data, which works better than a randomly generated word embedding table. Experimental results show that the proposed model greatly improves entity relation extraction on both English and Chinese corpora.
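
The difference between max-over-time pooling and chunk-based max pooling can be illustrated with a minimal sketch. The abstract does not say how chunks are delimited, so this example assumes equal-width contiguous chunks over the output of a single convolution filter; the function names `max_over_time` and `chunk_max_pool` and the parameter `num_chunks` are hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence


def max_over_time(feature_map: Sequence[float]) -> float:
    """Keep only the single highest score of one filter's feature map."""
    return max(feature_map)


def chunk_max_pool(feature_map: Sequence[float], num_chunks: int) -> List[float]:
    """Split the feature map into contiguous chunks and keep each chunk's max.

    Assumption: chunks are (nearly) equal in width; the thesis may
    delimit chunks differently.
    """
    n = len(feature_map)
    if num_chunks <= 0 or n < num_chunks:
        raise ValueError("need 1 <= num_chunks <= len(feature_map)")
    pooled = []
    for i in range(num_chunks):
        # Integer boundaries so every chunk is non-empty and all positions are covered.
        start = i * n // num_chunks
        end = (i + 1) * n // num_chunks
        pooled.append(max(feature_map[start:end]))
    return pooled


# Scores produced by one convolution filter sliding over a six-position sentence.
scores = [0.1, 0.9, 0.3, 0.2, 0.8, 0.4]
single = max_over_time(scores)      # one value for the whole sentence
per_chunk = chunk_max_pool(scores, 3)  # one value per chunk
```

Under these assumptions, max-over-time pooling keeps one score per filter, while chunk-based pooling keeps one score per chunk and so retains some coarse information about where in the sentence strong responses occur.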
Keywords/Search Tags: relation extraction, deep convolutional neural networks, word embeddings, keywords feature, chunk-based max pooling