Research On Event Location Extraction Of News

Posted on: 2019-07-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y Fang
GTID: 2428330566998116
Subject: Computer Science and Technology
Abstract/Summary:
Locations extracted from news events are helpful for tracing the source of public opinion, retrieving information, and similar tasks. So far, few works concentrate on event location extraction; most researchers focus either on location extraction or on news event extraction. Named entity recognition cannot tell which of the recognized locations is the event location, and news event extraction cannot guarantee that all event elements are extracted. Therefore, to extract the event location, we combine multiple features with two extraction methods.

The workflow of our approach is as follows: (1) preprocess the news; (2) extract all locations and people from the news; (3) label the news manually and then convert the labels into machine-readable labels; (4) construct a knowledge graph; (5) compute the feature vector for each word; (6) train and evaluate the model. The knowledge graph, which is constructed with a relation extraction model, is used to find inclusion relations between locations and to build a location forest.

In this work, we approach the event location extraction task from two different perspectives. First, we view the task as a binary classification problem: each location in the article is classified into one of two categories, ordinary location or event location. Second, we view the task as a sequence labeling problem: we predict the label of each word and then look for the event location. For the binary classification problem, we choose random forest as the classification model. We extract all locations from the news, build a feature vector for each location, and feed all the feature vectors into the binary classification model to train it. For the sequence labeling problem, we choose LSTM as the prediction model. We first combine the word vector and the feature vector of each word, and then feed all the vectors into the LSTM model. The trained LSTM model outputs a label for each word.

The experimental results show that, on the whole, the binary classification model is better than the LSTM model. The accuracy of the binary classification model, 93.1%, is slightly lower than that of the LSTM model, but its F1 score, 93.4%, is higher. The experiments also show that our approach performs slightly better than approaches in recent research.
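The two task formulations described in the abstract can be illustrated with a minimal sketch. It only shows how the same manual annotation might be turned into training targets for each view; it is not the thesis's implementation. The tag names (`EVENT_LOC`, `LOC`, `O`), the function names, the half-open span convention, and the toy sentence are all assumptions made for illustration, since the abstract does not specify the label scheme or the data format.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # half-open token span [start, end); an assumed convention

# Hypothetical tag names; the abstract does not name the label scheme.
EVENT_TAG = "EVENT_LOC"
LOC_TAG = "LOC"
OTHER_TAG = "O"


def to_classification_examples(
    tokens: List[str], location_spans: List[Span], event_spans: List[Span]
) -> List[Tuple[str, int]]:
    """Binary classification view: one example per location mention.

    Label 1 marks an event location, label 0 an ordinary location.
    In a real pipeline each mention would be replaced by its feature vector.
    """
    event = set(event_spans)
    return [
        (" ".join(tokens[start:end]), 1 if (start, end) in event else 0)
        for start, end in location_spans
    ]


def to_sequence_labels(
    tokens: List[str], location_spans: List[Span], event_spans: List[Span]
) -> List[str]:
    """Sequence labeling view: one tag per word in the article."""
    tags = [OTHER_TAG] * len(tokens)
    event = set(event_spans)
    for start, end in location_spans:
        tag = EVENT_TAG if (start, end) in event else LOC_TAG
        for i in range(start, end):
            tags[i] = tag
    return tags


# Toy sentence invented for illustration only.
tokens = ["A", "fire", "broke", "out", "in", "Harbin", ",",
          "officials", "in", "Beijing", "said"]
location_spans = [(5, 6), (9, 10)]  # every location mention
event_spans = [(5, 6)]              # the mention annotated as the event location

examples = to_classification_examples(tokens, location_spans, event_spans)
tags = to_sequence_labels(tokens, location_spans, event_spans)
print(examples)
print(list(zip(tokens, tags)))
```

In the first view a classifier such as random forest sees only the location mentions, while in the second view a sequence model such as LSTM must assign a tag to every word, including the many non-location words.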
Keywords/Search Tags: event location, random forest, LSTM, knowledge graph