Research On The Extraction Technology Of Vietnamese News Event Elements Based On Dependency Tree

Posted on: 2017-10-01
Degree: Master
Type: Thesis
Country: China
Candidate: J J Zhou
GTID: 2358330488465713
Subject: Computer technology
Abstract/Summary:
East Asia and Southeast Asia are geographically bound together, sharing "the same mountain ranges and the same river sources". China shares a land border with Vietnam, and as globalization advances, the two countries are increasingly closely connected politically, economically and culturally. Keeping up with Vietnamese news is therefore important. The Internet makes Vietnam's domestic news quickly accessible, but the volume of online news keeps growing, so readers cannot quickly find the content they care about. This thesis addresses how to use information extraction technology to present unstructured information in a structured form, which is significant for understanding Vietnamese news on politics, the economy, culture and other topics.

To extract news event elements from Vietnamese text, the research covers key-event topic sentence identification in Vietnamese news, dependency tree construction for topic sentences, and Vietnamese news event element extraction. The main work is as follows:

(1) Key-event topic sentence extraction for Vietnamese news based on weighted TextRank. An analysis of the features of Vietnamese news documents shows that keywords play an important role in news event sentences. First, the news document is preprocessed, including word segmentation, part-of-speech tagging, named entity recognition and stop word filtering. Next, the mutual information (MI) values of the keywords in each sentence are calculated to determine the event sentences. A directed graph is then built over the event sentences, with sentence position, sentence similarity and keyword coverage determining the sentence weights, and the TextRank model scores each node in the graph. Finally, the top-ranked sentences are selected as the key-event topic sentences.

(2) Construction of dependency trees for Vietnamese key-event topic sentences. A study of Vietnamese vocabulary and grammar shows that Vietnamese and Chinese grammar are broadly consistent, except that Vietnamese attributes are postpositive. Vietnamese expresses meaning through word order, so changing the word order changes the meaning of the sentence. Based on the topic sentence extraction, a corpus of Vietnamese key-event topic sentences and a corpus of the corresponding Chinese topic sentences are constructed. A dependency tree is built for each Chinese sentence, and its dependency relations are mapped onto the corresponding Vietnamese sentence to construct the Vietnamese topic sentence dependency tree.

(3) Vietnamese news event element extraction based on dependency trees. During event element extraction, the trigger words of Vietnamese news events and the event-related elements are extracted by combining the grammatical features of Vietnamese with the constructed topic sentence dependency trees.

(4) Using the above results, a prototype system for Vietnamese news event element extraction based on dependency trees is designed and implemented.
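
The ranking stage in step (1) can be illustrated with a small sketch. The abstract does not give the exact weighting formula, so everything below is an assumption made for illustration only: sentences arrive already segmented into tokens, similarity is Jaccard overlap, position and keyword coverage are averaged with equal weight, and the MI-based event sentence filter is omitted. The function names `weighted_textrank` and `top_sentences` are hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap between two token lists (an assumed similarity measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def weighted_textrank(sentences, keywords, damping=0.85, iterations=50):
    """Score sentences with a TextRank-style iteration on a weighted directed graph."""
    n = len(sentences)
    keywords = set(keywords)

    # Per-sentence prior: earlier position and higher keyword coverage score higher.
    # The equal 0.5/0.5 mix is an assumption, not the thesis's formula.
    prior = []
    for idx, sent in enumerate(sentences):
        position = 1.0 - idx / n
        coverage = len(keywords & set(sent)) / len(keywords) if keywords else 0.0
        prior.append(0.5 * position + 0.5 * coverage)

    # Directed edge i -> j: similarity scaled by the prior of the target sentence.
    weight = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                weight[i][j] = jaccard(sentences[i], sentences[j]) * prior[j]
    out_sum = [sum(row) for row in weight]

    # Standard TextRank update with out-weight normalisation.
    scores = [1.0] * n
    for _ in range(iterations):
        updated = []
        for j in range(n):
            incoming = sum(
                weight[i][j] / out_sum[i] * scores[i]
                for i in range(n)
                if out_sum[i] > 0
            )
            updated.append((1 - damping) + damping * incoming)
        scores = updated
    return scores


def top_sentences(sentences, keywords, k=1):
    """Return the indices of the k highest-scoring sentences."""
    scores = weighted_textrank(sentences, keywords)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy, English-language stand-in for segmented Vietnamese sentences.
sentences = [
    ["flood", "hanoi", "damage", "houses"],
    ["officials", "report", "flood", "damage"],
    ["weather", "sunny", "tomorrow"],
    ["hanoi", "residents", "evacuated", "flood"],
]
keywords = ["flood", "hanoi", "damage"]
print(top_sentences(sentences, keywords, k=2))
```

In this toy input the third sentence shares no tokens with the others, so it receives no incoming weight and stays at the baseline score of `1 - damping`. A real pipeline would replace the token lists with the output of Vietnamese word segmentation and the keyword list with the MI-selected keywords.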
Keywords/Search Tags: Vietnamese, topic sentence extraction, dependency tree, event element extraction