A Study On The Method Of Retrieving Chinese And Vietnamese Bilingual News Events

Posted on: 2017-03-21 | Degree: Master | Type: Thesis
Country: China | Candidate: G S Qin
GTID: 2278330488464986 | Subject: Computer application technology
Abstract/Summary:
Vietnam borders Yunnan, China, and under the Bridgehead Strategy exchanges between China and Vietnam are very close. Managing relations with Vietnam plays an important role in national economic development, political stability, and other areas. Users urgently need to obtain event information from the Internet through search engines. However, because information on the Internet is expanding rapidly, general search engines often return large result sets that match the query poorly. After entering a few keywords, users receive little useful information, especially when retrieving event-type information. The retrieval of Chinese-Vietnamese bilingual news events is therefore of considerable significance and value. This paper focuses on a query expansion method oriented to event elements and on a graph-based ranking method for Chinese-Vietnamese bilingual news events. The work covers the following three aspects:

(1) Because the page structure of news sites on the Internet varies widely, the paper proposes an automatic news data collection method based on custom templates. Combining HtmlUnit and XPath, we customized data-collection templates for news pages. By parsing the pages with these templates, we extracted the title, time, and text of each news page and collected the news data automatically.

(2) To meet users' need for more event information, we propose a query expansion method based on an undirected graph of event elements. In this method, event elements are divided into common elements and characteristic elements, and different elements are expanded according to an analysis of the candidate events and the query terms. First, we analyzed the relationship between the candidate events and the query to determine which elements to expand. Then, we constructed undirected graphs from the extracted event elements and computed the edge weights using the event vector space. Finally, we expanded the event elements using the node weight model of the undirected graph. Comparison experiments show that the proposed method is effective for event-oriented query expansion and helps improve the accuracy of event retrieval.

(3) Based on the attributes of events and the relationships between events, we propose a ranking method based on an attribute association graph. In the event ranking task, event attributes are the key to ranking. For the cross-language translation problem, we can therefore translate only the attribute words instead of the full text, which reduces the difficulty of translation. In addition, events are correlated with one another, so the interactions between events should be considered during ranking. Comparison experiments show that the ranking performance of the proposed method approaches the accuracy of monolingual methods. The method works well for ranking Chinese-Vietnamese news events.
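The abstract does not give the formulas behind the query expansion in step (2), so the following is only a minimal Python sketch of the general idea under stated assumptions, not the thesis's actual model. It assumes that each event element is represented by a binary vector over the candidate events it appears in, that edge weights are the cosine similarity of those vectors, and that a node's weight is the sum of the weights of its edges to the query terms. All function names and the sample data are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def element_vectors(events):
    """Assumption: an element's vector marks the events that contain it."""
    vectors = defaultdict(dict)
    for index, elements in enumerate(events):
        for element in elements:
            vectors[element][index] = 1.0
    return vectors


def build_graph(events):
    """Undirected graph: nodes are event elements, edges carry cosine weights."""
    vectors = element_vectors(events)
    graph = defaultdict(dict)
    for a, b in combinations(sorted(vectors), 2):
        weight = cosine(vectors[a], vectors[b])
        if weight > 0:
            graph[a][b] = weight
            graph[b][a] = weight
    return graph


def expand_query(query_terms, events, top_k=2):
    """Return the top_k non-query elements ranked by their node weight.

    Assumption: node weight = sum of edge weights linking the node
    to the query terms.
    """
    graph = build_graph(events)
    scores = defaultdict(float)
    for term in query_terms:
        for neighbor, weight in graph.get(term, {}).items():
            if neighbor not in query_terms:
                scores[neighbor] += weight
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [element for element, _ in ranked[:top_k]]


# Hypothetical candidate events, each reduced to a set of event elements.
sample_events = [
    {"earthquake", "Hanoi", "rescue"},
    {"earthquake", "rescue", "aid"},
    {"summit", "Hanoi", "trade"},
]
expansion = expand_query({"earthquake"}, sample_events, top_k=2)
```

With this sample data, the elements that co-occur most consistently with the query term are selected as expansion terms. A real system would also need the element extraction and the split into common and characteristic elements that the abstract mentions, which this sketch leaves out.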
Keywords/Search Tags:Chinese, Vietnamese, News Event, Event Elements Expansion, Association Graph