
A Study On The Discovery Of Chinese And Vietnamese Bilingual News

Posted on: 2017-03-17
Degree: Master
Type: Thesis
Country: China
Candidate: C Zhou
Full Text: PDF
GTID: 2278330488964864
Subject: Software engineering
Abstract/Summary:
For geographical reasons, China and Vietnam have been closely connected since ancient times. Tracking sudden events in Vietnam promptly and accurately helps in understanding Vietnamese public opinion and gives the government a basis for public opinion monitoring. Discovering bilingual news topics from the mass of Chinese and Vietnamese news events not only surfaces hot topics concerning the two countries, but is also a research focus in cross-language topic discovery and an integrated application of basic information retrieval research. On this basis, this thesis focuses on Chinese-Vietnamese bilingual news topic discovery. It merges the links between news event elements with the topic sentences of Chinese and Vietnamese news to construct a Chinese-Vietnamese bilingual news graph model, and then applies graph clustering to cluster the topics. The main contributions are as follows:

1. An extraction method for Chinese and Vietnamese news topic sentences based on multiple event elements is proposed. First, the HTML page features of web news reports are analyzed. Next, a topic sentence extraction model based on multiple event elements is built. The vector space model is then chosen as the representation, and sub-event sentences and news documents are mapped into it through the multi-element model. Finally, the similarity between each sub-event sentence and the news event document is computed as the cosine similarity between their vectors, and the sentence with the highest similarity is taken as the news topic sentence. Two experiments were conducted on the crawled data, and a comparison of the results found the proposed method to be more effective for Chinese and Vietnamese bilingual news topic sentence extraction.

2. A multi-feature method for calculating the similarity between Chinese and Vietnamese bilingual news topic sentences is proposed. First, the features of Chinese and Vietnamese bilingual news topic sentences are analyzed and three features are selected, each assigned a different weight in its similarity calculation. The bilingual topic sentence similarity is then obtained by adding the three similarities. Tests on the crawled corpus, compared against Qin Ying's integrated weighting method, demonstrate the feasibility of this method.

3. A Chinese-Vietnamese bilingual news topic detection method based on graph clustering is proposed. First, the links between the topic sentences and the event elements of Chinese and Vietnamese bilingual news are merged to build a Chinese-Vietnamese bilingual news graph model. Next, a random walk algorithm is used to obtain the adjacency similarity matrix of the graph model. Finally, the adjacency similarity matrix and an information-passing method are applied on the graph model to cluster topics. Three experiments were conducted on the crawled data, and they showed that this method is more effective.
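The topic sentence selection described in contribution 1 can be illustrated with a minimal Python sketch. It assumes each sentence has already been reduced to a list of event-element tokens (the abstract does not say how), uses raw term counts as the vector space representation, and represents the news document as the sum of its sentence vectors. The function names `cosine_similarity` and `pick_topic_sentence` are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pick_topic_sentence(sentences):
    """Return (index, scores) for the sentence closest to the whole document.

    `sentences` is a list of token lists. The document vector is assumed
    to be the sum of the sentence vectors (an illustrative choice).
    """
    sentence_vectors = [Counter(tokens) for tokens in sentences]
    document_vector = Counter()
    for vec in sentence_vectors:
        document_vector.update(vec)
    scores = [cosine_similarity(vec, document_vector) for vec in sentence_vectors]
    best = max(range(len(sentences)), key=scores.__getitem__)
    return best, scores


# Toy example with made-up tokens standing in for event elements.
sentences = [
    ["flood", "hanoi", "monday"],
    ["officials", "said"],
    ["flood", "hanoi", "rescue", "flood"],
]
best_index, scores = pick_topic_sentence(sentences)
print(best_index, [round(s, 3) for s in scores])
```

A real system would need word segmentation for Chinese and Vietnamese and some form of cross-language mapping before vectors from the two languages could be compared. Neither step is shown here.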
Keywords/Search Tags: topic sentence, sentence similarity, bilingual, topic, graph clustering
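To make the sentence similarity and graph clustering keywords more concrete, the sketch below shows one way contributions 2 and 3 might fit together. It makes three assumptions the abstract does not confirm: the three feature similarities are combined as a weighted sum, the result is used as an edge weight between two news items, and the random walk similarity is read as multi-step transition probabilities on that graph. All names, weights, and values are hypothetical, and the final clustering step is not shown.

```python
def combined_similarity(feature_sims, weights):
    """Weighted sum of per-feature similarities (weights are hypothetical)."""
    return sum(w * s for w, s in zip(weights, feature_sims))


def row_normalize(matrix):
    """Turn non-negative edge weights into transition probabilities."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    size = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


def random_walk_similarity(adjacency, steps=2):
    """Probability of reaching node j from node i in `steps` random-walk steps."""
    transition = row_normalize(adjacency)
    result = transition
    for _ in range(steps - 1):
        result = mat_mul(result, transition)
    return result


# Four toy news items: 0-1 and 2-3 are strongly linked, 1-2 only weakly.
weights = (0.5, 0.3, 0.2)
strong = combined_similarity((0.8, 0.6, 1.0), weights)
weak = combined_similarity((0.1, 0.2, 0.0), weights)
adjacency = [
    [0.0, strong, 0.0, 0.0],
    [strong, 0.0, weak, 0.0],
    [0.0, weak, 0.0, strong],
    [0.0, 0.0, strong, 0.0],
]
similarity = random_walk_similarity(adjacency, steps=2)
for row in similarity:
    print([round(v, 3) for v in row])
```

The resulting matrix could then be handed to a clustering algorithm. Which algorithm the thesis uses for its information-passing step is not specified in the abstract.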