Research On The Extraction Of Chinese-Vietnamese Bilingual News Opinion Sentences And Their Sentiment Analysis Methods

Posted on: 2021-10-26 | Degree: Master | Type: Thesis
Country: China | Candidate: S Tang | Full Text: PDF
GTID: 2518306200953399 | Subject: Computer technology
Abstract/Summary:
This thesis studies opinion sentence extraction and sentiment tendency analysis for Chinese-Vietnamese bilingual news text, that is, commentary by Chinese and Vietnamese media on a specific event. The goal is to help readers quickly obtain the overall sentiment tendency from a large number of related news documents, and so understand the different perspectives, attitudes, and sentiment tendencies of people in different countries facing the same event. This serves as an indicator for predicting the direction of public opinion and has important research significance. Unlike ordinary sentiment analysis methods, this thesis focuses on the characteristics of Chinese-Vietnamese bilingual news texts and analyzes the role of sentence association in sentiment analysis. Based on a deep learning framework with double-layer encoding and bidirectional LSTM, it studies opinion sentence recognition and sentiment tendency analysis in news documents. Focusing on Chinese and Vietnamese news texts, the thesis mainly completes the following work:

(1) Acquisition of Chinese and Vietnamese news texts and construction of the data sets. A template-based news crawling method is used: an XPath expression is written for each specific news website to extract the specified news data. To meet the task requirements of opinion sentence recognition and sentiment analysis of news documents, three representative events, "Sino-US trade war", "Belt and Road", and "Sino-Vietnamese defense cooperation", were selected from the 24,639 bilingual news texts crawled. Under these three topics, 200 documents and 2,832 sentences were manually annotated for use in opinion sentence extraction and sentiment analysis.

(2) A Chinese-Vietnamese bilingual multi-document news opinion sentence recognition model based on a sentence association graph is proposed, realizing the recognition of Chinese-Vietnamese news opinion sentences. Opinion sentence recognition, whether monolingual or cross-lingual, is generally treated as a classification task based on the sentiment features inside a sentence, and the influence of correlations between different sentences is seldom considered. This thesis therefore constructs an undirected graph of the association relationships between sentences to represent association features across languages and documents, and uses a deep learning framework to fuse the sentence encoding features with the association features for opinion sentence classification. First, sentiment and event elements are extracted from Chinese-Vietnamese bilingual sentences to construct the sentence association graph, and the TextRank algorithm is used to obtain sentence association features. Then, based on bilingual word embeddings and bidirectional LSTM, Chinese-Vietnamese bilingual news is encoded within the same semantic space. Finally, the sentence encoding features and the sentence association features are combined to recognize opinion sentences. Experiments show that the method achieves good results on multi-document Chinese-Vietnamese bilingual news opinion sentence recognition.

(3) A sentiment tendency analysis model for Chinese-Vietnamese bilingual opinion sentences based on the sentence association graph is proposed, realizing sentiment tendency analysis of Chinese and Vietnamese news texts. To address the poor performance of sentiment analysis on Vietnamese news texts, this thesis proposes a sentiment analysis method for Chinese-Vietnamese bilingual multi-document news based on the sentence association graph. The method first builds a double-layer BiLSTM classification model that takes Chinese and Vietnamese news text as input. By mapping vectors from two different spaces into the same vector space and encoding them there, the relevant information of the Chinese and Vietnamese news texts is obtained. At the same time, the Chinese and Vietnamese news texts are used to construct an undirected sentence association graph. Finally, an attention mechanism is used to improve the expressive power of the feature vectors, and the sentiment tendency is classified. Experimental results show that this method better captures the sentiment information of news and improves sentiment classification of Vietnamese news.

(4) A prototype system for Chinese-Vietnamese bilingual news opinion sentence extraction and sentiment tendency classification is built. Using the above research results, a prototype system for Chinese-Vietnamese bilingual news text sentiment analysis is designed and implemented. It integrates data acquisition, the opinion sentence recognition model, and the sentiment analysis model to provide users with a visual information acquisition platform.
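
The sentence association step in item (2) can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes each sentence has already been reduced to a set of sentiment and event elements, that edge weights are the Jaccard overlap of those sets, and that a standard weighted TextRank iteration produces one association score per sentence. The function names, the Jaccard weighting, and the sample elements are all hypothetical.

```python
from itertools import combinations


def build_association_graph(sentence_elements):
    """Build an undirected weighted graph as a symmetric adjacency matrix.

    Assumption (not from the thesis): the weight between two sentences is
    the Jaccard overlap of their extracted sentiment/event element sets.
    """
    n = len(sentence_elements)
    weights = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        a, b = sentence_elements[i], sentence_elements[j]
        union = a | b
        w = len(a & b) / len(union) if union else 0.0
        weights[i][j] = weights[j][i] = w
    return weights


def text_rank(weights, damping=0.85, iterations=50):
    """Weighted TextRank: each node shares its score with its neighbours
    in proportion to edge weight; isolated nodes keep only the base score."""
    n = len(weights)
    if n == 0:
        return []
    scores = [1.0 / n] * n
    out_sum = [sum(row) for row in weights]  # total edge weight per node
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            incoming = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


# Hypothetical element sets for four sentences (already mapped to a shared
# vocabulary across Chinese and Vietnamese).
elements = [
    {"trade", "tariff", "criticize"},
    {"trade", "tariff", "welcome"},
    {"trade", "cooperation"},
    {"weather"},
]
graph = build_association_graph(elements)
scores = text_rank(graph)
```

In this sketch the fourth sentence shares no elements with the others, so it receives only the base score, while the three trade-related sentences reinforce one another. The resulting scores would then be combined with the BiLSTM sentence encodings for classification, a step not shown here.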
Keywords/Search Tags:Recognition of opinion sentences in Chinese and Vietnamese news, Sentence-associated undirected graph, Double-layer coding, Cross-lingual sentiment analysis