
Research On Cross-language Sentiment Analysis Method For Chinese And Vietnamese Social Media Text

Posted on: 2023-08-29
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Zhao
Full Text: PDF
GTID: 2555306797982499
Subject: Software engineering
Abstract/Summary:
With the rapid development of the Internet, sentiment analysis of Chinese and Vietnamese social media data about specific commodities or the same hot event has great research and application value. It makes it possible to grasp the opinions and needs of people in the two countries and to track public opinion trends in both, which in turn supports the analysis, monitoring, and early warning of hot events. Cross-lingual sentiment analysis of Chinese and Vietnamese social media texts suffers from the scarcity of annotated Vietnamese data, difficulty in aligning sentiment representation mappings, insufficient learning of comment features, insufficient use of linguistic knowledge, and inaccurate semantic representation, all of which lead to low accuracy. To address these problems, this thesis completes the following work on Chinese and Vietnamese social media texts:

(1) Construction of a Chinese-Vietnamese Sentiment Classification Dataset

Because of language barriers and the lack of a Chinese-Vietnamese social media corpus, it is difficult to obtain high-quality annotated Chinese-Vietnamese social media data, which limits the accuracy of Vietnamese sentiment analysis. To support model training, this thesis uses a template-free web content extraction method built on the Scrapy crawler framework to collect social media data from Weibo and Twitter through keyword search. Data labeling methods are designed to match the needs of Chinese-Vietnamese sentiment classification tasks in different application scenarios, the collected data is labeled according to the task requirements, and a Chinese-Vietnamese sentiment classification dataset is constructed. This work provides important support for the following research
points.

(2) A Chinese-Vietnamese Cross-Language Sentiment Tendency Analysis Method Based on Emotional Semantic Confrontation

The task is to analyze the sentiment orientation of Vietnamese product reviews. Existing models struggle with insufficient sentiment representation learning and inaccurate cross-language sentiment representation mapping between Chinese and Vietnamese, so the accuracy of sentiment orientation analysis for low-resource languages such as Vietnamese is low. Sentiment words can strengthen the learning of sentiment representations, and an adversarial network can reduce language differences. This thesis therefore proposes a cross-language sentiment orientation analysis model based on emotional semantic confrontation, which integrates sentiment words with comment features and uses adversarial training to narrow the difference between Chinese and Vietnamese sentiment features. Adversarial learning drives the model toward representations with the smallest difference in language distribution, and the classifier is then trained on labeled Chinese comments to complete the sentiment classification task. Experimental results show that the model aligns bilingual sentiment semantics well and improves accuracy by 3.1 percentage points over the baseline model, a significant gain. The method also shows clear advantages on language pairs that differ to varying degrees.

(3) A Chinese-Vietnamese Cross-Language Sentiment Tendency Analysis Method Based on Graph Neural Network

The task is to analyze the sentiment tendency of Vietnamese comments on the same hot event. Social media comments have problems such as diverse forms of expression, weak contextual relations, and
insufficient representation. Chinese-Vietnamese bilingual text is used to help the model understand Vietnamese comments. The comment text also carries syntactic information, which can help the model further understand its semantics. This thesis therefore proposes a cross-lingual sentiment tendency analysis method based on Chinese and Vietnamese textual information and Vietnamese syntactic guidance. First, encoders and cross-attention networks produce Vietnamese comment representations that incorporate Chinese and Vietnamese textual information. A graph convolution module then models the syntactic information of the Vietnamese comments, which improves the model's understanding of their semantics and thus the accuracy of sentiment tendency analysis. The results show that the proposed method improves accuracy by 3.7 percentage points over the benchmark model, a significant improvement.

(4) Building a Prototype System for Sentiment Tendency Analysis of Chinese and Vietnamese Social Media Texts

Using the above research results, a prototype system for sentiment analysis of Chinese and Vietnamese social media texts is designed and implemented. The system uses crawler technology to collect social media data from Weibo and Twitter according to keywords given by the user. It then applies the Chinese-Vietnamese cross-language sentiment analysis model proposed in this thesis to the collected data and provides users with the sentiment analysis they need for commodities and hot events. The system adopts a B/S (browser/server) architecture and integrates a data acquisition module, a data analysis module, and a page display module, giving users a visual platform for obtaining information.
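
To make the graph convolution step in (3) more concrete, the following is a minimal sketch of one graph convolution layer applied over a dependency tree. It assumes the widely used propagation rule ReLU(Â·H·W) with a symmetrically normalized adjacency matrix and self-loops; the abstract does not state which GCN variant the thesis uses, so this is an illustration only. All names (`normalized_adjacency`, `gcn_layer`, `heads`) and the toy three-token sentence are hypothetical.

```python
import math


def normalized_adjacency(heads):
    """Build a symmetrically normalized adjacency matrix with self-loops.

    heads[i] is the index of token i's dependency head, or -1 for the root.
    Dependency edges are treated as undirected (an assumption of this sketch).
    """
    n = len(heads)
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:
            adj[child][head] = 1.0
            adj[head][child] = 1.0
    deg = [sum(row) for row in adj]
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(a_hat, features, weight):
    """One graph convolution: ReLU(a_hat @ features @ weight)."""
    hidden = matmul(matmul(a_hat, features), weight)
    return [[max(0.0, v) for v in row] for row in hidden]


# Hypothetical 3-token comment: token 1 is the root; tokens 0 and 2 attach to it.
heads = [1, -1, 1]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy token representations
weight = [[1.0, 0.0], [0.0, 1.0]]                # identity weights for clarity
a_hat = normalized_adjacency(heads)
output = gcn_layer(a_hat, features, weight)
```

In this sketch each token's output mixes its own representation with those of its syntactic neighbours, which is the sense in which a graph convolution module can inject dependency structure into comment representations. In a trained model, `features` would come from the encoder and cross-attention stage and `weight` would be learned.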
Keywords/Search Tags:Chinese-Vietnamese Social Media Text, Cross-language Sentiment Analysis, Graph Convolutional Neural Network, Dependency Syntax