Research On The Extraction And Sentiment Classification Of Chinese-Vietnamese Cross-language Comparable News Opinion Sentences

Posted on: 2022-04-22; Degree: Master; Type: Thesis
Country: China; Candidate: Q L Song
GTID: 2518306524951619; Subject: Instrumentation engineering
Abstract/Summary:
With the migration of some labor-intensive industries to Vietnam, exchanges and cooperation between China and Vietnam in the political and economic fields are becoming closer. A timely grasp of the opinions expressed by the Vietnamese news media has an important influence on political and economic exchanges between the two countries. Because of the language barrier, it is difficult to manually understand and analyze public opinion in Vietnamese news obtained from the Internet. At the same time, Chinese-Vietnamese comparable corpora show a high degree of content relevance and closely related topics, with only partial differences in topic words, so topic information can be shared between the two languages. Based on this characteristic, this thesis studies methods for opinion sentence extraction and sentiment classification of Chinese-Vietnamese comparable news, using Chinese news comparable to the Vietnamese news to obtain the opinion sentences and sentiment of Vietnamese news more accurately. The main research work is as follows:

(1) A method for extracting opinion sentences from Chinese-Vietnamese news that incorporates shared topic information. Chinese-Vietnamese comparable news describes similar news content and has shareable news topics. The shared topic features extracted from it provide important guidance for opinion sentence extraction and can compensate for the low accuracy of monolingual topic features caused by corpus scarcity. First, LDA topic modeling is performed on the Chinese-Vietnamese comparable news corpus to extract the topic features of Vietnamese news and its comparable Chinese news, so that each Vietnamese text obtains the corresponding Chinese-Vietnamese shared topic features. Then, a bilingual topic vocabulary and a bilingual sentiment dictionary are used to train a bilingual word embedding model that encodes Chinese and Vietnamese in the same semantic space, addressing the imbalance in the Chinese-Vietnamese labeled corpus. Finally, topic, position, and sentiment features are integrated into the word vectors, so that the semantic information of Vietnamese sentences is combined with topic, sentiment, and position information and opinion sentences can be recognized more reliably. Experimental results show that integrating shared topic features effectively improves the accuracy of multi-document opinion sentence recognition, demonstrating the effectiveness of the method.

(2) A Chinese-Vietnamese comparable news sentiment classification method that incorporates the features of multiple opinion sentences. To limit the noise introduced with the Chinese labeled corpus, Chinese texts that are close to Vietnamese in semantic expression must be selected. At the same time, opinion sentences are an important carrier of sentiment in an article, and ranking and weighting multiple opinion sentences introduces the opinion sentence information of a text in a more hierarchical manner to assist sentiment classification. Therefore, high-quality Chinese texts are first selected as effective input by computing opinion sentence relevance on the Chinese-Vietnamese comparable corpus. The key information of the previously obtained opinion sentences is then ranked and weighted to obtain a multi-opinion-sentence feature matrix, which is integrated into a selective gating network. The Transformer's self-attention mechanism is used to attend to key information, and the news sentiment classification result is finally obtained through softmax. Experiments show that relevance screening and the integration of multi-opinion-sentence features effectively improve Vietnamese news sentiment classification.
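The abstract does not give the formulas for the ranking weights or the gate, so the following is only a minimal sketch of how ranked opinion sentence features might be pooled, fused with a document representation through a gate, and classified with softmax. The rank-based weights (1/rank), the per-dimension sigmoid gate, and all function and parameter names are assumptions made for illustration, not the thesis's actual design; the self-attention step is omitted.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def weighted_opinion_feature(sentence_vectors, relevance_scores):
    """Rank opinion sentences by relevance, then pool them with
    normalized 1/rank weights (an assumed weighting scheme)."""
    ranked = sorted(zip(relevance_scores, sentence_vectors),
                    key=lambda pair: pair[0], reverse=True)
    raw = [1.0 / (rank + 1) for rank in range(len(ranked))]
    total = sum(raw)
    weights = [r / total for r in raw]
    dim = len(ranked[0][1])
    return [sum(w * vec[i] for w, (_, vec) in zip(weights, ranked))
            for i in range(dim)]

def selective_gate(doc_vector, opinion_vector, gate_weights, gate_bias):
    """Assumed per-dimension gate: g = sigmoid(w_d*d + w_o*o + b),
    output = g*d + (1 - g)*o."""
    fused = []
    for d, o, (w_d, w_o), b in zip(doc_vector, opinion_vector,
                                   gate_weights, gate_bias):
        g = sigmoid(w_d * d + w_o * o + b)
        fused.append(g * d + (1.0 - g) * o)
    return fused

def classify(fused, class_weights, class_bias):
    """Linear layer followed by softmax over sentiment classes."""
    logits = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(class_weights, class_bias)]
    return softmax(logits)
```

In a trained model the gate and classifier parameters would be learned; here they are plain arguments so the data flow from ranked opinion sentences to class probabilities is easy to follow.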
Keywords/Search Tags: Chinese-Vietnamese comparable news, shared topic features, Chinese-Vietnamese news opinion sentence extraction, Chinese-Vietnamese news sentiment classification, selective gated network, Transformer model