
Latent Sentiment Polarity Analysis Of Chinese Texts In Social Networks Based On BERT-LCA

Posted on: 2022-03-19
Degree: Master
Type: Thesis
Country: China
Candidate: K M Chen
Full Text: PDF
GTID: 2518306347992799
Subject: Computer technology
Abstract/Summary:
Since the start of the 21st century, sentiment analysis has become one of the most popular research fields in natural language processing (NLP). NLP research has spread to many other fields, such as politics, medicine, finance, and even philosophy, and it has attracted wide attention across society because of its commercial importance. The reason for this spread is that opinions are central to almost all human activities: to a considerable extent, what we do is shaped by the opinions of others. When we need to make a decision, we therefore look for constructive opinions. In the rapidly growing Chinese social networks, revealing the emotions hidden behind the words remains a challenge because of the complexity of Chinese vocabulary.

With the rapid growth of Chinese social networks, microblogs have become an efficient way for people to express themselves and are especially popular among young users. Compared with ordinary Chinese texts, microblogs have a distinctive style: they contain more internet slang and character expressions, and also a very large number of polysemous words (the same word with different meanings), which poses a great challenge for sentiment analysis. Most traditional sentiment analysis methods for microblog texts use sentiment dictionaries to identify sentiment polarity, drawing on the synonyms and antonyms in the dictionaries and on the dictionaries' hierarchical structure to measure semantic similarity at the context level and the word level. In addition, positive or negative seed words are introduced and used to classify the sentiment of other words according to semantic distance. Sentiment dictionaries can also be used to analyze the sentiment of individual words in a microblog text and to accumulate their sentiment intensities, which yields the sentiment polarity of the text as a whole.

However, dictionary-based sentiment analysis and traditional machine learning methods have many shortcomings. First, in feature extraction, stop words that carry no emotion can distort the estimated sentiment level of a Chinese text. Second, because of the breadth and depth of the Chinese language, the many possible parts of speech of a word become an important factor affecting model performance: the same word can carry completely opposite sentiment in different contexts. Third, since sentiment expressions vary greatly across domains, sentiment classification faces the problem of domain dependency whether the model is supervised or unsupervised.

Deep learning has developed rapidly in recent years, and more and more researchers have applied it to microblog text analysis. This study applies a hybrid deep learning model to microblog sentiment polarity analysis as a new attempt to address the problems above. It proposes BERT-LCA, a microblog sentiment analysis model that combines the pre-trained language model BERT with LSTM-CNN-Attention (LCA) methods, to investigate the sentiment polarity of microblog texts. The model comprises the following modules:
(1) The BERT pre-trained language model produces a dynamic feature representation of the microblog text, which makes full use of the word [...];
(2) The RNN (the LSTM layer) captures long-term dependencies and models the whole sequence of words;
(3) The CNN extracts local features and position-invariant features well, and takes full account of local feature information and contextual semantic associations in the text, which further improves the model's accuracy on the microblog sentiment analysis task;
(4) The attention mechanism extracts both the global semantic information and the local features of the text, and weights the local features according to the category labels, so as to distinguish the importance of local features in the sentence and find the true meaning of words.

The BERT-LCA model is applied to publicly available single-label Chinese microblog text datasets. On the F1 metric it outperforms the dictionary-based and machine learning Chinese sentiment analysis models, and some of its results match or exceed those of the original Chinese BERT model, the BERT-CNN model, and the BERT-BiLSTM model used for comparison. These results indicate that the model extracts the feature information of the text better and classifies effectively in the task of sentiment polarity analysis of single-label Chinese microblog text. Future work will consider lightweight Transformer models such as ALBERT to improve training efficiency, and will fuse the feature information from every layer of the LCA model, rather than using only the attention and convolutional representations, with the aim of further improving the performance of the revised model.
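
The role of the attention module (4) can be illustrated with a small sketch. The abstract does not specify which attention variant the thesis uses, so the code below assumes plain dot-product attention pooling: each position's feature vector (standing in for a CNN output) is scored against a query vector, the scores are normalized with a softmax, and the features are averaged with those weights. The names `attention_pool`, `features`, and `query`, and all numbers, are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, query):
    """Dot-product attention pooling (an assumed variant, not the thesis's).

    features: list of T vectors, each a list of d floats (e.g. CNN outputs).
    query:    a d-dimensional vector; learned in a real model, fixed here.
    Returns (weights, pooled), where pooled is the weighted sum of features.
    """
    # Score every position by its dot product with the query.
    scores = [sum(f_j * q_j for f_j, q_j in zip(f, query)) for f in features]
    # Turn scores into weights that are positive and sum to 1.
    weights = softmax(scores)
    # Weighted sum of the feature vectors, one output dimension at a time.
    dim = len(query)
    pooled = [sum(w * f[j] for w, f in zip(weights, features)) for j in range(dim)]
    return weights, pooled


# Toy input: three positions with 2-dimensional features (made-up numbers).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]
weights, pooled = attention_pool(features, query)
print(weights, pooled)
```

In this toy input the first and third positions align with the query and receive equal, larger weights, while the second position is down-weighted. In a trained model the query (or a small scoring network) would be learned from the labeled data, which is one way to read "focusing on local features according to the category labels".
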
Keywords/Search Tags:Weibo, BERT, LSTM, CNN, Attention Mechanism