Research On Short Text Sentiment Analysis Based On XLNet Pre-trained Language Model

Posted on: 2022-08-05 | Degree: Master | Type: Thesis
Country: China | Candidate: S R Liang | Full Text: PDF
GTID: 2518306521451904 | Subject: Computer Science and Technology
Abstract/Summary:
With the rapid spread and development of the Internet, the way people socialize has changed drastically. People readily communicate and share their attitudes and opinions online, and the texts that express these attitudes are called short comment texts. Sentiment analysis of short texts can give decision-makers reference information so that they can carry out more targeted work. Research on short-text sentiment analysis is therefore of great significance.

Because short texts are mostly limited to 200 words and are heavily colloquial, short-text data offers few textual features and contains a lot of noise. Existing sentiment analysis models face two challenges on short texts: how to capture more semantic information from a limited context, and how to recognize the correct sentiment polarity of colloquial expressions and polysemous words in a rich language environment.

This paper analyzes two kinds of sentiment analysis models: those based on sentiment dictionaries and those based on deep learning. The analysis finds that a sentiment dictionary assigns each sentiment word a single, fixed polarity, while deep learning relies on a word vector model to vectorize the text. The static Word2vec word vector model considers only the information within its "window", so the semantic information in the resulting word vectors is incomplete. Since pre-trained language models were proposed, they have achieved excellent performance on a variety of natural language processing tasks. They can learn new knowledge from a corpus and are expected to offer solutions to the problems faced by short-text sentiment classification.

To solve the above problems, two models are proposed: an XLNet pre-training model fused with a sentiment dictionary, and an XLNet pre-training model with an LSTM+Attention network layer.

(1) In the XLNet pre-training model fused with a sentiment dictionary, the sentiment words are spliced with the original text to increase the share of sentiment-polarity information in the text, and the XLNet model is used to fully learn the contextual semantic information. This addresses the defect that sentiment dictionaries cannot identify the correct polarity of sentiment words in different language environments, and it expands the range of application of sentiment dictionaries.

(2) In the XLNet pre-training model with an LSTM+Attention network layer, the network layer is added on top of the XLNet pre-training model that learns the word vectors. The network layer further learns the word vectors and raises the weights of the important ones. This enables the model to capture more contextual semantic information, so the extracted word vectors are richer and more accurate in their semantics. It also overcomes the shortcomings of the static word vector model and is better suited to short-text sentiment analysis tasks.

A sentiment analysis task is built on the models proposed in this paper, which are implemented and verified in Python. The effectiveness of the models is verified through multiple sets of comparative experiments on a public data set, and their performance is evaluated with three indicators: precision, recall and F1 value. The results show that the XLNet pre-training model fused with a sentiment dictionary can improve the accuracy and range of application of the sentiment dictionary. The XLNet pre-training model with an LSTM+Attention network layer can extract higher-quality word vectors and obtains the best results in the experiments.
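The splicing step described in (1) can be illustrated with a small sketch. It is not the thesis implementation: the abstract does not say how the sentiment words are joined to the text, so the toy lexicon `TOY_LEXICON`, the separator string, the whitespace tokenization and the function name `splice_sentiment_words` are all assumptions made for illustration.

```python
# Hypothetical toy lexicon; a real sentiment dictionary would be far larger.
TOY_LEXICON = {"great", "terrible", "love", "slow"}


def splice_sentiment_words(text, lexicon=TOY_LEXICON, sep=" <sep> "):
    """Append the lexicon words found in `text` to the end of the text.

    Assumption: tokens are whitespace-separated and matched case-insensitively
    after stripping basic punctuation. The separator is a placeholder, not a
    token defined by the thesis.
    """
    tokens = [tok.strip(".,!?").lower() for tok in text.split()]
    hits = []
    for tok in tokens:
        # Keep each matched sentiment word once, in order of first appearance.
        if tok in lexicon and tok not in hits:
            hits.append(tok)
    if not hits:
        return text  # nothing to splice; leave the text unchanged
    return text + sep + " ".join(hits)


if __name__ == "__main__":
    print(splice_sentiment_words("Great phone, but slow!"))
```

The spliced string would then be passed to the pre-trained model's tokenizer in place of the raw text, so that the sentiment words appear twice in the input.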
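The three evaluation indicators named above follow the standard definitions, sketched here for one class treated as positive. The function name `precision_recall_f1` and the sample labels are hypothetical; the abstract does not state how the thesis averages the scores across classes.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1 for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up labels for illustration only (1 = positive sentiment, 0 = negative).
    y_true = [1, 1, 1, 0, 0]
    y_pred = [1, 1, 0, 1, 0]
    print(precision_recall_f1(y_true, y_pred))
```

In this made-up example there are two true positives, one false positive and one false negative, so precision, recall and F1 all equal 2/3.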
Keywords/Search Tags: sentiment analysis, XLNet pre-training model, sentiment dictionary, Word2vec, deep learning