Research On Short Text Sentiment Analysis Technology Based On Extended Sentiment Dictionary

Posted on: 2021-05-05
Degree: Master
Type: Thesis
Country: China
Candidate: X Gao
GTID: 2438330602986665
Subject: Information and Communication Engineering
Abstract/Summary:
In recent years, with the spread of the Internet and rising living standards, more people have come to shop online, and online shopping platforms such as Tmall have developed rapidly. This large volume of online shopping generates a large amount of comment text, and the sentiment orientation of these comments is an important reference for buyers' purchase decisions and for sellers' adjustment of sales strategy, optimization of publicity focus, and market positioning of products. However, comment text is typically short text, for which traditional sentiment analysis methods are not well suited, and research on sentiment analysis of such text is still in its infancy. To further improve the effect of sentiment analysis on short text, this paper takes Tmall comments as the research object and presents a short text sentiment analysis scheme based on an extended sentiment dictionary. The main contents of this paper are the acquisition and processing of research data, the construction of the extended sentiment dictionary, the sentiment analysis of the comment texts, and the design and development of an application system.

In terms of data acquisition and processing, a Python program was written to obtain the research data, retain the meaningful data, and store it in a specified format. The first comment and the additional comment were then combined, the emotional polarity of each comment text was labeled manually, and neutral comments with no obvious emotional tendency were removed. Finally, the comment text with emotional polarity was preprocessed, including punctuation removal, word segmentation, and stopword removal.

In terms of the construction of the extended sentiment dictionary, this paper proposes a construction method based on the fusion algorithm STSA (SnowNLP TF-IDF Synonym Algorithm) in order to increase the accuracy and coverage of sentiment dictionaries. First, the deduplicated union of the general sentiment dictionary HowNet, the simplified Chinese sentiment dictionary NTUSD of National Taiwan University, and the commendatory and derogatory word dictionary THUOCL of Tsinghua University is taken as the auxiliary sentiment dictionary. Then, the TF-IDF algorithm is integrated into the SnowNLP-based method for conventional long text sentiment analysis to calculate the sentiment score of each sentiment word. Finally, each sentiment word is extended with its synonyms and near-synonyms, and the similarity score between each synonym or near-synonym and the corresponding sentiment word is calculated. These steps yield the extended sentiment dictionary.

In the sentiment analysis of the comment texts, three comparative experiments were designed to verify the validity of the extended sentiment dictionary. First, the experimental data were acquired by a web crawler, then cleaned, manually labeled, and preprocessed to obtain the experimental corpus. Next, short text sentiment analysis was carried out separately with the method based on the extended sentiment dictionary, with SnowNLP, and with Naive Bayes, using the same test corpus in each case. Finally, the precision P, recall R, and F1 value of the three experiments were calculated. The results show that the method based on the extended sentiment dictionary outperforms both the SnowNLP-based method and the Naive Bayes method on these evaluation metrics, which indicates that the research scheme of this paper is reasonable and effective.

In terms of the design and development of the application system, a short text sentiment analysis system was designed and developed in order to better apply the proposed research scheme. To achieve stable data storage and fast read and write operations, MySQL was used to build a short text sentiment analysis database that stores the important data the system needs in order to run, together with the results produced while it runs. This database improves data read speed, the flexibility of data operations, and the manageability of data. Testing shows that the system performs automated short text sentiment analysis, visually displays the important result data from the analysis process, and gives reasonable purchase advice based on the analysis results. The system presents results to users more intuitively and provides a better user experience.
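
As a rough illustration of three steps named in the abstract (taking the deduplicated union of several sentiment dictionaries, classifying a segmented short text with a weighted dictionary, and computing precision, recall and F1), the following Python sketch may help. It is not the thesis's STSA implementation: the function names, the English stand-in words, the weights and the decision threshold of zero are all hypothetical, and the input is assumed to be already segmented into tokens.

```python
from typing import Dict, Iterable, List, Set, Tuple


def merge_lexicons(*lexicons: Iterable[str]) -> Set[str]:
    """Union several word lists; using a set removes duplicate entries."""
    merged: Set[str] = set()
    for lexicon in lexicons:
        # Strip whitespace and skip empty entries before merging.
        merged.update(word.strip() for word in lexicon if word.strip())
    return merged


def score_tokens(tokens: List[str], weights: Dict[str, float]) -> float:
    """Sum the weights of tokens found in the dictionary (0.0 if none match)."""
    return sum(weights.get(token, 0.0) for token in tokens)


def classify(tokens: List[str], weights: Dict[str, float]) -> str:
    """Label a text 'pos' when its score is non-negative (assumed threshold)."""
    return "pos" if score_tokens(tokens, weights) >= 0 else "neg"


def precision_recall_f1(
    gold: List[str], predicted: List[str], positive: str = "pos"
) -> Tuple[float, float, float]:
    """Compute precision P, recall R and F1 for the chosen positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical stand-ins for entries from three source dictionaries.
    auxiliary = merge_lexicons(["good", "bad"], ["good", "poor"], ["great"])
    # Hypothetical weights; the thesis derives its scores differently.
    weights = {"good": 1.0, "great": 1.5, "bad": -1.0, "poor": -1.2}
    texts = [["great", "phone"], ["poor", "quality"]]
    predicted = [classify(tokens, weights) for tokens in texts]
    print(sorted(auxiliary), predicted)
    print(precision_recall_f1(["pos", "neg"], predicted))
```

In this sketch the dictionary weights are supplied by hand; in the scheme described above they would instead come from the combined SnowNLP and TF-IDF scoring and from the synonym similarity scores.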
Keywords/Search Tags: Short text sentiment analysis, STSA, Extended sentiment dictionary, SnowNLP, TF-IDF