Emotional Analysis Research And Application Of Data Mining Based On Social Texts

Posted on: 2021-03-10 | Degree: Master | Type: Thesis
Country: China | Candidate: Y X Chen | Full Text: PDF
GTID: 2428330626463485 | Subject: Circuits and Systems
Abstract/Summary:
With the rapid development of networks and data services, large amounts of semi-structured and unstructured data, such as pictures and text, now exist. This kind of data is characterized by great quantity, wide variety, and time sensitivity. How to mine high-quality information from massive data has become a hot issue, and emotional analysis based on social texts is an effective approach to mining network text data. Because online speech is largely unrestricted, especially in web environments prone to low-quality posting, people have a strong demand for accurate and authentic information. At present, natural language processing technology is mostly developed for product reviews, news reports, and similar fields. In specific domains, however, the results of word segmentation algorithms and sentiment judgment are still not satisfactory. This paper concentrates on the following three aspects.

First, on the basis of TRIE-tree word-graph scanning, this paper applies mutual information and left-right entropy to find the maximum word-frequency segmentation and identify new words. It compares the effect of several classical statistics on new-word detection in different databases and finally selects mutual information as the internal statistic and left and right information entropy as the external statistics.

Second, using the Scrapy crawler framework on Python 3.6, corpora from each network platform are collected and imported into a MongoDB database. The data are preprocessed with mainstream Chinese dictionaries, and natural language processing techniques (manual and semi-automatic dictionary construction) are then used to build sentiment dictionaries (positive, negative, and privative). With the help of these sentiment dictionaries, a new feature selection method called SoA is proposed.

Third, building on the classical Bayesian algorithm, this paper improves the algorithm by means of the three sentiment dictionaries above. In addition, PSO (Particle Swarm Optimizer) is used to recognize and refine the emotional tendency of each text. Compared with a variety of prediction methods (KNN, maximum entropy, SVM), the improved Bayesian algorithm based on the sentiment dictionaries reaches an average accuracy of 86.85%. The results show that this method obtains better prediction results.
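The two new-word statistics named in the first aspect can be illustrated with a minimal sketch. It assumes character bigrams as candidate words, natural-log units, and a plain string as the corpus; the function names `entropy` and `score_bigrams` are hypothetical and this is not the thesis's implementation, which also relies on TRIE-tree word-graph scanning that is not reproduced here.

```python
import math
from collections import Counter, defaultdict


def entropy(counter):
    """Shannon entropy (natural log) of a Counter of neighbouring characters."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counter.values())


def score_bigrams(text):
    """Return {bigram: {"pmi", "left_entropy", "right_entropy"}} for a string.

    PMI measures internal cohesion (do the two characters co-occur more
    often than chance?); left/right entropy measures external freedom
    (how varied are the characters on each side of the candidate?).
    """
    n_chars = len(text)
    n_bigrams = n_chars - 1
    char_freq = Counter(text)
    bigram_freq = Counter(text[i:i + 2] for i in range(n_bigrams))

    left = defaultdict(Counter)   # bigram -> counts of the preceding character
    right = defaultdict(Counter)  # bigram -> counts of the following character
    for i in range(n_bigrams):
        gram = text[i:i + 2]
        if i > 0:
            left[gram][text[i - 1]] += 1
        if i + 2 < n_chars:
            right[gram][text[i + 2]] += 1

    scores = {}
    for gram, count in bigram_freq.items():
        p_xy = count / n_bigrams
        p_x = char_freq[gram[0]] / n_chars
        p_y = char_freq[gram[1]] / n_chars
        scores[gram] = {
            "pmi": math.log(p_xy / (p_x * p_y)),
            "left_entropy": entropy(left[gram]),
            "right_entropy": entropy(right[gram]),
        }
    return scores


# Toy corpus: "ab" recurs with a different neighbour each time.
scores = score_bigrams("abcabdabe")
print(scores["ab"])
```

In a sketch like this, a candidate would be kept as a new word when its PMI and the smaller of its two entropies both exceed chosen thresholds; the abstract does not state threshold values, so they are left as tuning choices.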
Keywords/Search Tags: Data Mining, Mutual Information, Bayesian Probability, Sentiment Analysis, Natural Language Processing