Research And Implementation Of Emotional Analysis Methods Based On Commodity Reviews In Electronic Products

Posted on: 2020-09-21  Degree: Master  Type: Thesis
Country: China  Candidate: J Wen  Full Text: PDF
GTID: 2428330575495171  Subject: Electronic and communication engineering
Abstract/Summary:
Existing sentiment classification of product reviews divides them into positive and negative according to the number of stars assigned by users. However, users apply star ratings inconsistently, which leads to large errors in review classification. Product reviews are an important reference for potential consumers deciding whether to buy a product. When a review's sentiment does not match its star rating, potential customers often come to doubt the product itself. Product reviews are also an indispensable source of information for merchants developing sales strategies and improving product performance and service quality. A single product may have thousands of comments, and manual analysis consumes a great deal of manpower and resources, so automated analysis is very important.

This thesis classifies product reviews using 12,000 mobile phone reviews crawled from JD Mall as representative of electronic products. Three problems reduce classifier performance: keyword extraction is inaccurate on short texts, the diversity among the initial classifiers in the Tri-Train algorithm is uncertain, and implicit confidence filtering adds noisy data to the training set. The existing keyword extraction technique and the Tri-Train algorithm therefore need to be improved. The main work of the thesis is as follows:

(1) Text comments on electronic products tend to be colloquial and full of internet slang, and new words appear frequently. jieba has limited ability to identify new words, so its segmentation results are not very accurate. To solve this problem, the thesis uses mutual information and left and right entropy to discover new words in the corpus crawled from JD Mall, adds these new words to the segmentation dictionary, and then segments the text with the extended dictionary to improve segmentation accuracy.

(2) To address the poor performance of existing keyword extraction, the thesis analyzes the advantages and disadvantages of TF-IDF keyword extraction and sentiment-dictionary keyword extraction. It proposes a feature selection scheme that combines sentiment-dictionary keyword extraction with TF-IDF keyword extraction and uses Word2vec to represent text features, so that the two extraction methods complement each other. To improve sentiment-dictionary keyword extraction, the thesis also constructs a domain dictionary for mobile phone reviews from an existing corpus of mobile phone review text. Experiments show that the method improves the accuracy and F1 value of the classifier.

(3) Tri-Train is a commonly used semi-supervised algorithm that can, to some extent, improve classification performance by exploiting unlabeled data. It currently has two problems. First, the differences among the three initial classifiers are neither stable nor large enough. Second, because it uses implicit rather than explicit confidence filtering, it introduces noisy data, which degrades classifier performance to some extent. To solve the first problem, the thesis trains three different supervised algorithms, rather than a single one, on data drawn by sampling with replacement, which increases the differences among the three classifiers. To solve the second problem, the thesis uses text similarity to filter out noisy data. Experiments show that these changes improve the classification performance of the Tri-Train algorithm to some extent.
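The new-word discovery step in item (1) can be illustrated with a minimal sketch. The abstract does not give the thesis's exact formulas or thresholds, so the following assumes the common formulation: pointwise mutual information (PMI) measures how strongly two adjacent tokens stick together, and the entropy of the tokens seen to the left and right of the pair measures how freely the pair occurs in different contexts. The function name `score_candidates` and the `min_count` parameter are hypothetical, chosen only for this illustration.

```python
import math
from collections import Counter, defaultdict


def entropy(counter):
    """Shannon entropy (base 2) of a neighbor-frequency counter."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in counter.values())


def score_candidates(tokens, min_count=2):
    """Score adjacent token pairs as new-word candidates.

    Returns {pair: (pmi, left_entropy, right_entropy)} for pairs that
    occur at least `min_count` times (a hypothetical cutoff).
    """
    n = len(tokens)
    unigram = Counter(tokens)
    bigram = Counter(zip(tokens, tokens[1:]))

    # Collect the tokens seen immediately left and right of each pair.
    left = defaultdict(Counter)
    right = defaultdict(Counter)
    for i in range(n - 1):
        pair = (tokens[i], tokens[i + 1])
        if i > 0:
            left[pair][tokens[i - 1]] += 1
        if i + 2 < n:
            right[pair][tokens[i + 2]] += 1

    scores = {}
    for pair, count in bigram.items():
        if count < min_count:
            continue
        p_xy = count / (n - 1)
        p_x = unigram[pair[0]] / n
        p_y = unigram[pair[1]] / n
        pmi = math.log2(p_xy / (p_x * p_y))
        scores[pair] = (pmi, entropy(left[pair]), entropy(right[pair]))
    return scores
```

Pairs with high PMI and high entropy on both sides are plausible new words. In a pipeline like the one the abstract describes, the accepted candidates would then be added to the segmenter's dictionary before re-segmenting the corpus; the acceptance thresholds would need to be tuned on the actual data.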
Keywords/Search Tags:Tri-Train, Word2vec, TF-IDF, Sentiment dictionary, Sentiment classification