With the growth of the economy and the Internet, the e-commerce industry has gradually emerged and online shopping has matured. Unlike in traditional brick-and-mortar stores, online consumers can rely only on product reviews to obtain real information about product or service quality before deciding whether to buy. As consumers pay more and more attention to product reviews, some businesses have started to hire click farms to inflate the sales of their products and to post large numbers of fake positive reviews. False reviews mislead consumers in a way that not only damages their interests but also undermines fair market competition. How to identify false reviews has therefore become a pressing issue in recent years. Because of the huge volume of data and the concealed nature of fake reviews, it is very difficult for ordinary consumers to spot them at a glance. This paper therefore studies the false review identification problem with machine learning algorithms, applying both traditional machine learning models and deep learning neural network models. The work covers three aspects. First, the review data of a best-selling product on the JD.COM platform was collected with Python, and a labeled data set was constructed by annotating the reviews according to click-farm characteristics, review time, and expert guidance. Second, the Chinese text was preprocessed with natural language processing techniques, and the TF-IDF model was used to transform the text into vectors and extract features. Four models (logistic regression, Naive Bayes, support vector machine, and random forest) were used to identify false reviews, and the support vector machine achieved the best recognition performance. Third, the text was converted into vectors using a word-embedding representation, and the TextCNN and LSTM models were used to identify false reviews. Judged by the loss function and the evaluation metrics, the LSTM model achieved the better recognition performance of the two. Finally, the classification performance of the support vector machine and LSTM models was verified on a Taobao platform data set. Taking the training time of the two models into account, the LSTM model is found to have greater practical significance for false review recognition.
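
The TF-IDF step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which TF-IDF variant or tokenizer was used, so the smoothed IDF formula, the function name `tfidf_vectors`, and the toy reviews below are all assumptions made for illustration. The input is assumed to be already segmented into tokens (for Chinese text this would require a separate word-segmentation step).

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) for a list of tokenized documents.

    Hypothetical helper: uses relative term frequency and a smoothed IDF,
    one common formulation, not necessarily the one used in the paper.
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF keeps every weight finite and strictly positive.
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        vectors.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, vectors


# Hypothetical, already-tokenized toy reviews (not data from the paper).
reviews = [
    ["great", "great", "great", "buy"],
    ["battery", "lasts", "two", "days"],
    ["great", "battery"],
]
vocab, vectors = tfidf_vectors(reviews)
```

Each review becomes a fixed-length vector with one weight per vocabulary term, which is the form that classifiers such as logistic regression or a support vector machine take as input.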