Research On Data Mining Technology Based On Book Review

Posted on: 2018-02-24    Degree: Master    Type: Thesis
Country: China    Candidate: W J Hao    Full Text: PDF
GTID: 2348330515473777    Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the internet, online information has become increasingly complex. To obtain useful information, data mining technology is needed to extract and analyze online text. Book reviews contain users' evaluations of a book's attributes and of the purchasing process, so this thesis focuses on how to extract valuable information from them. Taking book reviews from two websites, Amazon and Jingdong, as the data source, it studies the data mining techniques used for book feature extraction and emotion (sentiment) analysis, and uses the mined results to help consumers and producers make informed decisions.

First, the thesis analyzes the structure of the web pages and extracts the book reviews to build the original review dataset. It processes the dataset with word segmentation and part-of-speech tagging and builds a stop-word list to filter out stop words, which yields the original corpus. Then it normalizes the review sentences with a redundancy vocabulary, and extracts and compares book features from the reviews using the Apriori, FP-Growth, and TF-IDF algorithms. On this basis, it improves the FP-Growth algorithm and uses it to mine the book features. Next, it builds a sentiment dictionary to identify the opinions expressed in the reviews, studies and optimizes SVM feature selection, and carries out coarse-grained sentiment mining of the reviews. It then performs fine-grained sentiment mining based on "two-way judgment" and the sentiment dictionary to obtain the sentiment polarity of each specific book feature. Finally, it uses visualization to display the results and calculates the degree of match between user needs and books to help consumers make purchase decisions.

The main research results are as follows. Firstly, the redundancy vocabulary is used to replace redundant words, which reduces redundancy in the extracted frequent itemsets. Secondly, the FP-Growth algorithm is improved: a length weight is added to the algorithm's support measure, and the extracted features are sorted by confidence, which improves the recall and accuracy of the algorithm. Thirdly, SVM feature selection is optimized by adding the review's star rating as a vector feature during model construction, which improves the accuracy of sentiment analysis. Fourthly, "two-way judgment" is used to build sentiment relations, which achieves fine-grained sentiment analysis of the book reviews.
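The frequent-itemset step described above can be illustrated with a small sketch. It is not the thesis's improved FP-Growth (the length weight and confidence sorting are not specified in the abstract); it is a plain Apriori-style support count over candidate feature words. The function name `frequent_itemsets`, the toy `reviews` data, and the threshold of 0.4 are all assumptions made for illustration, and each review is assumed to have already been segmented and reduced to candidate nouns.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {frozenset(itemset): support} for itemsets up to max_size.

    Support is the fraction of transactions (here: reviews) that contain
    every word in the itemset. Hypothetical helper, not from the thesis.
    """
    n = len(transactions)
    baskets = [frozenset(t) for t in transactions]
    frequent = {}
    # Level 1 candidates: every single word that appears in any review.
    candidates = {frozenset([item]) for b in baskets for item in b}
    size = 1
    while candidates and size <= max_size:
        counts = Counter()
        for b in baskets:
            for c in candidates:
                if c <= b:  # candidate is a subset of this review
                    counts[c] += 1
        level = {c: counts[c] / n for c in candidates
                 if counts[c] / n >= min_support}
        frequent.update(level)
        # Next level: join frequent words, keeping only combinations whose
        # sub-itemsets are all frequent (the Apriori pruning property).
        size += 1
        items = sorted({i for c in level for i in c})
        candidates = {
            frozenset(combo)
            for combo in combinations(items, size)
            if all(frozenset(sub) in level
                   for sub in combinations(combo, size - 1))
        }
    return frequent


# Toy, made-up data: candidate feature nouns per review.
reviews = [
    ["paper", "print", "price"],
    ["paper", "print"],
    ["price", "delivery"],
    ["paper", "print", "delivery"],
    ["plot"],
]
result = frequent_itemsets(reviews, min_support=0.4)
for itemset, support in sorted(result.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), round(support, 2))
```

In this toy run, words that appear in at least two of the five reviews survive as candidate book features, and the pair "paper" and "print" survives as a frequent 2-itemset. A weighted variant such as the one the abstract mentions would change how support is scored, not the overall counting structure.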
Keywords/Search Tags:book reviews, data mining, book features, emotional analysis