Design And Implementation Of Mining System Based On Online Product Review

Posted on: 2013-01-16
Degree: Master
Type: Thesis
Country: China
Candidate: Q F Fu
Full Text: PDF
GTID: 2248330362465354
Subject: Computer technology
Abstract/Summary:
Based on the characteristics of dynamic websites, this thesis designs an information extraction system based on the DOM structure of pages. The system can accurately cluster web pages, generate a wrapper, extract data from the pages effectively, and save it as structured data.

With the development of e-commerce, online shopping has become increasingly popular. However, faced with a rich supply of different commodities, it is hard for a consumer to pick a satisfactory one. Many product reviews are available, but they are seldom classified and analyzed to mine more valuable information. To address these problems, we analyze review information for cellphones and propose a mining system based on cellphone reviews, which provides more useful information for merchants and consumers.

This thesis covers the following areas.

First, we preprocess the text: we delete meaningless words and segment the remaining text into words, in order to improve the accuracy of text classification.

Second, we build a thesaurus for classification. To ensure high classification accuracy, we set up a product feature corpus, assign weights to the entries of a sentiment polarity word dictionary, and weight the feature words via TF*IDF. We classify the reviews into two categories, good and bad, using the Naive Bayes algorithm.

Finally, we analyze experiments on the proposed mining system. The results demonstrate that the system achieves high accuracy and recall, and that it is rational and effective.
Keywords/Search Tags: Assessment mining, TF*IDF, Text analysis, Naive Bayes algorithm
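
The pipeline summarized in the abstract (stop-word removal, TF*IDF weighting of feature words, Naive Bayes classification into good and bad) can be illustrated with a minimal sketch. It is not the thesis's implementation. The stop-word list, the toy English reviews, the whitespace tokenizer (the thesis segments review text with a word segmenter), the smoothed IDF formula, and the use of TF*IDF weights in place of raw counts inside a multinomial Naive Bayes with add-one smoothing are all assumptions made for illustration. The names `preprocess`, `idf_table`, `tfidf`, `train`, and `classify` are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stop-word list and toy reviews; neither comes from the thesis.
STOP_WORDS = {"the", "is", "a", "and", "very"}
REVIEWS = [
    ("the screen is great and the battery is good", "good"),
    ("good camera and a great price", "good"),
    ("the battery is bad and the screen is poor", "bad"),
    ("poor signal and a bad camera", "bad"),
]


def preprocess(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def idf_table(docs):
    """Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1."""
    n = len(docs)
    df = Counter(w for doc in docs for w in set(doc))
    return {w: math.log((1 + n) / (1 + c)) + 1 for w, c in df.items()}


def tfidf(doc, idf):
    """TF*IDF weight of each known word in one tokenized document."""
    counts = Counter(doc)
    return {w: (c / len(doc)) * idf[w] for w, c in counts.items() if w in idf}


def train(labeled_docs):
    """Accumulate per-class TF*IDF mass for each word, plus class priors."""
    docs = [preprocess(text) for text, _ in labeled_docs]
    labels = [label for _, label in labeled_docs]
    idf = idf_table(docs)
    weight = defaultdict(lambda: defaultdict(float))
    for doc, label in zip(docs, labels):
        for w, v in tfidf(doc, idf).items():
            weight[label][w] += v
    priors = {y: c / len(labels) for y, c in Counter(labels).items()}
    return {"idf": idf, "weight": weight, "priors": priors}


def classify(model, text):
    """Return the class with the highest log-posterior score."""
    idf = model["idf"]
    vocab_size = len(idf)
    doc = tfidf(preprocess(text), idf)
    best_label, best_score = None, -math.inf
    for label, prior in model["priors"].items():
        class_weights = model["weight"][label]
        total = sum(class_weights.values())
        score = math.log(prior)
        for w, v in doc.items():
            # Add-one (Laplace) smoothing so unseen words do not zero out a class.
            p = (class_weights.get(w, 0.0) + 1) / (total + vocab_size)
            score += v * math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    model = train(REVIEWS)
    print(classify(model, "great screen and good battery"))
    print(classify(model, "poor battery"))
```

Words that never occurred in training are ignored at classification time, so a review made up only of unknown words falls back to the class priors. A real system along the lines of the abstract would replace the tokenizer with a word segmenter and draw feature words from the product feature corpus and the weighted sentiment dictionary.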