
Design And Research Of Big Data Mining Based On E-commerce Platform

Posted on: 2019-07-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y Ding
GTID: 2428330572463743
Subject: Computer technology
Abstract/Summary:
With the rapid development of network technology, the number of network-connected devices is growing quickly. Web technology is also maturing, which has steadily improved the online shopping experience and driven rapid growth in the e-commerce industry. As transaction volume on e-commerce platforms has grown, large amounts of transaction data and user comment data have accumulated, and valuable information can be mined from them, such as product defects and users' actual needs. This thesis therefore studies big data acquisition and the application of comment data on e-commerce platforms: it extracts the evaluation viewpoints and opinions expressed in product reviews, consolidates them to bring out the main themes, and summarizes how users really feel about a product. The work focuses on key data acquisition technology and cluster analysis in the era of big data. We combine the Nutch web crawler with the Hadoop distributed platform so that review data is crawled in a distributed manner, which addresses the slow execution of a single machine. After filtering the text and extracting feature words, we use TF-IDF to calculate the weight of each feature word and construct a vectorized representation of the text. A method based on the vector space model (VSM) then calculates the similarity between sentences. Next, the Canopy and K-means algorithms are run in a distributed manner on the MapReduce framework, which improves both the speed and the accuracy of clustering. Finally, taking the comment data for one brand of water purifier as an example, the product's reviews are crawled from an e-commerce platform and cluster analysis is carried out. After the results are consolidated and tallied, the advantages and disadvantages of the water purifier are obtained, and the application of these viewpoints is briefly analyzed.
Keywords/Search Tags: Data mining, comment data, Hadoop, cluster analysis
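
The analysis pipeline named in the abstract (TF-IDF weighting, VSM similarity, then Canopy followed by K-means) can be illustrated with a minimal single-machine sketch. This is not the thesis implementation, which runs on Hadoop MapReduce. The tokenized reviews, the thresholds `t1` and `t2`, the use of cosine similarity as the VSM measure, the use of Canopy centers as K-means seeds, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))  # document frequency
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()})
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is all zeros)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def canopy(vectors, t1, t2):
    """Canopy clustering on cosine distance (1 - similarity); requires t1 > t2.

    Returns a list of (center_index, member_indices) pairs.
    """
    remaining = list(range(len(vectors)))
    canopies = []
    while remaining:
        center = remaining[0]
        dist = {i: 1.0 - cosine(vectors[center], vectors[i]) for i in remaining}
        # Points within the loose threshold t1 join this canopy.
        members = [i for i in remaining if i == center or dist[i] < t1]
        canopies.append((center, members))
        # Points within the tight threshold t2 can no longer become centers.
        remaining = [i for i in remaining if i != center and dist[i] >= t2]
    return canopies

def mean_vector(group):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = Counter()
    for v in group:
        for t, w in v.items():
            total[t] += w
    return {t: w / len(group) for t, w in total.items()}

def kmeans(vectors, seeds, max_iter=20):
    """K-means that assigns each vector to its most similar centroid."""
    centroids = list(seeds)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            max(range(len(centroids)), key=lambda k: cosine(v, centroids[k]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for k in range(len(centroids)):
            group = [v for v, lab in zip(vectors, labels) if lab == k]
            if group:
                centroids[k] = mean_vector(group)
    return labels

# Hypothetical, already-tokenized water purifier reviews.
reviews = [
    ["filter", "water", "tastes", "clean"],
    ["water", "tastes", "clean", "fresh"],
    ["installation", "difficult", "leaks"],
    ["leaks", "after", "installation"],
]
vecs = tfidf_vectors(reviews)
seeds = [vecs[center] for center, _ in canopy(vecs, t1=0.9, t2=0.7)]
labels = kmeans(vecs, seeds)
print(labels)  # [0, 0, 1, 1]
```

In this sketch Canopy serves only to choose the number of clusters and their initial centers, so K-means does not need a hand-picked k. With the sample data, the two reviews about taste fall into one cluster and the two about installation and leaks into the other.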