
Design And Implementation Of Big Data Management System Based On Web Content

Posted on: 2015-07-18
Degree: Master
Type: Thesis
Country: China
Candidate: J Jiang
Full Text: PDF
GTID: 2298330431487333
Subject: Software engineering
Abstract/Summary:
With the rapid development of the network, the World Wide Web has become a vast carrier of information, and the data it holds carries a great deal of valuable information with huge business potential. Both in China and abroad, big data brings potential business opportunities, and every industry has its own business data that it can exploit to maximize profit in the market. Big data is a vertical industry. The open question is how to extract value that is more conducive to an enterprise's development from both a business perspective and a big data perspective, rather than only from the technical level of data mining and data analysis: that is, how to mine just the information that is needed in order to promote the development of the enterprise.
Based on the characteristics of this potential value, this thesis designs and implements a big data management system based on web content. The system applies data mining methods to acquire huge amounts of information and extract useful patterns from the network, converting data into business intelligence and improving the core competitiveness of the enterprise.
The thesis first sets out the project background, purpose and significance of the mass data management system. It then introduces the related technologies and presents the business, functional and non-functional requirements analysis of the system, followed by the design solutions. Data are collected with web crawler technology, the data analysis stage uses the K-means and K-medoids clustering algorithms, and association analysis is implemented with the FP-growth algorithm. As a main participant in the project, the author completed the network data capture module and part of the data preprocessing module, and independently designed and implemented the data mining analysis module.
Finally, the thesis analyzes and summarizes the results of system testing. The project has passed internal beta testing and is already in operation. The module functions that the author helped design and implement run normally; they not only meet user needs but also take the compatibility and extensibility of the various modules into account.
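The abstract names K-means as one of the clustering algorithms used in the data analysis stage but does not show it. The following is a minimal sketch of the standard K-means procedure (Lloyd's algorithm), not the thesis's own implementation. The function name k_means, the iterations parameter, the sample points, and the choice of seeding the centroids with the first k points are all assumptions made for illustration; practical systems usually pick random or k-means++ seeds.

```python
from math import dist  # Euclidean distance, Python 3.8+


def k_means(points, k, iterations=100):
    """Cluster points into k groups; returns (centroids, assignment)."""
    # Assumption: seed the centroids with the first k points (deterministic,
    # for illustration only; the result depends on this choice).
    centroids = [tuple(p) for p in points[:k]]
    assignment = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the algorithm has converged
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, assignment


# Hypothetical two-dimensional feature vectors forming two obvious groups.
sample = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (8.2, 7.8), (0.8, 1.2), (7.8, 8.2)]
centroids, assignment = k_means(sample, k=2)
```

K-medoids follows the same assign-then-update loop, but each cluster centre is restricted to an actual data point, which makes it less sensitive to outliers.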
Keywords/Search Tags:Web Crawler, Data Warehouse, Data Mining, Clustering, Association