Data Mining Based On Web Information

Posted on: 2015-01-23
Degree: Master
Type: Thesis
Country: China
Candidate: L F Li
Full Text: PDF
GTID: 2348330485994353
Subject: Software engineering
Abstract/Summary:
With the development of the Internet, research on and application of Web data mining have become indispensable. Web data mining can be divided into Web structure mining, Web content mining, and Web usage mining. After studying the basic concepts and characteristics of these three types, this thesis surveys the current state of research on Web data mining and describes some typical findings in detail. The specific research and practice in this thesis are limited to Web content mining and Web usage mining.

Building on the fundamentals of data mining techniques and algorithms, the thesis briefly describes the application and research status of crawler technology, database technology, clustering algorithms, visualization technology, and other related technologies, and points out how each is applied in the practical work presented here.

The thesis presents a general method and procedure for Web content mining. Taking a bus news site as an example, it implements a bus news information mining system. Through crawling, extracting new content, extracting content keywords, storage, and visualization, the system addresses the problem that bus lines change in real time. The system has been put into practice, with good results and practical significance.

The thesis also designs a general-purpose mining framework for Web usage mining. Taking the user logs of an electronic security system as an example, it implements a user log mining system. Three algorithms (purchases, delay, and Simple-Kmeans) are proposed to address three questions: user purchases, server latency, and user classification. Experiments are carried out to verify the effectiveness of the algorithms. Lastly, the Simple-Kmeans results are compared with experimental K-means results to explain the advantages and disadvantages of the algorithm.

Finally, the thesis gives the experimental results and notes, summarizes the study of Web data mining, and discusses prospects for its development.
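
As a rough illustration of the clustering step used for user classification, the sketch below implements the standard K-means procedure (Lloyd's algorithm) in plain Python. The abstract does not describe the Simple-Kmeans variant, so this is not the author's algorithm. The function name `kmeans`, the two log-derived features, and the sample values are all assumptions made for the example.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Standard Lloyd's K-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Start from k input points chosen at random without replacement.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # Converged: no centroid moved.
        centroids = new_centroids
    return labels, centroids


# Hypothetical per-user features derived from logs:
# (sessions per week, purchases per month).
users = [(1, 0), (2, 1), (1, 1), (20, 9), (22, 10), (21, 8)]
labels, centroids = kmeans(users, k=2)
```

With this toy input the three low-activity users and the three high-activity users end up in separate clusters. Real log data would normally need feature scaling first, since K-means relies on Euclidean distance.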
Keywords/Search Tags: Data Mining, Web Content Mining, Web Usage Mining, K-means, Crawler