Research And Application Of Mining Technology On Web Log

Posted on: 2009-10-09
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wu
Full Text: PDF
GTID: 2178360245975283
Subject: Computer technology
Abstract/Summary:
The World Wide Web is a huge, widely distributed, global information repository. It carries news, advertising, consumer information, financial management, education, government, e-commerce, and many other information services. The Web also contains rich and dynamic hyperlink information, along with records of how Web pages are visited and used. All of this offers abundant resources for data mining. How to use this enormous volume of data to find useful information and knowledge is the subject of this research: Web log mining. Web log mining applies data mining techniques to the data produced when users interact with a server, in order to discover implicit, regular patterns. From it we can obtain users' access frequencies and behavior patterns when they visit a site. Using these, we can improve the site's structure and the hyperlink structure between its pages, raise service quality, and improve site performance. At the same time, suspicious activity can be reported promptly to site administrators so that they can strengthen site security.

This thesis presents a systematic analysis and study of Web log mining, covering the following aspects.
(1) It describes the background of the work and the current state of Web log mining research at home and abroad, and gives an overview of data mining, Web data mining, and Web log mining.
(2) It studies data preprocessing in Web log mining, analyzes the conventional preprocessing procedure in detail, and proposes a simple algorithm that reduces the number of preprocessing steps. Experiments show that the algorithm speeds up preprocessing without reducing its accuracy.
(3) It briefly introduces several algorithms commonly used in data mining. To suit the actual mining environment, it concentrates on the frequent-pattern growth (FP-growth) algorithm among association rule algorithms, and proposes a numeric encoding ("digitalization") method for implementing FP-growth that speeds up the mining process.
(4) It describes the concrete implementation process and gives a concrete example.
(5) It summarizes the research results and the remaining shortcomings of the work and, drawing on the research experience, discusses future research directions, application prospects, and the challenges facing Web log mining.
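
The abstract does not give the details of the preprocessing algorithm or of the digitalization method, so the following is only a minimal sketch of what conventional Web log preprocessing followed by a numeric encoding of pages might look like. Everything specific in it is an assumption made for illustration: the Common Log Format input, identifying a user by host address, the 30-minute session timeout, the list of ignored file suffixes, and the function names. It is not the algorithm proposed in the thesis.

```python
import re
from datetime import datetime, timedelta

# Assumption: entries follow the Common Log Format; the abstract names no format.
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+'
)
IGNORED_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js", ".ico")  # illustrative
SESSION_TIMEOUT = timedelta(minutes=30)  # common heuristic, assumed here


def parse_line(line):
    """Return (host, time, method, url, status), or None if the line is malformed."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    when = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return m.group("host"), when, m.group("method"), m.group("url"), int(m.group("status"))


def clean(records):
    """Data cleaning: keep successful GET requests for pages, drop embedded resources."""
    for host, when, method, url, status in records:
        path = url.split("?", 1)[0].lower()
        if method == "GET" and status == 200 and not path.endswith(IGNORED_SUFFIXES):
            yield host, when, path


def sessionize(records):
    """Session identification: one user per host, new session after a long gap."""
    sessions = []
    latest = {}  # host -> (time of last request, page list of the open session)
    for host, when, path in sorted(records, key=lambda r: r[1]):
        last = latest.get(host)
        if last is None or when - last[0] > SESSION_TIMEOUT:
            pages = []
            sessions.append(pages)
        else:
            pages = last[1]
        pages.append(path)
        latest[host] = (when, pages)
    return sessions


def encode(sessions):
    """Replace each page path by a small integer; each session becomes a sorted item set."""
    ids = {}
    encoded = [sorted({ids.setdefault(p, len(ids)) for p in pages}) for pages in sessions]
    return encoded, ids


def preprocess(lines):
    parsed = (parse_line(line) for line in lines)
    return encode(sessionize(clean(r for r in parsed if r is not None)))


SAMPLE_LOG = [
    '10.0.0.1 - - [09/Oct/2009:10:00:00 +0800] "GET /index.html HTTP/1.0" 200 1043',
    '10.0.0.1 - - [09/Oct/2009:10:00:01 +0800] "GET /logo.gif HTTP/1.0" 200 512',
    '10.0.0.1 - - [09/Oct/2009:10:05:00 +0800] "GET /products.html HTTP/1.0" 200 2048',
    '10.0.0.2 - - [09/Oct/2009:10:06:00 +0800] "GET /index.html HTTP/1.0" 200 1043',
    '10.0.0.2 - - [09/Oct/2009:10:07:00 +0800] "GET /missing.html HTTP/1.0" 404 0',
    '10.0.0.1 - - [09/Oct/2009:11:30:00 +0800] "GET /index.html HTTP/1.0" 200 1043',
]

transactions, page_ids = preprocess(SAMPLE_LOG)
print(transactions)
print(page_ids)
```

Each resulting transaction is a list of integer page identifiers, which is the kind of input a frequent-pattern miner such as FP-growth consumes. Comparing integers rather than URL strings is one plausible reason a numeric encoding could speed up mining, though the abstract does not say how the thesis achieves its speedup.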
Keywords/Search Tags: web data mining, web log mining, data preprocessing, association rules, FP-growth algorithm