
Research Of Data Mining Based On Web Log

Posted on: 2004-05-11
Degree: Master
Type: Thesis
Country: China
Candidate: H S Tian
Full Text: PDF
GTID: 2168360092986287
Subject: Computer application technology
Abstract/Summary:
The Internet has grown at an incredible speed over the past several years, and in recent years more and more institutions, groups, and individuals publish and look up information on it. The Internet holds a mass of information, but the Web is unstructured and dynamic, and a web page is more complicated in composition than a text document, so finding the data one wants on the Internet is as difficult as looking for a needle in a haystack. A website that cannot cluster its users and web pages cannot provide tailored services to a given group of people. Besides, the organization of a website's content may be quite different from the organization its visitors expect. What is more, some users have limited hardware resources: they browse web pages on palmtop devices (such as Palm Pilots, Pocket PCs, and Handspring devices), so how to prefetch web pages for them is worth researching. How can these problems be resolved? Web mining, which combines classical data mining technology with the Web, is an appropriate approach. Web mining is the process of extracting interesting, potentially useful patterns and hidden information from web documents and web activities. Web mining can be applied in several fields, such as search engine structure mining, identifying authoritative web pages, classifying web documents, classifying web logs, and intelligent query. The thesis first introduces the definition, tasks, and classification of web mining, as well as its model and process. Then a data structure suited to web mining and a corresponding algorithm are put forward. The data structure is a User_URL matrix, which records users' accesses to web pages.
The mining algorithm, which uses matrix clustering, clusters users and web pages, identifies frequent access paths, and predicts accesses. Finally, the thesis summarizes its own shortcomings and points out the directions, prospects, and challenges of web mining. Experimental results show that the algorithm, applied to the web log of a campus network, is efficient. In addition, applying the algorithm to an e-business website would help construct an adaptive website that provides personalized service to specific users and, ultimately, gives traders powerful decision support.
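The abstract does not spell out the layout of the User_URL matrix or the clustering algorithm, so the following Python sketch is only one plausible reading. It assumes rows are users, columns are URLs, and each cell is a visit count. As a stand-in for the thesis's matrix clustering, it uses a simple greedy grouping by cosine similarity. The function names, the 0.5 threshold, and the sample log entries are all hypothetical.

```python
from math import sqrt


def build_user_url_matrix(log_entries):
    """Build a visit-count matrix: rows are users, columns are URLs."""
    users = sorted({user for user, _ in log_entries})
    urls = sorted({url for _, url in log_entries})
    user_index = {user: i for i, user in enumerate(users)}
    url_index = {url: j for j, url in enumerate(urls)}
    matrix = [[0] * len(urls) for _ in users]
    for user, url in log_entries:
        matrix[user_index[user]][url_index[url]] += 1
    return users, urls, matrix


def cosine(a, b):
    """Cosine similarity of two count vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster(labels, vectors, threshold=0.5):
    """Greedy one-pass clustering (an assumption, not the thesis's method).

    Each vector joins the first cluster whose first member is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    """
    groups = []  # each group is a list of indices into `vectors`
    for i, vec in enumerate(vectors):
        for group in groups:
            if cosine(vec, vectors[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return [[labels[i] for i in group] for group in groups]


def transpose(matrix):
    """Swap rows and columns so that URLs become the rows."""
    return [list(column) for column in zip(*matrix)]


# Hypothetical, already-cleaned log entries: (user id, requested URL).
log = [
    ("u1", "/index"), ("u1", "/courses"), ("u1", "/courses"),
    ("u2", "/index"), ("u2", "/courses"),
    ("u3", "/library"), ("u3", "/library"), ("u3", "/news"),
]

users, urls, matrix = build_user_url_matrix(log)
user_clusters = cluster(users, matrix)            # compare rows: similar users
page_clusters = cluster(urls, transpose(matrix))  # compare columns: similar pages
```

Clustering the rows groups users with similar access patterns, and clustering the columns groups pages that are visited by the same users. This is why a single matrix can serve both user clustering and web page clustering. Frequent-path mining and access prediction would additionally need the order of requests within a session, which this count matrix does not keep.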
Keywords/Search Tags:data mining, web mining, user clustering, web page clustering, frequent access paths, web page prefetching