Web Log Data Pre-processing And Multi-dimensional Modeling Study

Posted on: 2007-05-28; Degree: Master; Type: Thesis
Country: China; Candidate: X B Guo; Full Text: PDF
GTID: 2208360182481260; Subject: Industrial economy
Abstract/Summary:
The World Wide Web (WWW) continues to grow at an astounding rate in both the sheer volume of traffic and the size and complexity of Web sites, so the analysis of Web site logs is becoming important to Web management. The results of Web site log analysis can be used to improve Web site design, Web server design, and navigation through a Web site. This paper puts forward a method based on a data warehouse to analyze Web site logs.

The paper has three research objects: the characteristics of a Web log data warehouse, data preparation for a Web log data warehouse, and multi-dimensional modeling of a clickstream data warehouse.

The first part describes the Web log file format, the problems of log data collection, and the differences between a Web log data warehouse and an enterprise data warehouse.

The second part describes the data preparation of Web log data and examines the proxy caching problem and the data cleaning problem. In the data cleaning part, a new algorithm is put forward to improve the filtering of frame Web pages. Methods for identifying users, sessions, and pages are also discussed.

The third part focuses on the multi-dimensional modeling of the Web log data warehouse. The modeling process addresses Web site traffic statistics and the patterns in how users browse the pages of a Web site. Three granularities are put forward: page abstract, page view abstract, and session abstract. The aggregate dimensions of the session abstract granularity are analyzed. To obtain conformed dimensions, a data warehouse bus architecture and a data warehouse bus matrix are recommended.

The drawbacks of this paper are also described. As a next step, I will continue to work on how to use data mining technologies to discover user usage patterns.
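The data preparation steps named in the abstract (data cleaning, then user and session identification) can be illustrated with a minimal Python sketch. It is not the algorithm proposed in the thesis, which the abstract does not spell out. The sketch assumes log lines in the Common or Combined Log Format, treats requests for images, stylesheets, and scripts as noise, identifies a user by the pair (host, user agent), and closes a session after 30 minutes of inactivity. The suffix list, the timeout, and all function names are illustrative assumptions.

```python
import re
from datetime import datetime, timedelta

# Assumption: Common/Combined Log Format; the abstract does not name a format.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)

# Assumption: embedded resources are not page views and are dropped in cleaning.
IGNORED_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js", ".ico")

# Assumption: a commonly used 30-minute inactivity timeout ends a session.
SESSION_TIMEOUT = timedelta(minutes=30)


def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    return record


def is_page_request(record):
    """Data cleaning: keep successful requests that are not embedded resources."""
    path = record["path"].split("?", 1)[0].lower()
    return record["status"] == 200 and not path.endswith(IGNORED_SUFFIXES)


def sessionize(records):
    """Group records into sessions per (host, agent) using the timeout."""
    sessions = []
    latest = {}  # user key -> index of that user's most recent session
    for record in sorted(records, key=lambda r: r["time"]):
        user = (record["host"], record["agent"])
        index = latest.get(user)
        if index is None or record["time"] - sessions[index][-1]["time"] > SESSION_TIMEOUT:
            sessions.append([])
            index = len(sessions) - 1
            latest[user] = index
        sessions[index].append(record)
    return sessions


def build_sessions(lines):
    """Parse, clean, and sessionize raw log lines."""
    parsed = (parse_line(line) for line in lines)
    return sessionize([r for r in parsed if r is not None and is_page_request(r)])
```

Identifying users by host and user agent is only a heuristic: as the abstract notes, proxy caching distorts the log, so several users behind one proxy may be merged and cached page views may be missing altogether.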
Keywords/Search Tags: dimensional modeling, data preparation, frame page filtering