Data Mining Based On Web Log

Posted on: 2007-08-29
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Jiang
Full Text: PDF
GTID: 2178360212979960
Subject: Computer application technology
Abstract/Summary:
The World Wide Web is a distributed global information resource containing a large amount of data relevant to essentially all domains of human activity. How to develop and use this rich resource has become a question that demands attention. As a result, web data mining, which combines data mining technology with research on network applications, is now an active research field. Web log mining is a branch of web usage mining and, as an important part of web mining, has particular theoretical and practical significance.

This thesis reviews the processes of data mining, web data mining, and web log mining, with a focus on web log mining. Web log mining discovers users' page access patterns by mining web log records. It further analyzes the order of web log records in order to improve the characteristics and organizational structure of websites, to improve the quality and efficiency with which users search for information, and to find, through statistical and association analysis, the relationships between particular users and particular regions, times, and pages.

The object of data preprocessing is the data contained in the raw web log files. Incomplete, redundant, and inaccurate data must be processed before mining. This thesis studies and discusses the key techniques of data preprocessing. Pattern analysis and pattern expression in web log mining study users' web browsing behavior in order to understand visitors' interests. They are important steps toward enhancing website quality and improving website structure design. This thesis discusses pattern mining methods and studies algorithms for association rule mining and sequential pattern mining. Based on a comparison of these algorithms, it proposes applying FP-tree to association rule mining and PrefixSpan to sequential pattern mining.

Finally, web log mining was applied to the website of Tianjin Railway Engineering School (http://www.tjtdxy.cn). By mining its web server log files, a data mining system based on web log mining was established. The system will facilitate management of the website and improvement of its design.
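
The data preprocessing step mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes the server writes Common Log Format lines, and the ignored file suffixes, the 30-minute session timeout, the use of the host address as the user identifier, and all function names are hypothetical choices made only for illustration.

```python
import re
from datetime import datetime, timedelta

# Assumption: Common Log Format. The abstract does not state the log format.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+'
)
# Assumption: requests for these resources are not page views.
IGNORED_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js", ".ico")
# Assumption: a gap longer than 30 minutes starts a new session.
SESSION_TIMEOUT = timedelta(minutes=30)


def parse_line(line):
    """Parse one log line; return None for lines that do not match."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    ts = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return (m.group("host"), ts, m.group("method"),
            m.group("url"), int(m.group("status")))


def clean(records):
    """Keep successful GET requests for pages; drop embedded resources."""
    for host, ts, method, url, status in records:
        path = url.split("?", 1)[0].lower()
        if (method == "GET" and 200 <= status < 300
                and not path.endswith(IGNORED_SUFFIXES)):
            yield host, ts, url


def sessionize(records):
    """Group page views by host, splitting on the session timeout."""
    sessions = []
    open_sessions = {}  # host -> (time of last request, page list)
    for host, ts, url in sorted(records, key=lambda r: r[1]):
        last = open_sessions.get(host)
        if last is None or ts - last[0] > SESSION_TIMEOUT:
            pages = []
            sessions.append((host, pages))
        else:
            pages = last[1]
        pages.append(url)
        open_sessions[host] = (ts, pages)
    return sessions


def preprocess(lines):
    """Parse, clean, and sessionize raw log lines."""
    parsed = (parse_line(line) for line in lines)
    return sessionize(clean(r for r in parsed if r is not None))
```

Each resulting session is an ordered list of page URLs, which is the kind of input that association rule and sequential pattern mining algorithms typically consume.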
Keywords/Search Tags: Data Mining, Web Data Mining, Web Log Mining, association rule, sequential pattern, pattern analysis