Research Of Log Preprocessing With Task-Constraint In Web Usage Mining

Posted on: 2007-04-08
Degree: Master
Type: Thesis
Country: China
Candidate: Y P Zheng
Full Text: PDF
GTID: 2178360182973227
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, the Web has become increasingly important and offers a vast range of resources. Because of the sheer volume of information, however, it has become increasingly difficult for individual users to find what is useful to them, and it is equally difficult to judge whether a web site's organizational structure is reasonable. Web Usage Mining (WUM) is a useful method for discovering user preferences and behavior. By mining the web log, it discovers patterns in user behavior; it then analyzes the regularities in the log to improve the performance and organizational structure of the web site and to help users find useful information more efficiently. Data preprocessing plays an essential role in WUM: it processes the raw web log, which contains incomplete, redundant, and inaccurate records. In this thesis, we discuss the key problems and techniques involved in log preprocessing. To address the problem of redundancy, our research concentrates on how to preprocess the log without affecting the mining result, and on how to improve system efficiency by shrinking the log file. The thesis introduces the concept of constraints to deal with the main problems in preprocessing and provides corresponding solutions for different applications. Finally, we validate the solutions in a log mining system built for evaluating the usage of electronic resources. In summary, our contributions are as follows. First, we propose a Web Usage Mining process based on real tasks: according to the actual task at hand, we add constraint-based log selection to the data preparation phase. Second, in the data preprocessing phase, we propose a session identification method based on URL rewriting. Finally, we design a web log mining system based on the access logs of EI (Ei Village 2) generated by users at Fuzhou University.
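
The two preprocessing ideas named in the abstract, constraint-based log selection and session identification from rewritten URLs, can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the Common Log Format input, the `;jsessionid=` token, the ignored file suffixes, the `host_prefix` constraint, and all function names are assumptions chosen for illustration only.

```python
import re
from collections import defaultdict

# Assumed input format: Common Log Format
#   host ident user [time] "METHOD url PROTOCOL" status bytes
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) \S+" (?P<status>\d{3}) \S+'
)

# Hypothetical session token that URL rewriting appends to each URL.
SESSION_PATTERN = re.compile(r';jsessionid=(?P<sid>[A-Za-z0-9]+)', re.IGNORECASE)

# Embedded resources that a usage-mining task would typically not need.
IGNORED_SUFFIXES = ('.gif', '.jpg', '.png', '.css', '.js')


def parse_line(line):
    """Return the fields of one log line as a dict, or None if malformed."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def satisfies_constraints(record, host_prefix):
    """Task constraints (assumed): client address prefix, status 200, page requests only."""
    path = record['url'].split(';')[0].split('?')[0].lower()
    return (
        record['host'].startswith(host_prefix)
        and record['status'] == '200'
        and not path.endswith(IGNORED_SUFFIXES)
    )


def sessions_from_log(lines, host_prefix=''):
    """Keep only records that meet the constraints, then group pages by session id."""
    sessions = defaultdict(list)
    for line in lines:
        record = parse_line(line)
        if record is None or not satisfies_constraints(record, host_prefix):
            continue
        match = SESSION_PATTERN.search(record['url'])
        if match is None:
            continue  # no rewritten session id; this sketch skips such records
        sessions[match.group('sid')].append(record['url'].split(';')[0])
    return dict(sessions)


# Made-up sample records, for illustration only.
SAMPLE_LOG = [
    '10.0.0.1 - - [08/Apr/2007:10:00:00 +0800] "GET /search;jsessionid=A1 HTTP/1.1" 200 512',
    '10.0.0.1 - - [08/Apr/2007:10:00:01 +0800] "GET /logo.gif;jsessionid=A1 HTTP/1.1" 200 90',
    '10.0.0.2 - - [08/Apr/2007:10:00:02 +0800] "GET /search;jsessionid=B2 HTTP/1.1" 200 512',
    '10.0.0.1 - - [08/Apr/2007:10:00:05 +0800] "GET /results;jsessionid=A1 HTTP/1.1" 200 2048',
    '192.168.1.9 - - [08/Apr/2007:10:00:06 +0800] "GET /search;jsessionid=C3 HTTP/1.1" 200 512',
]

result = sessions_from_log(SAMPLE_LOG, host_prefix='10.0.')
```

In this sketch the constraints are applied before session grouping, so the image request and the record from outside the assumed address range never reach the later stage, which is one way the log could shrink without changing what is mined from the remaining records.
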
Keywords/Search Tags: Usage Mining, Constraint, Log Preprocessing, Session Identification