
Design and Implementation of a Web Log Data Mining System

Posted on: 2009-09-12
Degree: Master
Type: Thesis
Country: China
Candidate: X X Ren
Full Text: PDF
GTID: 2178360245969532
Subject: Computer Science and Technology
Abstract/Summary:
As Internet technology continues to develop and spread, the volume of information on web sites is growing rapidly. How to exploit the rich information resources that web sites provide has become an issue of general concern. These resources fall into two categories: the content of the web site itself, and the access data generated by its visitors, which is large in volume, complex in structure, and reflects the purposes behind user behavior. How to use existing access information to improve web site performance and serve users better is therefore one of the hot topics in the field of computer applications. This paper studies the characteristics of web logs by analyzing the structure of web access data, and introduces the methods of web data mining. We then design an analysis system for the web site of our School of Computer Science and Technology and obtain some valuable analysis conclusions.
Web data mining applies data mining methods to web data in order to extract useful and novel patterns from hidden information; it is a process of knowledge discovery in databases. One of its main branches is web log mining, which mines frequent traversal patterns, user access patterns, and user group information from the large volume of historical web access records. The results help people understand how a web site is used and how users access it, so that the site topology can be optimized and better services provided to users, improving site traffic and performance.
Against this background, this paper proposes a web data mining system that uses the server logs of the school's web site as research material. With this system we obtain not only basic statistics about the school web site, such as site usage and server responses, but also user access patterns and user clustering information.
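To make the idea of mining frequent traversal patterns from access records concrete, here is a minimal Python sketch. It is not the thesis's implementation (the system itself was built on the .NET platform). It assumes the log has already been parsed into (client IP, timestamp, URL) records, that a session ends after 30 minutes of inactivity, and that a "traversal pattern" is a contiguous run of pages; the sample records, function names, and thresholds are all hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta

# Hypothetical, already-parsed log records: (client IP, timestamp, requested URL).
RECORDS = [
    ("10.0.0.1", datetime(2008, 3, 1, 9, 0), "/index"),
    ("10.0.0.1", datetime(2008, 3, 1, 9, 2), "/news"),
    ("10.0.0.1", datetime(2008, 3, 1, 9, 5), "/news/42"),
    ("10.0.0.2", datetime(2008, 3, 1, 9, 1), "/index"),
    ("10.0.0.2", datetime(2008, 3, 1, 9, 3), "/news"),
    ("10.0.0.1", datetime(2008, 3, 1, 11, 0), "/index"),
    ("10.0.0.1", datetime(2008, 3, 1, 11, 1), "/staff"),
]


def build_sessions(records, timeout=timedelta(minutes=30)):
    """Group requests per client and start a new session after a long gap.

    The 30-minute timeout is an assumed heuristic, not taken from the thesis.
    """
    by_client = defaultdict(list)
    for ip, ts, url in sorted(records, key=lambda r: (r[0], r[1])):
        by_client[ip].append((ts, url))

    sessions = []
    for visits in by_client.values():
        current = [visits[0][1]]
        for (prev_ts, _), (ts, url) in zip(visits, visits[1:]):
            if ts - prev_ts > timeout:
                sessions.append(current)
                current = []
            current.append(url)
        sessions.append(current)
    return sessions


def frequent_paths(sessions, length=2, min_support=2):
    """Return contiguous page sequences that occur in at least min_support sessions."""
    counts = Counter()
    for pages in sessions:
        # A set ensures each path is counted once per session.
        seen = {tuple(pages[i:i + length]) for i in range(len(pages) - length + 1)}
        counts.update(seen)
    return {path: n for path, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    sessions = build_sessions(RECORDS)
    print(frequent_paths(sessions))  # {('/index', '/news'): 2}
```

Counting each path once per session measures how many visits follow a path rather than how often a single visitor repeats it; a different support definition would be equally defensible.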
For simplicity, efficiency, and practicality, the clustering algorithm is improved: it uses a similarity metric over session access sequences as its measure and a dictionary vector as its storage structure, to ensure both clustering accuracy and storage efficiency.
The paper is organized as follows. First, we introduce the background of the topic and the state of research in China and abroad. Second, we summarize the web log mining process model and the steps of the data preprocessing stage. Finally, we discuss the detailed design of the system and implement it on the .NET platform with the algorithms in question.
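The abstract does not spell out the similarity metric or the layout of the dictionary vector, so the following Python sketch only illustrates the general idea of clustering sessions by access sequence similarity. It assumes a similarity based on the longest common subsequence (LCS) of two page sequences, keeps sessions in a plain dict keyed by a session id, and uses simple leader-style threshold clustering as a stand-in for the improved algorithm. All names, sample sessions, and the 0.5 threshold are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two page sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Order-aware similarity in [0, 1]: LCS length over the longer session."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def cluster_sessions(sessions, threshold=0.5):
    """Leader clustering: a session joins the first cluster whose leader is
    similar enough, otherwise it becomes the leader of a new cluster."""
    clusters = {}  # leader session id -> list of member session ids
    for sid, pages in sessions.items():
        for leader in clusters:
            if similarity(sessions[leader], pages) >= threshold:
                clusters[leader].append(sid)
                break
        else:
            clusters[sid] = [sid]
    return clusters


# Hypothetical sessions, stored in a dict keyed by session id.
SESSIONS = {
    "s1": ["/index", "/news", "/news/42"],
    "s2": ["/index", "/news"],
    "s3": ["/index", "/staff", "/staff/li"],
    "s4": ["/index", "/staff"],
}

if __name__ == "__main__":
    print(cluster_sessions(SESSIONS))
    # {'s1': ['s1', 's2'], 's3': ['s3', 's4']}
```

Leader clustering makes a single pass over the sessions, which keeps the sketch short, but its result depends on the order in which sessions arrive.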
Keywords/Search Tags: web log mining, pattern recognition, frequent traversal patterns, clustering analysis