The Research Of Frequent Pattern Algorithm Based On Web Log Mining

Posted on:2012-06-16

Degree:Master

Type:Thesis

Country:China

Candidate:J Feng

Full Text:PDF

GTID:2218330338470698

Subject:Computer software and theory

Abstract/Summary:

With the rapid development and growing popularity of the Internet, Web sites have become the main platform on which people produce, publish, and process information. At the same time, Web site structures have become increasingly complex, and the volume of data on the Web is expanding rapidly. Web site owners therefore face the problem of how to mine potentially useful knowledge from this data and use it to improve the site structure, so as to serve users better and increase the owners' profit. To address this problem, traditional data mining theory and techniques have been introduced into the Web domain: Web logs are mined for useful information and patterns, and the results are applied to personalized services for Web users, business intelligence, system performance improvement, and Web site optimization. Web log mining based on client-side Web log data in particular has attracted the attention of many researchers.

This thesis introduces data mining theory and the complete Web log mining process in detail, and proposes several improvements for problems in Web log mining.

First, the thesis systematically introduces the research significance, background, data sources, and the whole procedure of Web log data preprocessing. It discusses the related background and the solutions to problems specific to client-side Web logs, and then describes the characteristics of client logs and the differences between server logs and client logs.

Second, after analyzing the shortcomings of current methods for calculating the interest level of Web pages, the thesis proposes an improved method that is based on client-side Web log data and calculates the real browsing time. Analysis shows that the improved method reflects users' interest level in Web pages more reasonably and more accurately.

Next, the directed structure graph is analyzed. The interest-level values are treated as weights and assigned to the corresponding nodes of the graph, which produces a weighted directed graph.

Finally, users' frequent access patterns are mined from the weighted directed graph and the user access transaction database with a new, improved algorithm, the GTWF algorithm. Building on concepts such as weighted support, extensible patterns, and weighted frequent patterns, the algorithm mines the graph through pruning and generation operations. The performance of the algorithm was verified experimentally.
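The abstract does not give the formulas behind the interest level, the weighted support, or the GTWF algorithm itself, so the following Python sketch is only an illustration of the general pipeline and not the thesis's method. Everything specific in it is an assumption: the interest level is taken as a page's mean browsing time scaled by the largest mean, graph edges are taken from consecutive page views, the weighted support of a path is its mean node weight multiplied by the fraction of transactions that contain it as a contiguous run, and only simple paths (no repeated pages) are explored. All function names are hypothetical.

```python
from collections import defaultdict


def interest_levels(sessions):
    """Assumed interest level: mean browsing time per page, scaled to (0, 1]."""
    total, count = defaultdict(float), defaultdict(int)
    for session in sessions:
        for page, seconds in session:
            total[page] += seconds
            count[page] += 1
    mean = {page: total[page] / count[page] for page in total}
    top = max(mean.values())
    return {page: value / top for page, value in mean.items()}


def build_graph(sessions):
    """Directed edges, assumed here to be the observed consecutive page views."""
    edges = defaultdict(set)
    for session in sessions:
        pages = [page for page, _ in session]
        for src, dst in zip(pages, pages[1:]):
            edges[src].add(dst)
    return edges


def support(path, transactions):
    """Fraction of transactions that contain `path` as a contiguous run."""
    n = len(path)
    hits = sum(
        any(tuple(t[i:i + n]) == path for i in range(len(t) - n + 1))
        for t in transactions
    )
    return hits / len(transactions)


def mine_weighted_paths(sessions, min_wsup):
    """Grow paths along graph edges and keep those with enough weighted support."""
    weights = interest_levels(sessions)
    edges = build_graph(sessions)
    transactions = [[page for page, _ in s] for s in sessions]
    max_weight = max(weights.values())

    result = {}
    frontier = [(page,) for page in weights]
    while frontier:
        next_frontier = []
        for path in frontier:
            sup = support(path, transactions)
            # Pruning step: support never grows when a path is extended, and the
            # mean node weight never exceeds max_weight, so sup * max_weight is
            # an upper bound for this path and all of its extensions.
            if sup * max_weight < min_wsup:
                continue
            wsup = sup * sum(weights[p] for p in path) / len(path)
            if wsup >= min_wsup:
                result[path] = wsup
            # Generation step: extend the path along outgoing edges.
            for nxt in edges.get(path[-1], ()):
                if nxt not in path:
                    next_frontier.append(path + (nxt,))
        frontier = next_frontier
    return result


# Made-up example data: each session is a list of (page, seconds viewed).
sessions = [
    [("A", 10), ("B", 30), ("C", 20)],
    [("A", 10), ("B", 30)],
    [("B", 30), ("C", 20)],
]
patterns = mine_weighted_paths(sessions, min_wsup=0.4)
for path in sorted(patterns):
    print(" -> ".join(path), round(patterns[path], 3))
```

With the made-up sessions above and a threshold of 0.4, the sketch keeps the patterns B, C, A -> B, and B -> C. Page A alone falls below the threshold because of its short browsing time, yet it is still extended, since its upper bound stays above the threshold. Pruning on the upper bound rather than on the weighted support itself is a deliberate choice here: a weighted support of this form is not guaranteed to shrink as a path grows, so pruning on it directly could discard valid patterns.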

Keywords/Search Tags:

Web log mining, data preprocessing, client log data, interest-level of page, weighted frequent access pattern

Related items

1	Research On Technique Of Web Mining Based On Log
2	A Study On Algorithms Of Weighted Frequent Pattern Mining
3	Research On Website Optimization Strategy Based On Frequent Pattern Mining
4	The Research And Implement Of Algorithm On Web Usage Mining
5	A Study On Weighted Frequent Pattern Mining Algorithms
6	Study Of The Access Pattern On Web Page
7	Research On The Mining Algorithm Based On Data Streams
8	Research On Frequent Pattern Mining Methods For Large-scale Data Stream
9	The Research And Realization Of Mining Frequent Patterns On Business Data Streams
10	The Research On The Related Problems Of Association Rule Mining