
An Exploration of Data Mining Techniques for Discovering Network User Behavior Patterns

Posted on: 2011-01-17 | Degree: Master | Type: Thesis
Country: China | Candidate: T F Li | Full Text: PDF
GTID: 2208360308466990 | Subject: Software engineering
Abstract/Summary:
Since the last decade of the 20th century, with the rapid development of information technology, the Internet has played an increasingly important role in people's daily lives. While providing a vast wealth of information resources, the Internet also brings great business opportunities, and a variety of Internet services and e-commerce activities have made tremendous progress. Because customer needs are business opportunities, discovering the rules and patterns of user access from the access information that web users leave on web servers has become one of the most important topics for e-commerce service providers and data mining researchers. This is the motivation for research on web log mining.

This thesis studies the main theory of, and algorithm improvements for, the critical procedures in the web log mining process, including tests of the algorithms on real data. The two most important stages of web log mining are data pretreatment (preprocessing) and pattern discovery. We first analyze each procedure of the data pretreatment stage. While summarizing the principle of each procedure, we propose a corresponding processing mechanism and algorithm implementation. With emphasis on user session identification, and to address the problem that the traditional method cannot adapt to different web access behavior, we propose a new user session identification algorithm and verify it by experiment.

In the pattern discovery stage, we analyze user access patterns by means of sequential pattern discovery. In analyzing the procedure of frequent sequence discovery, we focus on an improved algorithm derived from the Apriori algorithm, which is widely used in association rule mining. Based on an analysis of this algorithm, we point out its defect and propose a new method to solve the problem. We study the mechanism for constructing the factors used to modify the original algorithm, then propose our new algorithm and verify it in experiments.
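
The abstract does not give the details of either algorithm, so the following is only a minimal sketch of the two baseline ideas it refers to, not the thesis's improved methods. It assumes that the "traditional" session identification method is the common fixed inactivity timeout heuristic (30 minutes here, an assumed value), and it shows a plain Apriori-style, level-wise search for frequent contiguous page sequences. The record layout `(user_id, timestamp, url)` and the function names `split_sessions` and `frequent_sequences` are hypothetical.

```python
from collections import defaultdict


def split_sessions(records, timeout=1800):
    """Split each user's requests into sessions with a fixed inactivity timeout.

    records: iterable of (user_id, unix_timestamp_seconds, url) tuples
             (a hypothetical, already-cleaned log format).
    timeout: assumed cutoff in seconds; a gap longer than this starts a new session.
    """
    by_user = defaultdict(list)
    for user, ts, url in sorted(records, key=lambda r: (r[0], r[1])):
        by_user[user].append((ts, url))

    sessions = []
    for hits in by_user.values():
        current = [hits[0][1]]
        for (prev_ts, _), (ts, url) in zip(hits, hits[1:]):
            if ts - prev_ts > timeout:
                sessions.append(current)
                current = []
            current.append(url)
        sessions.append(current)
    return sessions


def frequent_sequences(sessions, min_support):
    """Apriori-style level-wise search for frequent contiguous page sequences.

    Support of a sequence = number of sessions that contain it contiguously.
    Returns a dict mapping each frequent sequence (a tuple of URLs) to its support.
    """
    def support(seq):
        n = len(seq)
        return sum(
            any(tuple(s[i:i + n]) == seq for i in range(len(s) - n + 1))
            for s in sessions
        )

    pages = sorted({p for s in sessions for p in s})
    level = {}
    for p in pages:
        count = support((p,))
        if count >= min_support:
            level[(p,)] = count
    frequent_pages = [seq[0] for seq in level]

    result = {}
    while level:
        result.update(level)
        next_level = {}
        for seq in level:
            for p in frequent_pages:
                cand = seq + (p,)
                # Apriori pruning: the suffix of a frequent sequence must
                # itself be frequent, so skip candidates whose suffix is not.
                if cand[1:] not in level:
                    continue
                count = support(cand)
                if count >= min_support:
                    next_level[cand] = count
        level = next_level
    return result
```

Calling `split_sessions` on cleaned log records and passing the resulting sessions to `frequent_sequences` with a chosen `min_support` yields the frequent access paths. The fixed cutoff is what makes this baseline insensitive to differences between sites and users, which is the kind of limitation the abstract attributes to the traditional method.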
Keywords/Search Tags: Web log mining, Data pretreatment, Sequential patterns discovery, User session identification