Research And Application Of Projection Position-Based Sequential Pattern Mining Algorithm

Posted on: 2013-01-13
Degree: Master
Type: Thesis
Country: China
Candidate: W N Wang
GTID: 2248330374997712
Subject: Computer software and theory
Abstract/Summary:
With the rapid development and wide application of information technology, the Web has become the main channel for exchanging and acquiring information in work, study, and daily life. Web logs record a large amount of information about user access, and how to analyze and make use of this mass of data has become a research hotspot in data mining. Sequential pattern mining finds frequent sequences in time-ordered data. It has developed quickly in recent years and is widely used in Web log analysis, customer purchase behavior prediction, disease diagnosis, natural disaster prediction, DNA sequence analysis, and other areas. To address the problem of mining massive Web logs, this thesis studies sequential pattern mining and its related issues. The main work is as follows.

(1) The thesis introduces the background and the current state of research on sequential pattern mining in China and abroad, reviews the most representative sequential pattern mining algorithms, and analyzes their existing problems.

(2) The thesis analyzes the PrefixSpan algorithm and finds that it may generate a huge number of projected databases and scan non-frequent items while mining sequential patterns, especially on dense datasets and long sequential patterns, which degrades its performance. To address this resource problem, the thesis proposes the Projection position-based Sequential Pattern Mining algorithm (PSPM), which aims to reduce time and memory cost. The algorithm is validated on common UCI datasets, and its performance is analyzed and compared with that of other algorithms. The experimental results show that, compared with the other algorithms, the proposed PSPM algorithm has better feasibility and scalability.

(3) The thesis analyzes the characteristics of Web log data and, based on them, extends PSPM to PSPM_WEB and applies it to Web log mining in support of personalized information services and intelligent Web site construction. Sequential pattern analysis uncovers users' behavior patterns on a Web site, which can be used to predict users' page access behavior and to build a more efficient site structure. As a result, users can find information more easily, and better social and economic value can be obtained.

Therefore, studying and proposing efficient sequential pattern mining algorithms and applying them to Web log mining has scientific significance and academic value.
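
To make the idea of position-based projection concrete, the following is a minimal Python sketch. It is not the thesis's PSPM algorithm, whose details are not given in this abstract. It is a simplified PrefixSpan that uses pseudo-projection: each projected database is kept as a list of (sequence index, start position) pairs instead of copied suffixes, and infrequent items are never extended. The function name `prefixspan`, the restriction to sequences of single items, and the `min_support` parameter are assumptions made for illustration.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Mine frequent sequential patterns from sequences of single items.

    Illustrative sketch only: a projected database is a list of
    (sequence index, start position) pairs, so no suffix is copied.
    """
    patterns = []

    def mine(prefix, projections):
        # For every item, record where its first occurrence in each
        # projected suffix ends; that is the next projection point.
        occurrences = defaultdict(list)
        for seq_id, start in projections:
            seq = sequences[seq_id]
            seen = set()
            for pos in range(start, len(seq)):
                item = seq[pos]
                if item not in seen:
                    seen.add(item)
                    occurrences[item].append((seq_id, pos + 1))

        for item in sorted(occurrences):
            next_projections = occurrences[item]
            # Each sequence contributes at most one entry, so the list
            # length is the support of prefix + [item].
            if len(next_projections) < min_support:
                continue  # infrequent extensions are pruned here
            pattern = prefix + [item]
            patterns.append((pattern, len(next_projections)))
            mine(pattern, next_projections)

    mine([], [(i, 0) for i in range(len(sequences))])
    return patterns
```

For example, calling `prefixspan([["a", "b", "c"], ["a", "c"], ["b", "c"]], 2)` returns each frequent pattern together with its support, such as `["a", "c"]` with support 2. Storing positions rather than suffix copies is what keeps the memory cost of each recursion level proportional to the number of supporting sequences.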
Keywords/Search Tags: Data mining, Sequential pattern, PrefixSpan, Projected position, Web logs mining