
Research On Association Rule Mining Algorithm Based On User Behavior Analysis

Posted on: 2018-12-21, Degree: Master, Type: Thesis
Country: China, Candidate: B Cao, Full Text: PDF
GTID: 2358330518959694, Subject: Software engineering
Abstract/Summary:
With the spread of mobile smart devices, social network services have become a main medium for acquiring information, disseminating information, and entertainment. As user behavior data grows explosively, its multi-source, high-dimensional, and sparse nature makes user behavior analysis complex and challenging. However, the rules hidden in these data sets can reveal how social network structure evolves and how information spreads. How to process massive user behavior data sets effectively and extract the underlying rules on a distributed computing platform has therefore become a promising research topic. Moreover, as data accumulates and user preferences and interests shift, most association rules change dynamically over time. To describe the temporal rules in user behavior data, and to address the high computational complexity, heavy I/O load, and data storage problems caused by the frequent database scans of traditional user association analysis methods, this thesis carries out the following work:

(1) We summarize the principles and algorithmic processes of user behavior analysis and association rule mining. We also introduce the architecture, working mechanism, and deployment process of the distributed computing platform Hadoop and the parallel computing framework Spark, and build our user association analysis on this basis.

(2) Generating association rules requires a suitable threshold, but choosing one demands a certain amount of prior knowledge. For this reason, we propose, design, and implement the PTFP-Apriori algorithm based on Spark's in-memory mechanism, in which the optimal threshold is obtained from a pattern tree. The algorithm not only relieves the storage and I/O load of iterative operations on massive data but also yields an appropriate threshold. The experimental results show that the algorithm achieves higher efficiency and scalability on fast-growing data than the traditional sequential rule mining algorithm.

(3) To improve the accuracy and validity of association rules and to reduce redundant computation in the connection operation, we obtain temporal rules by combining a time constraint factor with a time decay function. We use the attribute reduction method to reduce the dimensionality of the user behavior data set, which lowers the computational scale and I/O load. The experimental results show that the algorithm achieves higher efficiency and scalability on fast-growing data than the traditional sequential rule mining algorithm.

This thesis starts from the limitations of traditional user behavior analysis methods when applied to large-scale data sets. We optimize the factors that influence the accuracy, validity, and scalability of association rule mining. Finally, we use the Spark framework to parallelize the algorithms and verify their performance on a real data set. The results have theoretical and practical value for user behavior analysis and offer a new method for analyzing user behavior in social networks.
Keywords/Search Tags:User Behavior Analysis, Association Rules, Attribute Reduction, Spark Parallel Computing Framework, Temporal Rules
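
The temporal-rule idea in item (3), combining a time constraint with a time decay function, can be illustrated with a minimal sketch. The abstract does not give the decay function or the constraint used in the thesis, so everything below is an assumption for illustration only: an exponential half-life decay, a maximum-age cutoff as the time constraint, and the hypothetical names `decay_weight`, `weighted_support`, and `rule_confidence`. It runs in plain Python on a single machine and does not attempt to reproduce PTFP-Apriori or its Spark parallelization.

```python
def decay_weight(age_days, half_life_days=30.0):
    """Assumed exponential decay: a transaction's weight halves every half_life_days."""
    return 0.5 ** (age_days / half_life_days)


def weighted_support(transactions, itemset, now, half_life_days=30.0, max_age_days=None):
    """Time-decayed support of `itemset`.

    transactions: list of (timestamp_in_days, set_of_items) pairs.
    max_age_days: assumed form of the time constraint; older transactions are ignored.
    """
    itemset = frozenset(itemset)
    total = 0.0  # sum of weights of all admissible transactions
    hit = 0.0    # sum of weights of admissible transactions containing itemset
    for timestamp, items in transactions:
        age = now - timestamp
        if age < 0 or (max_age_days is not None and age > max_age_days):
            continue  # outside the time window
        weight = decay_weight(age, half_life_days)
        total += weight
        if itemset <= items:
            hit += weight
    return hit / total if total else 0.0


def rule_confidence(transactions, antecedent, consequent, now, **kwargs):
    """Time-decayed confidence of the rule antecedent -> consequent."""
    support_a = weighted_support(transactions, antecedent, now, **kwargs)
    support_ab = weighted_support(
        transactions, set(antecedent) | set(consequent), now, **kwargs
    )
    return support_ab / support_a if support_a else 0.0


if __name__ == "__main__":
    # Toy data: timestamps are day numbers, items are hypothetical user actions.
    log = [
        (0, {"follow", "like"}),
        (30, {"follow"}),
        (60, {"follow", "like"}),
    ]
    print(weighted_support(log, {"follow", "like"}, now=60))
    print(rule_confidence(log, {"follow"}, {"like"}, now=60, max_age_days=30))
```

Under these assumptions, recent transactions dominate both support and confidence, so a rule that held only in old data fades out gradually instead of being kept or dropped by a single static threshold.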