
Research On User Data Mining Of E-commerce Platform Based On Hadoop

Posted on: 2022-07-19
Degree: Master
Type: Thesis
Country: China
Candidate: Z B Wang
Full Text: PDF
GTID: 2518306566977109
Subject: Master of Engineering
Abstract/Summary:
The rapid growth of e-commerce has sharply increased the number of e-commerce platform users, and the amount of data these users generate is growing exponentially. How to mine this large volume of user data and derive information that benefits the development of the platform is a pressing problem. This thesis designs a collaborative strategy to improve the Apriori algorithm, combining a transaction item set optimization strategy with a binary (halving) strategy that reduces the number of iterations, and parallelizes the improved Apriori algorithm on the Hadoop platform. The improved algorithm is then applied to the design and implementation of a user behavior analysis system for an e-commerce platform.

The main work of this thesis is as follows. First, the related technologies of the Hadoop cluster and the traditional association rule algorithms are studied. Second, the thesis focuses on the low efficiency of the traditional Apriori algorithm when the data volume is large, and proposes improvement strategies for the problems the Apriori algorithm exhibits on the Hadoop platform: (1) The transactions in the database and the transaction items form a two-dimensional array; the transaction items in the transaction item set are quantified and the quantified values are summed, and the transaction items that do not satisfy the association rules are deleted, which shrinks the transaction database and reduces the time spent repeatedly scanning it. (2) Using the idea of binary search, after each iteration the length of the frequent itemsets to be computed in the next iteration is calculated. Unlike the traditional algorithm, which increases the itemset length by one per iteration, the binary strategy generates frequent itemsets in leaps. (3) The improved Apriori algorithm is combined with the Hadoop platform to achieve parallelization. Third, four sets of comparative experiments were conducted on the improved Apriori algorithm based on the Hadoop platform. The results of every group showed that the improved algorithm runs more efficiently and in less time when processing massive data.

This thesis uses JavaScript and HTML to build the front end of the user behavior analysis system of the e-commerce platform. The improved Hadoop-based Apriori algorithm is applied in the system to analyze the degree of association between products under multiple user behavior patterns.
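Improvements (1) and (2) in the abstract can be illustrated with a small single-machine sketch. It is one possible reading of the description, not the thesis's implementation: the function names, the sample transactions, and the support threshold are all assumptions made for illustration, candidate itemsets are enumerated by brute force rather than by Apriori's join step, and the Hadoop parallelization is omitted. Transactions and items form a 0/1 array, column sums give item support, row sums give the number of remaining items per transaction, and a binary search over the itemset length relies on the fact that every subset of a frequent itemset is also frequent.

```python
from itertools import combinations


def build_matrix(transactions, items):
    """Rows are transactions, columns are items; a cell is 1 if the item occurs."""
    return [[1 if item in t else 0 for item in items] for t in transactions]


def prune(matrix, items, min_support, k):
    """Drop item columns whose sum is below min_support, then drop
    transaction rows that hold fewer than k of the remaining items."""
    col_sums = [sum(row[j] for row in matrix) for j in range(len(items))]
    keep = [j for j, s in enumerate(col_sums) if s >= min_support]
    kept_items = [items[j] for j in keep]
    reduced = [[row[j] for j in keep] for row in matrix]
    reduced = [row for row in reduced if sum(row) >= k]
    return reduced, kept_items


def support(matrix, cols):
    """Count the rows that contain every column index in cols."""
    return sum(1 for row in matrix if all(row[j] for j in cols))


def frequent_itemsets(matrix, items, min_support, k):
    """Brute-force all k-item combinations (illustrative, not Apriori's join)."""
    return [
        tuple(items[j] for j in cols)
        for cols in combinations(range(len(items)), k)
        if support(matrix, cols) >= min_support
    ]


def max_frequent_length(matrix, items, min_support):
    """Binary search for the largest k that still has a frequent k-itemset.
    Valid because a frequent k-itemset implies frequent itemsets of every
    smaller length."""
    lo, hi, best = 1, len(items), 0
    while lo <= hi:
        mid = (lo + hi) // 2
        if frequent_itemsets(matrix, items, min_support, mid):
            best, lo = mid, mid + 1
        else:
            hi = mid - 1
    return best


# Hypothetical sample data, not taken from the thesis.
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"bread", "butter"},
    {"milk", "bread", "butter", "jam"},
    {"jam"},
]
items = sorted(set().union(*transactions))
matrix = build_matrix(transactions, items)

# Pruning before the search for 2-itemsets with a minimum support of 3.
reduced, kept_items = prune(matrix, items, min_support=3, k=2)
pairs = frequent_itemsets(reduced, kept_items, min_support=3, k=2)

# For the length search, only rows without any frequent item are dropped (k=1).
base, base_items = prune(matrix, items, min_support=3, k=1)
longest = max_frequent_length(base, base_items, min_support=3)

print(kept_items, len(reduced), pairs, longest)
```

In this sample, the infrequent item `jam` and the transaction that contains only `jam` are removed before any itemset is counted, so later scans touch a smaller array. The length search probes lengths 2 and 3 instead of stepping through every length in order.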
Keywords/Search Tags: e-commerce, data mining, Hadoop, Apriori, collaborative strategy