
Research And Implementation Of Clustering Algorithm For Massive Mobile Internet User Behavior

Posted on: 2020-04-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y K Lu
Full Text: PDF
GTID: 2428330575956524
Subject: Electronic and communication engineering
Abstract/Summary:
With the popularity of smartphones and the rapid development of mobile communication technology, the number of mobile Internet users is increasing rapidly, and the average time people spend on the mobile Internet is also growing. These users generate a large amount of behavior data as they browse. In-depth analysis and mining of this data can reveal the implicit patterns of user behavior, which is of great significance for the construction and planning of mobile Internet services. Clustering is often the first step in analyzing and mining unknown data, and choosing an efficient and accurate clustering algorithm for massive mobile Internet user behavior data is an active research topic.

The main contents of this thesis are as follows. Firstly, based on the collected behavior data of mobile Internet users, we make an overall analysis of the users and their Internet preferences. According to the analysis results, we classify the visited websites into 12 categories and construct a mobile Internet user behavior model based on the traffic, number, and time of requests for the 12 kinds of websites. Based on this model, a new mobile Internet user behavior dataset is generated. Secondly, we study three classical clustering algorithms. Because mobile Internet user behavior data is high-dimensional and large in volume, we parallelize the three algorithms on the distributed computing framework Spark. Thirdly, by comparing the external evaluation index, the internal evaluation index, and the distributed computing speedup (acceleration ratio) of the three algorithms, we choose the clustering algorithm with the best results. Based on this algorithm, we briefly analyze the behavior of mobile Internet users, and the algorithm is then integrated and put into practice. Fourthly, for the setting of continuous, high-speed online user behavior data streams, we study and implement a stream clustering algorithm on the distributed stream computing framework Spark Streaming, and we briefly analyze the parameter selection and clustering results of the algorithm.

The mobile Internet user behavior data in this thesis is all derived from real data from a province in China. The clustering analysis results are consistent with the actual situation, which is of great significance to the planning and development of mobile Internet services in this province.
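As an illustration of the user behavior model described in the abstract, the following is a minimal sketch of how per-user feature vectors might be assembled from request records. It is not the thesis's implementation. The record layout, the function name `build_feature_vectors`, the use of integer category indices, and the reading of "time of request" as a duration in seconds are all assumptions made for this example; only the 12 categories and the three measures (traffic, number, and time of requests) come from the abstract.

```python
from collections import defaultdict

NUM_CATEGORIES = 12  # from the abstract: websites are classified into 12 categories
# Assumed measure layout per category: traffic, number of requests, time.
MEASURES = ("traffic_bytes", "request_count", "duration_s")


def build_feature_vectors(records):
    """Aggregate request records into one fixed-length vector per user.

    records: iterable of (user_id, category_index, traffic_bytes, duration_s).
    This record layout is hypothetical; the thesis's actual schema is not
    given in the abstract.
    Returns a dict mapping user_id to a list of 12 * 3 = 36 floats.
    """
    width = NUM_CATEGORIES * len(MEASURES)
    vectors = defaultdict(lambda: [0.0] * width)
    for user_id, category, traffic_bytes, duration_s in records:
        if not 0 <= category < NUM_CATEGORIES:
            raise ValueError(f"category index out of range: {category}")
        base = category * len(MEASURES)
        vec = vectors[user_id]
        vec[base] += traffic_bytes   # total traffic for this category
        vec[base + 1] += 1           # number of requests for this category
        vec[base + 2] += duration_s  # total time for this category (assumed meaning)
    return dict(vectors)


if __name__ == "__main__":
    sample = [
        ("u1", 0, 100, 2.0),
        ("u1", 0, 50, 1.0),
        ("u2", 3, 7, 0.5),
    ]
    for user, vec in build_feature_vectors(sample).items():
        print(user, vec[:6])
```

Vectors of this shape could then be passed to a clustering algorithm. In practice the three measures have very different scales, so some normalization would likely be needed before computing distances.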
Keywords/Search Tags: mobile internet, user behavior analysis, clustering algorithms research, Spark