User Clustering Research Based On Cellular Network Data

Posted on:2020-03-23

Degree:Master

Type:Thesis

Country:China

Candidate:Z Yuan

Full Text:PDF

GTID:2428330572976410

Subject:Information and Communication Engineering

Abstract/Summary:

With the development of the mobile Internet, the volume of data in cellular networks is exploding, while telecom operators' revenue from traditional services such as voice is shrinking and user stickiness to their own products is declining. How to use the massive user data in cellular networks to mine valuable user behavior patterns and build appropriate user behavior feature models, so as to optimize users' product experience and improve marketing accuracy, has become a hot research topic in recent years. As an unsupervised learning technique, clustering is well suited to exploring hidden patterns in data. Based on massive cellular network service data collected from operators, this thesis studies user clustering in the time dimension and in the space dimension. The main work of the thesis includes:

First, the extraction of heavy users. By drawing the Lorenz curve of user traffic, this thesis finds that traffic usage is very unevenly distributed across users: about 21.2% of users, the heavy users, consume 81.85% of the traffic. The thesis then extracts the heavy users who have an important influence on the cellular network and compares them with average users in terms of traffic usage, active duration, number of services, and mobility. The results show that heavy users exceed average users in traffic usage, active duration, and number of services, while there is no significant difference in mobility between heavy users and ordinary users.

Second, clustering of users in the time dimension to explore the traffic usage patterns of heavy users. This thesis studies user clustering in the time dimension and a feature vector that can represent a user's preference for different time periods. The feature vector is created by dividing the 24 hours of a day into five time periods according to daily routines, calculating the ratio of the traffic in each period to the traffic of the whole day, and dividing that ratio by the number of hours in the period. The K-means algorithm is then selected for clustering, and the optimal cluster number K is determined to be 4 according to three evaluation indicators. The four types of users use more traffic during the bedtime, leisure, work, and commuting/dining periods respectively, which provides a reference for operators to optimize their networks and target their marketing. The results show that the clustering process and the feature vector scheme proposed in this thesis can effectively mine the traffic usage patterns of different users.

Third, clustering of user groups in the spatial dimension to discover user groups with potentially high value. This thesis studies the clustering of user groups in the spatial dimension and feature vectors that can represent the value of a user group. Users are grouped by base station: users who share the same most-used base station (the base station at which a user consumes the most traffic during a period of time) are placed in the same group. The group feature vector is created by discretizing three continuous attributes of each user in the group: data traffic, mobility (number of visited base stations), and number of service types. Using the discretization results, each user is projected into a subspace of the three-dimensional space, and the proportion of the group's users falling in each subspace is used as the feature vector of the group. Evaluation showed that this scheme did not give good clustering results. After ruling out an improper choice of clustering algorithm and parameter settings as the cause, the feature vector is re-created using only two dimensions, data traffic and mobility (the number of visited base stations), and hotspot base stations are extracted for study. Evaluation of the clustering results finds that the optimal cluster number K is 3 or 4. Analysis of the clustering results identifies a user group containing potential high-value users. The results show that the clustering process and the feature vector scheme proposed in this thesis can effectively mine user groups with potential high-value users.
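
The time-dimension feature vector described in the abstract can be illustrated with a minimal Python sketch. The abstract states only that the day is split into five periods; the period boundaries in `PERIODS`, the function name `time_feature_vector`, and the 24-value hourly input format are assumptions made for illustration and are not taken from the thesis.

```python
from typing import List, Sequence, Tuple

# Hypothetical period boundaries as (start_hour, end_hour), end exclusive.
# The abstract gives the number of periods (five) but not the cut points,
# so these values are assumptions for illustration only.
PERIODS: List[Tuple[int, int]] = [(0, 7), (7, 9), (9, 12), (12, 18), (18, 24)]


def time_feature_vector(hourly_traffic: Sequence[float]) -> List[float]:
    """Build one user's time-preference feature vector.

    For each period: (period traffic / whole-day traffic) / period length in hours.
    """
    if len(hourly_traffic) != 24:
        raise ValueError("expected 24 hourly traffic values")

    total = sum(hourly_traffic)
    if total == 0:
        # A user with no traffic has no time preference; return zeros.
        return [0.0] * len(PERIODS)

    features = []
    for start, end in PERIODS:
        share = sum(hourly_traffic[start:end]) / total  # ratio to the whole day
        features.append(share / (end - start))          # normalize by period length
    return features
```

Dividing by the period length keeps long periods from dominating the vector merely because they span more hours: a user with perfectly uniform traffic gets the same value in every component. Vectors built this way, one per user, could then be passed to any K-means implementation.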

Keywords/Search Tags:

Cellular Network, Service Data, User Clustering, Feature Engineering

Related items

1	Spatio-temporal Analysis And User Interest Mining Based On Cellular Network Data
2	Analysis And Application Of User Mobility Based On Cellular Data Network Traffic
3	Research And Application Of User Clustering Method Based On Mixed Type Data Analysis
4	Research On Cellular Network Resource Allocation Based On Service Characteristics
5	Network Users Based On Web Log Clustering And Implementation
6	Research And Application On Data Clustering Using Cellular Automata Based On Cell Clustering
7	A Methodology for Quality of Service Evaluation in 4th Generation (4G) Long Term Evolution (LTE) of Cellular Data Networks
8	User Clustering And KQI Analysis In Cellular Networks
9	Constructing A Network Data Oriented Mining And Visual Analyzing System Based On Service Engineering Concept
10	User Behavior Analysis And Prediction Based On Big Data In Cellular Network