Research And Implementation Of Tourist Identification System Based On Telecom Big Data

Posted on: 2020-11-04
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Ma
Full Text: PDF
GTID: 2428330575456540
Subject: Electronic and communication engineering
Abstract/Summary:
With the rapid development of the mobile Internet and the growing popularity of mobile devices, a large amount of trajectory data containing useful information is being generated, analyzed and utilized. GPS data has been widely studied because of its high precision, but it is difficult to obtain because of privacy concerns, and the available data volume is small. In comparison, telecom data has wide coverage and spans many user groups, which makes it helpful for analyzing and studying the behavior of mobile Internet users. This thesis studies the call detail data within telecom data, analyzes users' trajectories and behavior, and on that basis classifies and identifies users' identities. The main work of this thesis is as follows:

(1) User activity semantics are identified from call detail record (CDR) data, and user activity sequences are clustered and analyzed. A hidden Markov model, combined with local geographic information, is used to analyze and identify the activities represented by users' CDR data and to generate users' activity sequences. Users are then clustered with the self-organizing map algorithm according to their daily activity characteristics. The identities of the users in each cluster are analyzed in detail according to the actual situation. In the end, users are divided into four categories: office workers, the self-employed, tourists and unemployed residents.

(2) A user classification rule model is generated from user activity sequences. Users' class labels are derived from their clustering results. The classification rule model is generated by applying the Mining Sequential Classification Rules algorithm to the labeled user activity sequences. Because the user activity sequence data carries time characteristics, the original algorithm is improved so that it generates more accurate classification rules for sequences containing time information.

(3) The framework of a user identification system is designed and implemented against the background of tourism computing. Starting from the original CDR data, combined with local POI data, the system classifies and recognizes user identity through cooperation between its modules. The system also displays the statistical characteristics of the identification results through visual interaction on a web page built with Django, including the distribution of user identity categories, the statistical characteristics of the different user categories, and the distribution, within each category, of the user identity classification rules based on activity sequences.
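
As an illustration of the hidden Markov model step in item (1), the following is a minimal sketch of Viterbi decoding, which recovers the most likely activity sequence from a sequence of observed POI categories. It is not the thesis's implementation: the activity states, the POI categories, every probability value, and the function name `viterbi` are hypothetical and chosen only to make the example runnable.

```python
import math

# Hypothetical hidden activity states (not taken from the thesis).
STATES = ["home", "work", "leisure"]

# Hypothetical model parameters. Each observation is assumed to be the
# dominant POI category around the cell tower that served a CDR.
START = {"home": 0.6, "work": 0.3, "leisure": 0.1}
TRANS = {
    "home": {"home": 0.6, "work": 0.3, "leisure": 0.1},
    "work": {"home": 0.2, "work": 0.7, "leisure": 0.1},
    "leisure": {"home": 0.3, "work": 0.1, "leisure": 0.6},
}
EMIT = {
    "home": {"residential": 0.8, "office": 0.1, "scenic": 0.1},
    "work": {"residential": 0.1, "office": 0.8, "scenic": 0.1},
    "leisure": {"residential": 0.1, "office": 0.1, "scenic": 0.8},
}


def viterbi(observations):
    """Return the most likely hidden activity sequence for the observations."""
    if not observations:
        return []

    # Log-probabilities avoid numeric underflow on long sequences.
    scores = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
        for s in STATES
    }
    backpointers = []

    for obs in observations[1:]:
        new_scores = {}
        pointers = {}
        for s in STATES:
            # Best predecessor state for reaching state s at this step.
            best_prev = max(
                STATES, key=lambda p: scores[p] + math.log(TRANS[p][s])
            )
            new_scores[s] = (
                scores[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][obs])
            )
            pointers[s] = best_prev
        scores = new_scores
        backpointers.append(pointers)

    # Trace the best path backwards from the best final state.
    path = [max(STATES, key=lambda s: scores[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

With these assumed parameters, `viterbi(["residential", "office", "office", "scenic"])` returns `['home', 'work', 'work', 'leisure']`. In a real pipeline the parameters would be estimated from data rather than set by hand, and the resulting daily activity sequences would feed the clustering and rule-mining steps described above.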
Keywords/Search Tags:user trajectory, call detail records, activity sequence, self-organizing map, user classification rule