
Research On Key Application Technologies Of Mobile Communication Data Mining

Posted on: 2016-07-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z P Liu
GTID: 1108330503476015
Subject: Computer application technology
Abstract/Summary:
With the continuous development of mobile internet intelligent terminals, mobile networks and application services, the capacity for mobile data collection and transmission has greatly improved, and the number of mobile data applications is increasing. Typical applications include community detection, disaster rescue, location prediction and sentiment analysis. These applications are gradually changing people's daily lives, and in recent years mobile data mining has become a hot topic in the data mining field.

This dissertation focuses on mobile data mining algorithms, covering malicious phone call detection, mobile social influence computation, mobile trajectory outlier detection, password strength analysis, and the implementation of mobile communication data storage. Its purpose is to use collected mobile data to provide people with better mobile internet services. The main contributions of this dissertation are summarized as follows:

1. Malicious phone call detection is studied. We propose a malicious communication node detection algorithm, Call Log Rank (CLRank), which takes a call log file as input and analyzes the characteristics of mobile call patterns. The algorithm treats the mining of malicious communication nodes as a classical classification problem. It partitions the call log into mobile call log segments based on a specified time interval, builds a time-based communication social network for each segment, and detects potential malicious nodes with a ranking and classification method. Compared with existing methods, our model uses only link information and thus protects user privacy to the maximum extent. Experimental results show that CLRank can detect malicious nodes from call logs automatically, dynamically and effectively.

2. Computation of mobile social influence is studied. We propose a Time-based Influence Graph (TIG) model for SMS log data.
It is a global node influence computation algorithm that transforms SMS log data into a contact sequence, takes the temporal information of the SMS data into consideration, and computes influence among mobile nodes with a dynamic, time-based ranking algorithm. We also propose an EdgeRank-based Influence Graph (ERIG) model for call log data. It is a local node influence computation algorithm that transforms call log data into an interval graph, calculates time-based influence values dynamically, and ranks the influence of mobile nodes accordingly. Experimental results show that TIG and ERIG can calculate mobile node influence automatically and efficiently.

3. Mobile trajectory outlier detection is studied. Existing trajectory outlier detection algorithms adopt a distance-based method to calculate the distance between trajectory partitions. The user needs to specify a global distance threshold, and applying these algorithms to locally dense trajectories results in poor performance. Besides, these algorithms are extremely sensitive to parameter values. We propose a Density-based Mobile Trajectory Outlier Detection (DMTOD) algorithm, which is composed of two stages: partition and detection. During the partition stage, each trajectory is partitioned into a set of t-partitions using the trajectory partition algorithm. During the detection stage, a density-based trajectory outlier detection algorithm is applied to compute the results. Experimental results show that DMTOD performs mobile trajectory outlier detection better than previous algorithms.

4. Password setting habits of Chinese mobile internet applications are studied. A large-scale study of the network password habits of Chinese mobile internet application users has been lacking. We collect over 20 million passwords of Chinese network users that were disclosed by network intruders, and analyze password setting habits through statistical and machine learning methods.
Currently, there is no password dictionary for the security evaluation of Chinese mobile network applications. We propose a Training set Extension Based Dictionary (TEBD) to generate password dictionaries. TEBD utilizes Probabilistic Context-Free Grammar (PCFG), constructs a four-tier Training Set Distribution Tree (TSDT), and uses a genetic-operator-based algorithm to generate a new password dataset. Experimental results show that this algorithm is an efficient user dictionary generation algorithm for testing the security level of user passwords.

5. Mobile communication data storage on a PC is studied. It is possible to store mobile communication data on a PC when the amount of data does not reach PB scale; this approach is easy to implement and low in cost. We construct and optimize a Mobile Communication Data Storage (MCDS) platform on a PC. MCDS is based on GraphChi and improves GraphChi in three aspects: data format, sharding mechanism and memory replacement algorithm. Experimental results verify the effectiveness of MCDS, which provides a feasible experimental environment for mobile communication data mining.
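
To make the call-log preprocessing in contribution 1 more concrete, the following is a minimal Python sketch of splitting a call log into fixed-width time segments and building one directed call graph per segment. It is an illustration only, not the CLRank implementation: the record layout `(caller, callee, timestamp)`, the function names `segment_call_log`, `build_call_graph` and `out_degree`, and the choice of out-degree as a link-only feature are all assumptions made here, and the ranking and classification steps of CLRank are not reproduced.

```python
from collections import defaultdict


def segment_call_log(records, interval_seconds):
    """Group (caller, callee, timestamp) records into fixed-width time segments.

    Returns a dict mapping segment index -> list of records. Segment 0 starts
    at the earliest timestamp in the log (an assumption of this sketch).
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if not records:
        return {}
    start = min(ts for _, _, ts in records)
    segments = defaultdict(list)
    for caller, callee, ts in records:
        index = int((ts - start) // interval_seconds)
        segments[index].append((caller, callee, ts))
    return dict(segments)


def build_call_graph(segment):
    """Build a directed weighted graph: graph[caller][callee] = number of calls."""
    graph = defaultdict(lambda: defaultdict(int))
    for caller, callee, _ in segment:
        graph[caller][callee] += 1
    return {caller: dict(callees) for caller, callees in graph.items()}


def out_degree(graph):
    """Count distinct callees per caller, a feature that uses link information only."""
    return {caller: len(callees) for caller, callees in graph.items()}


# Hypothetical toy log: timestamps are seconds, node names are placeholders.
example_log = [
    ("A", "B", 0),
    ("A", "C", 10),
    ("A", "D", 50),
    ("B", "A", 70),
    ("A", "B", 130),
]
example_segments = segment_call_log(example_log, interval_seconds=60)
example_graphs = {i: build_call_graph(s) for i, s in example_segments.items()}
```

Only caller and callee identifiers and timestamps are used, which matches the abstract's point that the model relies on link information rather than call content. Any per-segment score (such as the out-degree above) would then feed whatever ranking and classification method is chosen.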
Keywords/Search Tags: mobile data, phone scam detection, social influence, position-based mining, password security, Mobile Communication Data Storage