Research on the Pattern Analysis and Application of Massive Behavioral Sequence Data

Posted on: 2018-10-20; Degree: Master; Type: Thesis
Country: China; Candidate: L Feng; Full Text: PDF
GTID: 2359330512983296; Subject: Management Science and Engineering
Abstract/Summary:
Time series data are collections of objects or behaviors ordered by time. A large share of the time series data in practical applications is generated by systems that record user behavior. Analyzing the behavior patterns behind these data helps to identify users and system models and to predict future states. However, time series data that characterize user behavior are large in volume and complex in form, which makes them difficult to process. This thesis aims to extract useful behavior patterns from user time series data, proposes solutions to the problems encountered in that process, and applies these solutions to real large-scale bus data with good results.

The thesis follows the idea of "mining patterns and using the patterns to predict the future": it mines useful behavior patterns from user time series data and uses them to predict the future behavior of small-scale groups and the future state of complex groups (systems). To this end, it addresses three key problems.

First, time series representation. Because user behavior is diverse, traditional representations of user behavior suffer from many problems, such as the curse of dimensionality. This thesis proposes a content-structure representation of user behavior patterns, which uses clustering and a greedy algorithm to express a user's original time series data as a very simple sequence. The representation retains the main information of the original sequence and facilitates the subsequent mining of behavior patterns.

Second, similarity measurement for time series patterns. Because user behavior sequence data cover a large number of objects, measuring the similarity between sequences is essential. However, the multidimensional and non-isometric (unequal-length) nature of the behavior sequences of different objects makes this difficult. Building on the content-structure representation, the thesis measures the similarity of different users' behavior sequences in terms of behavior preference and preference distribution. Experiments show that this similarity measure performs well.

Third, prediction of group user behavior. The trend of group behavior is predicted from both a micro and a macro perspective. At the micro level, the behavior trend of a small-scale group is predicted using the representation and similarity measure based on the content-structure model. At the macro level, the behavior sequence data of a complex group (system) are mapped into phase space by phase space reconstruction, and a prediction algorithm is built on that space. For the search for similar states in phase space, the thesis proposes an improved K-nearest-neighbor algorithm, which greatly improves prediction accuracy.

The main contribution of the thesis lies in three aspects. First, a new content-structure representation and similarity measure for user behavior sequences are proposed. Second, an effective method for predicting the behavior of small-scale groups and complex groups is proposed from the micro and macro perspectives. Third, the improved K-nearest-neighbor algorithm is used to improve the accuracy of behavior prediction for complex groups. The experimental results show that the proposed methods can effectively analyze and predict the sequence behavior patterns of small-scale groups and complex groups (systems). The methods were also applied to the mining of real bus data with good results, and they offer useful guidance for improving the transport system.
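
The macro-level step described in the abstract (reconstructing a phase space from a behavior sequence and then predicting from similar states found in that space) can be illustrated with a minimal sketch. The abstract does not specify the thesis's improved algorithm, so the code below uses a standard time-delay embedding with plain, unweighted k-nearest-neighbor averaging instead. The function names `delay_embed` and `knn_forecast` and the parameters `dim`, `tau`, and `k` are assumptions made for illustration, not the author's method.

```python
import math


def delay_embed(series, dim, tau):
    """Time-delay embedding: map a scalar series to vectors
    [x[t], x[t + tau], ..., x[t + (dim - 1) * tau]]."""
    span = (dim - 1) * tau
    return [
        [series[t + j * tau] for j in range(dim)]
        for t in range(len(series) - span)
    ]


def knn_forecast(series, dim=3, tau=1, k=3):
    """Predict the next value of `series` as the mean successor of the
    k delay vectors closest to the most recent one (plain kNN; this is
    NOT the improved variant mentioned in the abstract)."""
    span = (dim - 1) * tau
    vectors = delay_embed(series, dim, tau)
    query = vectors[-1]  # most recent reconstructed state

    # Every earlier delay vector has a known successor in the series.
    candidates = []
    for t, vec in enumerate(vectors[:-1]):
        distance = math.dist(vec, query)   # Euclidean distance
        successor = series[t + span + 1]   # value that followed this state
        candidates.append((distance, successor))

    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    return sum(successor for _, successor in nearest) / len(nearest)


# Usage on a hypothetical periodic series (made-up data, not from the thesis).
ridership = [1, 2, 3, 4] * 5
print(knn_forecast(ridership, dim=2, tau=1, k=3))
```

In practice the embedding dimension and delay would be chosen from the data rather than fixed by hand, and the neighbors could be weighted by distance; both choices are left out here to keep the sketch short.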
Keywords/Search Tags: behavior sequence data, time series data representation, similarity, phase space reconstruction, time series prediction