Research On Clustering Algorithm And Application Of Panel Data

Posted on: 2017-08-06
Degree: Master
Type: Thesis
Country: China
Candidate: D Q Hou
GTID: 2348330503995670
Subject: Management Science and Engineering
Abstract/Summary:
Panel data is a data set composed of one or more indicators observed for several samples at different time points; it is multidimensional data that combines the features of cross-sectional data and time series data. Because of this structure, panel data can make full use of information across both time and indicators, which enables researchers to obtain comprehensive sample information from different perspectives and periods. The arrival of the big data era and the growing interaction between disciplines have increased the use of panel data clustering analysis. Classical clustering methods cannot be applied directly to panel data, and the related methods still need further study. Building on existing research results and the characteristics of panel data, this thesis studies clustering methods for panel data and their application. The main content is as follows:
(1) Based on a summary of existing panel data clustering methods, the time series characteristics of multi-index panel data are analyzed, and the problems of using Euclidean distance to cluster panel data are studied.
(2) A feature extraction method for multi-index panel data is proposed, which defines several statistics: Absolute Quantity Feature, Variance Feature, Skewness Feature, Kurtosis Feature and Trend Feature. On the basis of these statistics, the similarity of clustering objects is measured in terms of index value, development trend, volatility and distribution.
(3) The dynamic clustering algorithm is combined with the panel data clustering model, and the steps of the clustering algorithm based on feature extraction are proposed. The clustering results are evaluated using the intra-class and inter-class distances and the standard deviation of sample distances within each cluster.
(4) Using the multi-index panel data clustering model proposed in this thesis, the traffic safety situation in China over the last 10 years is analyzed. The traffic safety situation is classified into 5 classes, the characteristics of each class are described, and corresponding countermeasures and suggestions are put forward.
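The feature-based approach summarized in items (2) and (3) can be illustrated with a minimal sketch. The abstract does not give the exact definitions of the five statistics or of the dynamic clustering algorithm, so everything below is an assumption: the features are taken to be the time mean, population variance, moment-based skewness, excess kurtosis and least-squares slope of each indicator series, and "dynamic clustering" is read as a k-means-style iterative reassignment. The function names (`series_features`, `extract_features`, `standardize`, `kmeans`, `cluster_distances`) and the toy numbers are hypothetical and do not come from the thesis.

```python
import math
import random

def series_features(values):
    """Five assumed features of one indicator series (needs >= 2 time points)."""
    t = len(values)
    mean = sum(values) / t                      # absolute quantity (level)
    dev = [v - mean for v in values]
    var = sum(d * d for d in dev) / t           # volatility
    if var > 1e-12:
        std = math.sqrt(var)
        skew = sum(d ** 3 for d in dev) / t / std ** 3
        kurt = sum(d ** 4 for d in dev) / t / std ** 4 - 3.0
    else:                                       # constant series: no shape information
        skew = kurt = 0.0
    tc = [i - (t - 1) / 2 for i in range(t)]    # centered time index
    slope = sum(d * c for d, c in zip(dev, tc)) / sum(c * c for c in tc)  # trend
    return [mean, var, skew, kurt, slope]

def extract_features(panel):
    """panel[sample][indicator] is a list over time; returns one flat vector per sample."""
    return [[f for series in sample for f in series_features(series)]
            for sample in panel]

def standardize(rows):
    """Z-score each feature column so that no single feature dominates the distance."""
    out_cols = []
    for col in zip(*rows):
        mu = sum(col) / len(col)
        sd = math.sqrt(sum((v - mu) ** 2 for v in col) / len(col))
        sd = sd if sd > 1e-12 else 1.0
        out_cols.append([(v - mu) / sd for v in col])
    return [list(r) for r in zip(*out_cols)]

def kmeans(rows, k, n_iter=100, seed=0):
    """Plain k-means on feature vectors (an assumed stand-in for dynamic clustering)."""
    rng = random.Random(seed)
    centers = [list(r) for r in rng.sample(rows, k)]
    labels = []
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda j: math.dist(r, centers[j])) for r in rows]
        new_centers = []
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            new_centers.append([sum(c) / len(members) for c in zip(*members)]
                               if members else centers[j])
        if new_centers == centers:              # assignments have stabilized
            break
        centers = new_centers
    return labels, centers

def cluster_distances(rows, labels, centers):
    """Mean sample-to-center distance and mean center-to-center distance (k >= 2)."""
    within = sum(math.dist(r, centers[lab]) for r, lab in zip(rows, labels)) / len(rows)
    k = len(centers)
    pairs = [math.dist(centers[i], centers[j]) for i in range(k) for j in range(i + 1, k)]
    return within, sum(pairs) / len(pairs)

# Toy panel (made-up numbers): 4 samples, 1 indicator, 6 time points.
toy_panel = [
    [[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]],
    [[2.0, 3.0, 4.0, 5.0, 6.0, 7.0]],
    [[30.0, 30.0, 30.0, 30.0, 30.0, 30.0]],
    [[31.0, 31.0, 31.0, 31.0, 31.0, 31.0]],
]
features = standardize(extract_features(toy_panel))
toy_labels, toy_centers = kmeans(features, k=2)
print(toy_labels, cluster_distances(features, toy_labels, toy_centers))
```

Clustering on extracted features rather than on raw Euclidean distance between series lets two samples with similar levels but opposite trends end up in different classes, which is the motivation stated in item (1). A smaller within-cluster distance relative to the between-cluster distance indicates a tighter partition; the thesis additionally uses the standard deviation of sample distances within each cluster, which is not reproduced here.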
Keywords/Search Tags: Panel data, Clustering, Feature Extraction, Traffic Safety