Research On Time Series Data Mining And Its Application

Posted on:2016-05-10

Degree:Master

Type:Thesis

Country:China

Candidate:B F Zheng

Full Text:PDF

GTID:2308330461452662

Subject:Control Engineering

Abstract/Summary:

In recent years, with the rapid development of the Internet and information technology, time series data has been generated in large quantities, and mining it has become one of the ten most challenging problems in data mining. It is therefore important to exploit time series data and discover useful knowledge from it.

A time series is a sequence of data points consisting of successive measurements made over a time interval. Such data arises in many areas, including telecommunications, the stock market, network intrusion detection, biomedicine, and e-commerce. Time series data has some specific characteristics: it is large in volume, high-dimensional, updated over time, and usually continuous. As a result, traditional data mining algorithms rarely give good results when applied to it directly. To address these problems, this thesis studies time series data, proposes the MBFS (Manifold-learning Based Feature Selection) algorithm and the DWSVM (Double Weighted Support Vector Machine) algorithm, and applies them to driving fatigue prediction. The main work of this thesis is as follows.

First, to deal with the complicated space and high dimensionality of time series data, we propose the MBFS algorithm. It combines the advantages of metric learning, manifold learning, and sparse coefficient vector learning. Features are ranked by their contribution in the sample data, and those with a high contribution are selected. The ITML (Information-Theoretic Metric Learning) method maps the data to a new Euclidean distance space. Manifold learning finds a low-dimensional manifold in the high-dimensional space, which helps reveal the inherent structure of the data and reduces the dimensionality.
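As a rough illustration of how a manifold-style criterion can rank features, the sketch below computes the Laplacian Score, a standard graph-based feature selection measure. It is a stand-in and not the MBFS algorithm itself: the ITML metric-learning step and the sparse coefficient vectors are omitted, and the function names, the neighbour count `k`, and the heat-kernel width `t` are assumptions made for this example. Under this criterion a lower score means the feature better preserves the local neighbourhood structure of the data.

```python
import math


def knn_graph(X, k=3, t=1.0):
    """Symmetric kNN affinity matrix with heat-kernel weights exp(-d^2 / t)."""
    n = len(X)
    d2 = [[sum((a - b) ** 2 for a, b in zip(X[i], X[j])) for j in range(n)]
          for i in range(n)]
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # The k nearest neighbours of sample i, excluding i itself.
        nbrs = sorted((j for j in range(n) if j != i),
                      key=lambda j: d2[i][j])[:k]
        for j in nbrs:
            w = math.exp(-d2[i][j] / t)
            S[i][j] = S[j][i] = w
    return S


def laplacian_scores(X, k=3, t=1.0):
    """Laplacian Score of every feature (column) of X; lower is better."""
    n, m = len(X), len(X[0])
    S = knn_graph(X, k, t)
    deg = [sum(row) for row in S]          # diagonal of the degree matrix D
    total = sum(deg)
    scores = []
    for r in range(m):
        f = [X[i][r] for i in range(n)]
        # Remove the degree-weighted mean of the feature.
        mean = sum(fi * di for fi, di in zip(f, deg)) / total
        g = [fi - mean for fi in f]
        # g^T L g = 1/2 * sum_ij S_ij (g_i - g_j)^2, with L = D - S.
        num = 0.5 * sum(S[i][j] * (g[i] - g[j]) ** 2
                        for i in range(n) for j in range(n))
        den = sum(gi * gi * di for gi, di in zip(g, deg))   # g^T D g
        scores.append(num / den if den > 0 else float("inf"))
    return scores


def select_features(X, n_keep, k=3, t=1.0):
    """Indices of the n_keep features with the lowest Laplacian Score."""
    scores = laplacian_scores(X, k, t)
    return sorted(range(len(scores)), key=lambda r: scores[r])[:n_keep]


# Hypothetical toy data: feature 0 separates two groups, feature 1 is noise.
X_demo = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1],
          [5.0, 0.8], [5.1, 0.2], [5.2, 0.7]]
print(select_features(X_demo, n_keep=1, k=2))
```

On the toy data the feature that varies smoothly within each group of neighbours scores lower than the noisy one, which is the sense in which a graph built from the data can indicate which features carry its structure.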
Compared with traditional feature selection algorithms, experiments show that MBFS greatly reduces the difficulty of classification and improves prediction accuracy.

Second, to overcome the difficulty of classifying unbalanced data sets, we propose the DWSVM (Double Weighted Support Vector Machine) model, which weights both the samples and the sample features. Sample weights are based on each sample's contribution to classification: samples from the minority class and samples from the majority class are assigned different weights. The MBFS algorithm is used to calculate the weight of each feature, and these weights are used to reconstruct the kernel function. The experimental results show that the double weighted support vector machine performs much better on unbalanced data sets.

Finally, this thesis applies these methods to driving fatigue prediction. The main tasks of this project are building the experimental platform, data collection, data pre-processing, data segmentation, feature representation, feature selection, model building, and model validation. The experimental results show that this data mining system achieves relatively high accuracy in predicting driving fatigue and meets the needs of practical applications.
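The two kinds of weighting described for DWSVM can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's actual formulation: the sample weights use a common inverse-class-frequency heuristic, the feature weights (which the abstract says come from MBFS) are simply passed in as a list, and the kernel is assumed to be an RBF kernel on a feature-weighted squared distance. All function names and the `gamma` parameter are hypothetical.

```python
import math
from collections import Counter


def class_balanced_weights(labels):
    """Per-sample weight n / (n_classes * class_count): rare classes weigh more."""
    counts = Counter(labels)
    n, c = len(labels), len(counts)
    return [n / (c * counts[y]) for y in labels]


def feature_weighted_rbf(x, z, feature_weights, gamma=1.0):
    """RBF kernel exp(-gamma * sum_r w_r * (x_r - z_r)^2).

    With non-negative weights this equals an ordinary RBF kernel on
    features rescaled by sqrt(w_r), so it remains a valid kernel.
    """
    d2 = sum(w * (a - b) ** 2 for w, a, b in zip(feature_weights, x, z))
    return math.exp(-gamma * d2)


def gram_matrix(X, feature_weights, gamma=1.0):
    """Kernel matrix K[i][j] = k(X[i], X[j]) for a list of samples X."""
    return [[feature_weighted_rbf(x, z, feature_weights, gamma) for z in X]
            for x in X]


# Hypothetical example: three majority-class samples, one minority-class sample.
labels_demo = [0, 0, 0, 1]
X_demo = [[0.0, 1.0], [0.1, 4.0], [0.2, 2.0], [3.0, 3.0]]
w_features = [1.0, 0.0]   # assumed: feature 0 informative, feature 1 ignored

print(class_balanced_weights(labels_demo))
print(gram_matrix(X_demo, w_features)[0])
```

With scikit-learn, a Gram matrix of this kind could be passed to `SVC(kernel="precomputed")` and the per-sample weights to the `sample_weight` argument of `fit`, which would be one way to combine both weightings in a single classifier.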

Keywords/Search Tags:

time series, feature selection, manifold learning, metric learning, double weighted support vector machine, fatigue driving

Related items

1	The Study Of Classification Methods And Its Applications In Web Mining Based On Statistical Learning
2	Study On Some Support Vector Machine Algorithms And Their Applications
3	Research On Kinship Authentication Algorithm Based On Feature Extraction And Metric Learning
4	Research Of Driver's Fatigue Driving Detection Algorithm
5	Researches On Support Vector Machine Learning Approaches Based On Ensemble Learning
6	Terrain Recognition Based On Sparse Description Multi-plane Support Vector Machine In Unstructured Environment
7	Research On Metric Learning Based Support Vector Machine Algorithm And Its Applications
8	Research On Time Series Prediction Based On SVM
9	A Research On Dimensionality Reduction And Classification Of Hyperspectral Image Based On Support Vector Machine
10	Research On Some Problems Of Support Vector Machine Learning Algorithm