Supervised and unsupervised machine learning for pattern recognition and time series prediction

Posted on:2009-06-13

Degree:Ph.D

Type:Dissertation

University:The University of Texas at Dallas

Candidate:Bean, Kathryn Brenda

GTID:1448390005452108

Subject:Statistics

Abstract/Summary:

The problem of empirical data modeling arises in many engineering applications, such as classification, prediction, and pattern recognition. In Chapter 1, I introduce Machine Learning and Data Mining approaches from a Computer Science and Statistics perspective.

I have developed a new clustering method, DBBIRCH (Density-Based BIRCH), that combines the features of density- and distance-based clustering algorithms. This method is described in Chapter 2 and is based on (Bean K., 2007). My algorithm is an online algorithm and has a running time asymptotically equal to that of BIRCH under some realistic assumptions. To improve the accuracy of "distance-based" algorithms, robust statistics (trimmed mean) are used. The density-based feature of this algorithm is achieved by combining initial clusters into networks of density-connected clusters. DBBIRCH provides a fast and precise clustering method for mapping data points to their non-spherical clusters. My algorithm is easily modified to perform parallel clustering of large datasets using grid computing. My prototype program used breast cancer (UCI Machine Learning Repository) and synthetic datasets to support my conclusions.

I have also developed a new framework to improve the performance of partition-type algorithms for clustering datasets with missing attributes. Chapter 3 describes this framework, which is based on (Bean K., 2008). I have incorporated CLARA, PAM, and K-means within a framework that remains general enough to allow other clustering algorithms to be used. Initial clustering is performed using a very fast algorithm, BIRCH. This initial pass is used to determine input parameters for a more accurate algorithm and to make the prediction of missing attributes more efficient.

Neural network models are among the most popular approaches to flood prediction. This technique, however, has a drawback: uncertainty about the optimal network structure. I propose an algorithm for neural network pruning to create a Neural Network with Auto- and Cross-Correlation Models (NN-ACC). I believe this approach can determine the best neural network input. A forecasting framework for the presented NN-ACC model is constructed to perform calculations for a real-world case study (Derwent catchment of Upper Derwent). According to (Dunham M., 2004), NN-ACC gives much better results than EMM and RLF.
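
The abstract does not spell out how NN-ACC chooses its inputs, so the following is only a minimal sketch of one plausible reading: keep the time lags whose auto- or cross-correlation with the target series is strong, and feed only those lagged values to the network. The function names, the threshold rule, and the synthetic rainfall and flow series are all assumptions made for illustration, not the dissertation's method or data.

```python
import numpy as np

def lagged_correlation(x, y, lag):
    """Pearson correlation between x[t - lag] and y[t], for lag >= 1."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Drop the last `lag` values of x and the first `lag` values of y
    # so that x[t - lag] lines up with y[t].
    return float(np.corrcoef(x[:-lag], y[lag:])[0, 1])

def select_lags(x, y, max_lag, threshold):
    """Lags in 1..max_lag whose absolute correlation reaches `threshold`.

    The fixed threshold is a hypothetical selection rule chosen for
    this sketch.
    """
    return [lag for lag in range(1, max_lag + 1)
            if abs(lagged_correlation(x, y, lag)) >= threshold]

# Synthetic data (assumed): flow follows rainfall three steps earlier.
rng = np.random.default_rng(0)
rain = rng.normal(size=500)
flow = np.zeros(500)
flow[3:] = rain[:-3] + 0.1 * rng.normal(size=497)

# Cross-correlation: which past rainfall values help predict flow?
cross_lags = select_lags(rain, flow, max_lag=6, threshold=0.5)
# Autocorrelation: which past flow values help predict flow?
auto_lags = select_lags(flow, flow, max_lag=6, threshold=0.5)
```

In this sketch, the lags in `cross_lags` and `auto_lags` would become the candidate network inputs; a real catchment series would likely need a significance test rather than a fixed threshold.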

Keywords/Search Tags:

Machine learning, NN-ACC, Neural network, Algorithm

Related items

1	Research On Intrusion Detection Model And Algorithm Based On Artificial Neural Network
2	Research On Machine Learning Algorithm With Environmental Data Prediction
3	Theory And Simulation Experiment Study On Learning Algorithms Of Neural Network And Support Vector Machine
4	Design And Implementation Of Optimization Method For Network Energy Consumption Based On Machine Learning
5	Theory And Applications Of The Soft Sensing Technology Based On Machine Learning Algorithms
6	Research And Realizing Of Technologies Forecasting Algorithm Based On Patent Data
7	Supervised and unsupervised machine learning for pattern recognition and time series prediction
8	Application Of Machine Learning In Speech Recognition And Image Recognition
9	Research On Regularized Extreme Learning Machine Based On Genetic Algorithm And Convolution Neural Network
10	Classification Algorithm Design Based On Modified Fuzzy Neural Network And Extreme Learning Machine