A Study On Data Classification Based On Neural Network

Posted on: 2008-04-03
Degree: Master
Type: Thesis
Country: China
Candidate: S B Chai
Full Text: PDF
GTID: 2178360242467273
Subject: Software engineering
Abstract/Summary:
Data classification is one of the major tasks of data mining. It consists of two steps. First, a classification algorithm builds a classification model from the training samples. Then the model is used to predict the class labels of tuples. The performance of classification therefore depends on the quality of the classification algorithm. Currently, there are three main kinds of classification algorithms: decision tree induction, Bayesian classification, and feed-forward neural network classification.
A feed-forward neural network uses the error back-propagation algorithm, also called the BP algorithm, to carry out learning and classification. The BP algorithm first propagates the working signal forward by computing the activation value of each neuron, from the hidden layers to the output layer. If the final outputs of the network are not acceptable, the BP algorithm computes the error of each neuron and propagates these error signals from the output layer back to the hidden layers to adjust the weights and thresholds of the network. The main advantage of the BP algorithm is its high classification speed. However, the BP algorithm sometimes becomes trapped in a local minimum, where classification accuracy is markedly reduced. In this thesis, the fundamental reason why the BP algorithm may become trapped in a local minimum is analyzed by combining the coding mechanism of attributes in data tuples with the mathematical theory of self-learning in the BP algorithm. Based on that reason, an improved BP algorithm called LMDBP is proposed. Before adjusting weights and thresholds, the LMDBP algorithm checks whether the conditions for a local minimum are present. If they are, only the training samples that can counteract those conditions are learned to adjust the weights and thresholds.
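The plain BP procedure described above (forward propagation of the working signal, then backward propagation of error signals) can be illustrated with a minimal sketch. This is not the thesis's implementation and it does not include the LMDBP check. The single hidden layer, the sigmoid activation, the squared-error function, the learning rate, the starting weights, and all function names are assumptions chosen for illustration. Thresholds are represented as biases, that is, negated thresholds.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Propagate the working signal: inputs -> hidden layer -> one output neuron."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(w_hidden, b_hidden)]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
    return hidden, output


def backprop_step(x, target, w_hidden, b_hidden, w_out, b_out, lr=0.5):
    """One BP update on a single sample.

    Updates the weight lists in place and returns (new_b_out, error_before_update).
    """
    hidden, output = forward(x, w_hidden, b_hidden, w_out, b_out)

    # Error signal of the output neuron for E = 0.5 * (target - output)^2.
    delta_out = (target - output) * output * (1.0 - output)
    # Error signals propagated back to the hidden neurons (uses the old w_out).
    delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                    for j in range(len(hidden))]

    # Adjust weights and biases along the negative gradient of E.
    for j in range(len(hidden)):
        w_out[j] += lr * delta_out * hidden[j]
        for i in range(len(x)):
            w_hidden[j][i] += lr * delta_hidden[j] * x[i]
        b_hidden[j] += lr * delta_hidden[j]
    b_out += lr * delta_out

    return b_out, 0.5 * (target - output) ** 2


# Hypothetical starting weights and a single training sample.
w_hidden = [[0.1, -0.2], [0.3, 0.4]]
b_hidden = [0.0, 0.0]
w_out = [0.2, -0.1]
b_out = 0.0

errors = []
for _ in range(200):
    b_out, err = backprop_step([1.0, 0.0], 1.0, w_hidden, b_hidden, w_out, b_out)
    errors.append(err)
```

The list `errors` records the value of the error (energy) function before each update, so plotting it is one simple way to watch for the very slow decrease that characterizes a flat region or a local minimum.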
The results of an experiment that uses the LMDBP algorithm to solve the MONKS classification problem indicate that LMDBP can complete the classification task without falling into a local minimum while keeping high classification accuracy. However, another problem remains for both BP and LMDBP. Sometimes either algorithm may enter a flat region of the weight space, where adjusting the weights and thresholds yields only a very slight decrease in the energy function and the network eventually fails to converge. This problem needs to be solved in the future.
A new kind of feed-forward neural network, called the wavelet neural network, has been developed in recent years. A wavelet neural network contains two kinds of information-processing units: wavelons and neurons. Wavelons ensure that the theory of wavelet analysis can be applied through the wavelet transform, and neurons ensure that the self-learning characteristic of neural networks is preserved. Based on the previously proposed learning framework for wavelet neural networks, this thesis gives a detailed description of the learning algorithm for the Gaussian wavelet network. Using the Gaussian wavelet network and its learning algorithm, classification experiments are carried out on two different kinds of samples, one made up of discrete-valued attributes and one made up of continuous-valued attributes. The results indicate that the wavelet network can successfully classify samples made up of continuous-valued attributes, but its ability to classify samples made up of discrete-valued attributes is very limited.
Finally, considering the lack of neural-network-based classifiers for research on data classification, the thesis gives a detailed introduction to the design and development of a data classification research platform, which has been implemented in Java.
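To make the wavelon idea concrete, here is a minimal one-dimensional sketch of the forward pass of a wavelet network. The abstract does not say which Gaussian wavelet the thesis uses, so the first derivative of the Gaussian is an assumption here, as are the translation and dilation parameters, the linear output, and all function names. The learning algorithm itself is not shown.

```python
import math


def gaussian_wavelet(t):
    """Assumed mother wavelet: the first derivative of exp(-t^2 / 2)."""
    return -t * math.exp(-0.5 * t * t)


def wavelon(x, translation, dilation):
    """A wavelon shifts and scales its input, then applies the mother wavelet."""
    return gaussian_wavelet((x - translation) / dilation)


def wavelet_network(x, weights, translations, dilations, bias=0.0):
    """Output of a one-input network: weighted sum of wavelon responses plus a bias."""
    return bias + sum(w * wavelon(x, t, d)
                      for w, t, d in zip(weights, translations, dilations))


# Hypothetical parameters: two wavelons with different translations and dilations.
y = wavelet_network(0.5, weights=[1.0, -0.5],
                    translations=[0.0, 1.0], dilations=[1.0, 2.0])
```

Because each wavelon responds mainly to inputs near its translation, at a scale set by its dilation, this form fits smoothly varying continuous-valued inputs naturally, which is consistent with the experimental result reported above.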
Keywords/Search Tags:Data Classification, Neural Network, BP, LMDBP, Wavelet Neural Network, Data Classification Researching Platform