
High Dimensional Multispectral Data Classification By Machine Learning

Posted on: 2003-04-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J T Xia
Full Text: PDF
GTID: 1118360092466155
Subject: Circuits and Systems

Abstract/Summary:
The theories and methods for high-dimensional multispectral data classification with limited training samples are studied; this work forms part of the research content of the National 863 Hi-Tech Program, the 973 Project, and the Ministry of Education PhD Fund. With the development of sensor technology, multispectral sensors can now collect data in as many as several hundred spectral bands at once. High-dimensional multispectral data, characterized by high spectral resolution, high spatial resolution, and large dynamic range, provide rich information about the earth's surface. However, because the number of training samples is limited and the data dimension is high, the performance of traditional pattern classification algorithms deteriorates. This thesis addresses several issues concerning machine learning and the classification of high-dimensional multispectral data with limited training samples, based on statistical learning theory (SLT), support vector machines (SVM), and artificial neural networks (ANN). The main work and results are outlined as follows:

1. The characteristics of high-dimensional multispectral data are studied, and the difficulties that degrade the performance of traditional pattern classification algorithms are carefully analyzed. By applying statistical learning theory and support vector machines to high-dimensional multispectral data classification, the Hughes phenomenon is mitigated and higher classification accuracy is obtained. The relations between SVM performance and the kernel function, the support vectors, the training set, the data dimension, and so on are studied.

2. A fast SVM training algorithm based on boundary sample selection (BSS-SVM) is proposed. Instead of using the full training set, this algorithm selects boundary samples from the training samples to train the SVM. The scale of the training set is thus greatly reduced, and the training speed of the SVM is improved enormously.
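The boundary-sample idea can be sketched with a simple distance-based heuristic: keep only the training samples closest to the opposite class, then train the SVM on that subset. This is a minimal illustration, not the thesis's actual BSS or FCMBSS algorithm; the function name, the `keep_ratio` parameter, and the nearest-opposite-class score are all illustrative assumptions.

```python
import numpy as np

def select_boundary_samples(X, y, keep_ratio=0.3):
    """Hypothetical stand-in for boundary sample selection: score each
    sample by its distance to the nearest sample of a different class,
    and keep the fraction with the smallest distances (near the boundary)."""
    n = len(X)
    scores = np.empty(n)
    for i in range(n):
        other = X[y != y[i]]                      # samples of the other classes
        scores[i] = np.min(np.linalg.norm(other - X[i], axis=1))
    k = max(1, int(keep_ratio * n))
    return np.argsort(scores)[:k]                 # smallest distance = near boundary

# toy 1-D example: two classes meeting near x = 0
X = np.array([[-3.0], [-2.0], [-0.1], [0.1], [2.0], [3.0]])
y = np.array([0, 0, 0, 1, 1, 1])
idx = select_boundary_samples(X, y, keep_ratio=0.34)
print(sorted(idx.tolist()))  # → [2, 3], the two points nearest the boundary
```

An SVM would then be trained on `X[idx], y[idx]` only; since the decision boundary depends only on the support vectors, discarding interior samples costs little accuracy.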
Because the decision boundary of an SVM is determined only by the support vectors, classification accuracy is almost preserved when the other samples are omitted. A new boundary sample selection algorithm based on fuzzy clustering (FCMBSS) is also proposed to accelerate the boundary sample selection.

3. SVM is designed for binary classification and cannot solve multiclass problems directly. A new SVM framework for multiclass problems (ECC-SVM) is proposed, which uses error-correcting codes to reduce the multiclass problem to multiple binary problems. Each class is assigned a binary codeword, and a set of SVMs is then used to solve the binary problems. The generalization performance of ECC-SVM is analyzed; it is determined by the code length, the Hamming distance, the coding sequence, and the margins of the SVMs. The 1-v-R SVM, which is widely used for multiclass problems, is equivalent to ECC-SVM with a particular set of codes, so the generalization performance of 1-v-R SVM is also analyzed.

4. Although Double Parallel Feedforward Neural Networks (DPFNN) have been successfully used to classify multispectral images, their generalization performance had not previously been studied. This thesis analyzes, in theory, the relationship between the generalization performance of DPFNN and the weight values. The result shows that the weights of the output-layer neurons control the generalization performance of DPFNN. Based on this result, a new approach for improving the generalization performance of DPFNN is proposed, which regularizes the output-layer weights during the learning process. The new algorithm can also be used to train other multi-layer feedforward neural networks, greatly improving their generalization ability.

5. A new feature extraction algorithm in kernel space is proposed, which uses the Bhattacharyya distance as its criterion function. The data are first nonlinearly mapped into a high-dimensional kernel space.
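The error-correcting-code decoding at the heart of ECC-SVM can be shown in miniature: each class gets a binary codeword, the binary classifiers produce a bit string, and the predicted class is the codeword at minimum Hamming distance. The 4-bit codebook below is an arbitrary illustration, not the codes analyzed in the thesis.

```python
import numpy as np

# Hypothetical 3-class codebook: one row (codeword) per class,
# one column per binary SVM.
CODEBOOK = np.array([[0, 0, 0, 0],
                     [1, 1, 1, 0],
                     [0, 1, 0, 1]])

def ecc_decode(bits):
    """Predict the class whose codeword has the smallest Hamming
    distance to the concatenated binary-classifier outputs `bits`."""
    dists = np.sum(CODEBOOK != np.asarray(bits), axis=1)
    return int(np.argmin(dists))

# A single bit error in the outputs is still decoded correctly:
print(ecc_decode([1, 1, 1, 1]))  # → 1 (distance 1 to class 1's codeword)
```

This is also why, as the thesis notes, 1-v-R SVM is a special case of ECC-SVM: its codebook is simply the identity matrix (one "on" bit per class), which has small Hamming distance between codewords and hence little error-correcting margin.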
Then a set of feature vectors is found such that the Bhattacharyya distance between the classes, after projection onto these feature vectors in a lower-dimensional feature space, is maximized. Thus...
Keywords/Search Tags: Remote Sensing, Multispectral, Machine Learning, Pattern Recognition, Generalization Performance, Statistical Learning Theory, Support Vector Machine, Double Parallel Feedforward Neural Networks, Feature Extraction