Research On Technologies Of Data Cleaning And Support Vector Machine

Posted on: 2016-08-17
Degree: Master
Type: Thesis
Country: China
Candidate: Y Han
Full Text: PDF
GTID: 2308330479999161
Subject: Communication and Information System
Abstract/Summary:
Data warehouses are loaded and refreshed frequently from various data sources, which introduces many data quality problems such as missing values, so data cleaning is necessary. Support Vector Machine (SVM) has achieved good results in classification, but its classification accuracy is influenced by the penalty parameter and the kernel parameters. To improve both null-value cleaning and the classification accuracy of SVM, this thesis studies a method for filling missing data based on Compressed Sensing and a Bayesian Support Vector Machine based on semi-definite programming, and applies them to oil well logging. The main work and innovations are as follows.

(1) Research on data cleaning based on Compressed Sensing. Collected data are often affected by noise and missing values, and these data quality problems affect the decisions made from the data. To improve null-value cleaning, the Orthogonal Matching Pursuit (OMP) algorithm from Compressed Sensing is applied to data cleaning: the missing part of the data is reconstructed by exploiting the sparsity of the original data. The experimental results show that null-value cleaning based on Compressed Sensing is remarkably effective.

(2) Analysis of Support Vector Machines based on Bayesian rules. The classification accuracy of SVM depends on the penalty parameter and the kernel parameters, so proper values must be selected to obtain better classification results. Within the Bayesian framework, the thesis analyzes how the penalty and kernel parameters of SVM can be inferred by Bayesian rules. The simulation results show that this Bayesian SVM achieves reasonable classification performance.

(3) Research on Bayesian Support Vector Machine based on semi-definite programming (SDP-BSVM). To further improve SVM classification accuracy, a multi-kernel SVM model is built in which several kernel functions are combined linearly. The parameters of each single-kernel SVM are first inferred according to Bayesian theory, and semi-definite programming is then used to solve for the optimal combination coefficients of the multi-kernel function. The simulation results verify the effectiveness and superiority of the method.

(4) Application research on oil and gas layer recognition. Oil and gas layer recognition is an important part of well logging. To solve recognition problems in complicated wells, an oil and gas layer recognition model based on multi-kernel SVM is designed, in which SDP-BSVM is used to recognize the oil and gas layers of two typical oil wells in China. The experimental results show that it performs remarkably well in this application and that the method has broad application prospects.
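
The missing-value filling described in item (1) can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not name the sparsifying basis, so a DCT basis is assumed here, missing entries are assumed to be marked with NaN, and the names `dct_basis`, `omp`, and `fill_missing` are hypothetical. OMP is run only on the observed samples to estimate a few basis coefficients, and the full signal is then rebuilt from those coefficients.

```python
import numpy as np

def dct_basis(n):
    """Return an n x n orthonormal basis whose columns are DCT-II atoms (assumed basis)."""
    k = np.arange(n)
    # Entry (t, k) is cos(pi * (2t + 1) * k / (2n)), scaled so columns are orthonormal.
    basis = np.cos(np.pi * (2 * k[:, None] + 1) * k[None, :] / (2 * n))
    basis[:, 0] *= np.sqrt(1.0 / n)
    basis[:, 1:] *= np.sqrt(2.0 / n)
    return basis

def omp(A, y, n_nonzero, tol=1e-10):
    """Orthogonal Matching Pursuit: greedily find a sparse coef with A @ coef ~= y."""
    residual = y.copy()
    support = []
    sol = np.zeros(0)
    norms = np.linalg.norm(A, axis=0)
    for _ in range(n_nonzero):
        # Pick the column most correlated with the current residual.
        corr = np.abs(A.T @ residual) / norms
        if support:
            corr[support] = 0.0  # never select the same column twice
        support.append(int(np.argmax(corr)))
        # Re-fit all selected coefficients by least squares (the "orthogonal" step).
        sol, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ sol
        if np.linalg.norm(residual) <= tol:
            break
    coef = np.zeros(A.shape[1])
    if support:
        coef[support] = sol
    return coef

def fill_missing(x, n_nonzero):
    """Fill NaN entries of a 1-D signal assumed to be sparse in the DCT basis."""
    x = np.asarray(x, dtype=float)
    observed = ~np.isnan(x)
    basis = dct_basis(x.size)
    # Only the rows of the basis that correspond to observed samples are usable.
    coef = omp(basis[observed], x[observed], n_nonzero)
    filled = x.copy()
    filled[~observed] = (basis @ coef)[~observed]
    return filled

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 128
    true_coef = np.zeros(n)
    true_coef[[3, 17, 40]] = [2.0, -1.5, 1.0]
    signal = dct_basis(n) @ true_coef
    damaged = signal.copy()
    damaged[rng.choice(n, size=8, replace=False)] = np.nan
    restored = fill_missing(damaged, n_nonzero=3)
    print("max reconstruction error:", np.max(np.abs(restored - signal)))
```

The sparsity level `n_nonzero` must be chosen by the user in this sketch. Real logging data are only approximately sparse, so reconstruction quality would depend on the basis and on how many values are missing.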
Keywords/Search Tags: Data Cleaning, Compressed Sensing, Multi-kernel Support Vector Machine, Bayesian Theory, Semi-Definite Programming