
Research And Application Of Data Mining Algorithms Based On Data Preprocessing And Regression Analysis Techniques

Posted on: 2015-11-20
Degree: Master
Type: Thesis
Country: China
Candidate: X X Li
GTID: 2298330434960905
Subject: Detection Technology and Automation
Abstract/Summary:
With the rapid development of technology, data volumes are growing quickly. This growth has drawn increasing interest from researchers in China and abroad and has triggered a wave of big data research, and new data mining techniques have recently become an active research topic. Data mining is the process of extracting valuable knowledge or patterns of interest from large volumes of data. Against the background of the project "Several Research Questions on Bridge Health Monitoring of High-Speed Passenger Railway Lines in the Northwest Loess Area", this thesis studies techniques and algorithms for mining massive data and applies them to forecasting in bridge health monitoring.

Building on these techniques and methods, the thesis focuses on time-series data mining algorithms for practical application. Big data is typically incomplete, noisy, contains null values, is stored across different sites, and is large in volume. It must therefore be preprocessed before mining and then handled with an effective method or algorithm; only then can the results be considered credible.

The main contents of the thesis are as follows:
(1) We describe the basic theory and process of data mining, together with commonly used methods such as cluster analysis, association rules, classification, and regression analysis. Regression analysis is often used in data mining to predict time-series values, and the BP (back-propagation) algorithm is well suited to prediction, so we analyze the methods and principles of the BP algorithm and some of its commonly used improved variants.
(2) We introduce neighborhood rough set theory into the data preprocessing phase. We first analyze the principle of neighborhood rough set attribute reduction, then study its properties on UCI data sets, then compare the traditional Pearson method with the neighborhood rough set attribute reduction algorithm, and finally focus on applying the latter.
(3) To improve algorithm performance in data mining, we use software simulation. First, we analyze in depth the performance of several commonly used improved BP neural network algorithms. Second, we select the two best-performing algorithms and, based on them, propose four GA-BP algorithm models. Third, we determine the more effective GA-BP model. Finally, we tune the parameters of that model by fixing two variables and varying the other, with the parameter ranges taken from the relevant literature.
(4) We apply the data preprocessing methods (data integration, attribute reduction, noise reduction, and normalization) and the GA-BP algorithm proposed above to predicting the cable forces of a cable-stayed bridge, and the experiments show that the GA-BP algorithm is effective.
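The attribute reduction step in item (2) can be illustrated with a minimal sketch. The abstract does not give the thesis's formulation, so everything here is an assumption: the function names (`min_max_normalize`, `dependency`, `forward_reduct`) are hypothetical, the decision attribute is taken to be a discrete class label (as in typical UCI classification sets), distances are Euclidean on min-max normalized attributes, and the neighborhood radius `delta` and stopping threshold `eps` are arbitrary illustrative values. The sketch follows the commonly described forward greedy scheme: a sample belongs to the positive region if every sample in its neighborhood shares its label, and attributes are added while the dependency (the positive-region fraction) keeps increasing.

```python
import numpy as np


def min_max_normalize(X):
    """Scale each column to [0, 1]; constant columns are mapped to 0."""
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span = np.where(span == 0, 1.0, span)  # avoid division by zero
    return (X - lo) / span


def dependency(X, y, attrs, delta):
    """Fraction of samples whose delta-neighborhood on `attrs` has a single label."""
    if not attrs:
        return 0.0  # convention: no attributes, no discriminating power
    sub = X[:, attrs]
    # Pairwise Euclidean distances restricted to the chosen attributes.
    diff = sub[:, None, :] - sub[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    neighbors = dist <= delta
    same_label = y[:, None] == y[None, :]
    # A sample is consistent if every neighbor carries the same label.
    consistent = np.all(~neighbors | same_label, axis=1)
    return consistent.mean()


def forward_reduct(X, y, delta=0.15, eps=1e-6):
    """Greedily add the attribute that raises the dependency the most."""
    remaining = list(range(X.shape[1]))
    selected = []
    best = 0.0
    while remaining:
        scores = [dependency(X, y, selected + [a], delta) for a in remaining]
        k = int(np.argmax(scores))
        if scores[k] - best <= eps:
            break  # no remaining attribute improves the dependency
        best = scores[k]
        selected.append(remaining.pop(k))
    return selected, best


if __name__ == "__main__":
    # Synthetic data: column 0 separates the classes, column 1 is noise.
    rng = np.random.default_rng(0)
    y = np.array([0] * 20 + [1] * 20)
    x0 = np.where(y == 0, rng.uniform(0.0, 0.3, 40), rng.uniform(0.7, 1.0, 40))
    x1 = rng.uniform(0.0, 1.0, 40)
    X = min_max_normalize(np.column_stack([x0, x1]))
    print(forward_reduct(X, y))
```

The pairwise distance matrix makes this quadratic in the number of samples, which is acceptable for small benchmark sets but would need a neighbor index or batching for monitoring data at scale.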
Keywords/Search Tags:data mining, data preprocessing, regression analysis, bridge monitoring, forecasting