Research And Development Of Data Mining-Based Flood Forecast System

Posted on: 2007-06-11
Degree: Master
Type: Thesis
Country: China
Candidate: P Wang
GTID: 2178360212458603
Subject: Computer software and theory
Abstract/Summary:
Data mining is a technique for analyzing large volumes of source data and discovering the regularities hidden in them; it is regarded as one of the most popular applications of information technology. In China, water information systems are built mainly on OLAP and cannot make full use of hydrological data. Introducing data mining into this area is therefore of significant practical importance for flood control and decision-making. Traditional flood forecasting methods analyze and calculate according to existing knowledge of each individual river, so they are highly river-specific. This thesis proposes a forecast system based on J2EE that combines the ideas of data mining with several improved data mining methods supported by the SAS system. The system provides a lightweight way to predict floods: it uses data on the main factors that give rise to a flood and does not need to account for the particular conditions of each river, which makes up for the river-specific nature of the traditional methods. The thesis first analyzes the hydrological data and discusses data preprocessing techniques for hydrological time series. Filling in missing values, smoothing noisy data, and removing inconsistent data were all applied to obtain high-quality data. Depending on how much erroneous data is present, we built two flood forecasting methods: Regression analysis and Principal Component-Regression analysis.
In Principal Component-Regression analysis, we further introduced dimension reduction and the contribution of each principal component into Regression analysis, which reduces the adverse effect of erroneous data. Finally, we evaluated the system on real data from the Yandu River and verified the policy for choosing the predictive model: when erroneous records make up more than 10 percent of the total data, the Principal Component-Regression analysis model should be chosen to reduce the influence of the erroneous data; otherwise, the Regression analysis model should be chosen to obtain more accurate results. The system is stable and general, and may serve as a useful model in this domain.
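
The preprocessing steps and the model-selection policy described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation, which is built on J2EE and SAS. The function names (`fill_missing`, `smooth`, `choose_model`), the use of linear interpolation for missing values, and the moving-average window are all assumptions made for illustration; only the 10 percent threshold comes from the abstract.

```python
from typing import List, Optional


def fill_missing(series: List[Optional[float]]) -> List[float]:
    """Fill None gaps by linear interpolation between the nearest observed
    neighbours; gaps at either end copy the nearest observed value.
    (Interpolation is an assumed choice, not taken from the thesis.)"""
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


def smooth(series: List[float], window: int = 3) -> List[float]:
    """Centered moving average; the window shrinks at the series edges."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        chunk = series[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def choose_model(n_bad: int, n_total: int, threshold: float = 0.10) -> str:
    """Apply the selection policy from the abstract: use principal
    component regression when erroneous records exceed the threshold
    share of all records, otherwise plain regression."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return "pcr" if n_bad / n_total > threshold else "regression"
```

For example, a hypothetical water-level series with one missing reading could be cleaned with `smooth(fill_missing(levels))`, and `choose_model(n_bad, n_total)` would then indicate which of the two forecasting models the policy favours.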
Keywords/Search Tags: Data Mining, Flood Forecasting, SAS, Time Series, Regression Analysis, Principal Component-Regression Analysis