
Statistical Methods In Data Mining And Its Application

Posted on: 2015-10-07
Degree: Master
Type: Thesis
Country: China
Candidate: G F Cui
Full Text: PDF
GTID: 2208330434455723
Subject: Applied Mathematics
Abstract/Summary:
Among the methods and techniques used in data mining, statistical methods are the most fundamental and significant, and they have given rise to many newer data mining methods. Research on statistical methods and their applications can therefore offer useful guidance to data analysts, promote the development of the technology, and provide a theoretical basis for creating social wealth by exploiting the features of data that statistical methods reveal.

Building on a study of current data mining software and methods, this thesis focuses on the statistical approaches and models involved in data mining and on their applications. The learning mechanisms of machine learning and statistical learning are also analyzed. After analyzing the learning method of covering classification and the drawbacks of the probability-based covering algorithm when dealing with massive data, a modified covering algorithm based on Bayesian theory is proposed: the category of a sample in the covering boundary, previously determined by voting, is instead determined by the posterior probability obtained from Bayes' formula, which improves the reliability and stability of classification. The work covers four aspects:

1. Data mining methods and the related software are compared and analyzed, which can give users advice and direction when choosing among the alternatives.
2. Statistical theory, including statistical methods and statistical models, is analyzed according to the process and the tasks of data mining, which can serve as a reference when mining data with statistical characteristics.
3. Starting from the learning mechanisms of new data mining methods, the methods based on machine learning and on statistical learning are analyzed in depth, which helps in designing new methods for mining data with statistical characteristics.
4. To address the difficulty the Support Vector Machine (SVM) method has with classification in large databases, and the misclassification of samples by the covering algorithm and the probability-based covering algorithm, a new learning classifier is put forward. It combines the covering algorithm based on Bayesian theory with the probability-based covering algorithm and is aimed at classifying the boundary samples among the test samples. The new learning machine has a two-layer structure: the first layer is a classification model based on the probability-based covering algorithm, and the second is the covering algorithm based on Bayesian theory. This covering classification learning machine can classify very large amounts of data efficiently.

The innovation and distinguishing feature of this thesis lie in the fourth aspect.
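
The two-layer idea can be illustrated with a small Python sketch. It is not the thesis's algorithm, whose details the abstract does not give. It rests on several assumptions made only for illustration: each class is covered by a single sphere centred on its mean (real covering algorithms build many covers), a test sample counts as a boundary sample when it falls inside no cover or inside covers of more than one class, and the second layer scores classes with Bayes' formula using a Gaussian likelihood with independent features. The names `fit`, `log_posterior`, and `predict` are hypothetical.

```python
import math
from collections import defaultdict


def fit(samples, labels):
    """Build one spherical cover per class plus per-class Gaussian statistics.

    Assumption: a single sphere per class, centred on the class mean, with
    radius equal to the distance to the farthest training sample of that class.
    """
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    model = {}
    total = len(samples)
    for y, pts in by_class.items():
        dim = len(pts[0])
        mean = [sum(p[d] for p in pts) / len(pts) for d in range(dim)]
        # Small constant keeps the variance strictly positive.
        var = [sum((p[d] - mean[d]) ** 2 for p in pts) / len(pts) + 1e-6
               for d in range(dim)]
        radius = max(math.dist(p, mean) for p in pts)
        model[y] = {"mean": mean, "var": var, "radius": radius,
                    "prior": len(pts) / total}
    return model


def log_posterior(x, stats):
    """Unnormalised log posterior: log prior + Gaussian log likelihood."""
    score = math.log(stats["prior"])
    for xi, m, v in zip(x, stats["mean"], stats["var"]):
        score += -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
    return score


def predict(model, x):
    """Return (label, layer) for one test sample."""
    # Layer 1: accept the label if exactly one class cover contains x.
    covering = [y for y, s in model.items()
                if math.dist(x, s["mean"]) <= s["radius"]]
    if len(covering) == 1:
        return covering[0], "layer1"
    # Layer 2: boundary sample, choose the class with the largest posterior.
    best = max(model, key=lambda y: log_posterior(x, model[y]))
    return best, "layer2"
```

Calling `fit` on labelled training tuples and then `predict(model, x)` on a test tuple returns the predicted label together with the layer that produced it, which makes it easy to see how many samples were treated as boundary samples.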
Keywords/Search Tags:Data Mining, Statistical Methods, Machine Learning, Covering Classification, Bayesian Neural Network