
Research On Dynamic Data Mining Methods And Techniques Based On Rough Set Theory

Posted on: 2017-01-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y S Wang
Full Text: PDF
GTID: 1108330485950024
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet and information technology, the capacity to collect and store data has improved greatly. Large amounts of data have accumulated in both scientific research and everyday life, and analyzing these data efficiently to obtain potentially useful knowledge has become a common need across applications. Rough set theory is regarded as a new mathematical tool for dealing with fuzzy, vague, and uncertain data. Its advantage is that it requires no prior knowledge or additional information to analyze the data effectively, find hidden knowledge, and reveal underlying patterns. At present, it is widely applied in fields such as data mining, machine learning, pattern recognition, and knowledge discovery.

In rough set theory, attribute reduction and knowledge acquisition are active research topics. The objective of attribute reduction is to remove redundant or irrelevant attributes while keeping the same classification information as the full original set of attributes. The objective of knowledge acquisition is to extract rules or knowledge on the basis of attribute reduction. However, in many application domains data collection is dynamic, inaccurate, and incomplete; once the amount of data grows to some extent, the attribute reducts obtained from the original data no longer suit the current situation and have to be recomputed from scratch. Static algorithms applied to this problem show serious disadvantages, such as high time complexity and an inability to describe local situations in the data.
Therefore, studying models and algorithms for mining dynamic data can promote the development of rough set theory, and has important theoretical significance and application prospects.

In this paper, we focus on rough sets as a tool for attribute reduction and knowledge acquisition, and carry out a comprehensive and systematic study of models and algorithms for attribute reduction and knowledge acquisition in dynamic complete decision tables and dynamic incomplete decision tables. In addition, an ensemble classifier is constructed based on the rough set attribute measurement mechanism. The main research results and innovations are summarized as follows:

1) For dynamic decision tables, a dynamic attribute reduction model based on information granularity is constructed, and an incremental approach for computing information granularity when a new attribute set is added to the decision table is discussed in detail. On this basis, a dynamic attribute reduction algorithm is proposed that uses information granularity as heuristic information. The proposed algorithm reuses the attribute reduction results and the information granularity of the original decision table, which effectively reduces the computational complexity and lets the reduction results be better inherited. Finally, an example analysis and experimental results verify the feasibility and effectiveness of the proposed algorithm.

2) Because an incomplete decision table contains missing data, the classical rough set model is difficult to apply, especially when the data in the incomplete decision table change dynamically.
Therefore, for dynamic incomplete decision tables, an information granularity model based on the tolerance relation is constructed, and the incremental updating method of this model is analyzed for the case where an object set is added to the incomplete decision table. Combined with the information granularity and attribute reduction results of the original incomplete decision table, an incremental attribute reduction method based on information granularity is proposed, which effectively improves the computational efficiency of attribute reduction.

3) For dynamic changes of the objects in a decision table, we study how to effectively obtain knowledge or rules from the dynamic decision table. First, the dynamic updating mechanism of the approximate classification quality based on the positive region is analyzed for the case where a single object is added to or deleted from the incomplete decision table. By calculating the decision confidence of the new equivalence classes with respect to the decision classes, rules are added and deleted dynamically, subject to a threshold requirement. On this basis, an incremental knowledge acquisition method is proposed. Then, so that a dynamic change of multiple objects does not have to be treated as the cumulative change of single objects, a dynamic update mechanism for the approximate classification quality based on the positive region is constructed for the case where large numbers of objects are dynamically added to and deleted from the decision table, and a dynamic knowledge acquisition algorithm for decision tables is designed.

4) Incomplete data often arise because of limitations of data acquisition technology, human error, and other reasons. Considering the addition and deletion of objects in incomplete data, a dynamic knowledge acquisition method is studied based on the model of the approximate classification quality.
First, for objects added to the incomplete data, the dynamic change of the positive region and the updating method of the approximate classification quality are analyzed; then the same analysis is carried out for objects deleted from the incomplete data. On this basis, a dynamic knowledge acquisition method is provided for the case where an object set is added to and deleted from the incomplete data. Finally, the effectiveness of the proposed method is verified by experimental results.

5) From the perspective of attribute measurement based on rough sets, an attribute evaluation method based on a hybrid measurement mechanism is proposed. This method analyzes the importance of attributes at different information granularities and, according to the characteristics of the data distribution, introduces a weighting parameter to regulate attribute importance in the hybrid measurement mechanism. On this basis, an ensemble classifier is constructed based on the rough set attribute measurement mechanism. The experimental results and analysis show that, compared with single attribute measurement criteria, the proposed method reduces the dimensionality of the data effectively and the classifier achieves better classification performance.

In summary, the main work of this paper is to study attribute reduction and knowledge acquisition when objects or attributes change in dynamic data, in order to address the problems that static algorithms cannot describe variation in the data and incur high algorithmic complexity. The main contribution of this paper is to support the analysis and mining of data in real-world environments.
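
To make the positive-region idea in items 3) and 4) concrete, the following is a minimal sketch of the standard Pawlak definition of approximate classification quality, gamma = |POS_C(D)| / |U|, for a complete decision table, together with one simple way to update it when a single object is added. It is not the dissertation's algorithm; the names `partition`, `classification_quality`, and `IncrementalQuality`, and the toy table, are assumptions made for illustration only.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return blocks


def classification_quality(rows, cond, dec):
    """Batch computation: share of objects lying in decision-consistent blocks."""
    pos = 0
    for members in partition(rows, cond).values():
        if len({rows[i][dec] for i in members}) == 1:
            pos += len(members)  # the whole block belongs to the positive region
    return pos / len(rows)


class IncrementalQuality:
    """Hypothetical helper: keeps |POS| up to date as single objects arrive."""

    def __init__(self, cond, dec):
        self.cond, self.dec = cond, dec
        self.blocks = {}  # condition values -> (object count, decisions seen)
        self.pos = 0      # current size of the positive region
        self.n = 0        # current number of objects

    def add(self, row):
        key = tuple(row[a] for a in self.cond)
        count, decisions = self.blocks.get(key, (0, set()))
        was_consistent = len(decisions) == 1
        decisions = decisions | {row[self.dec]}
        if len(decisions) == 1:
            self.pos += 1      # new block, or the block stays consistent
        elif was_consistent:
            self.pos -= count  # the block leaves the positive region
        self.blocks[key] = (count + 1, decisions)
        self.n += 1

    def quality(self):
        return self.pos / self.n


# Toy decision table (assumed data): condition attributes a, b; decision d.
rows = [
    {"a": 1, "b": 0, "d": "yes"},
    {"a": 1, "b": 0, "d": "yes"},
    {"a": 0, "b": 1, "d": "no"},
    {"a": 0, "b": 1, "d": "yes"},
]
inc = IncrementalQuality(["a", "b"], "d")
for r in rows:
    inc.add(r)
print(inc.quality())                                   # 0.5
print(classification_quality(rows, ["a"], "d"))        # 0.5, so b is redundant here
inc.add({"a": 1, "b": 0, "d": "no"})                   # conflicts with an existing block
print(inc.quality())                                   # 0.0
```

In this sketch the incremental update touches only the block that the new object falls into, whereas the batch function rebuilds the whole partition. That is the kind of saving the dynamic methods above aim for, although the dissertation's own update rules, including deletion and the tolerance-relation case for incomplete data, are not reproduced here.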
Keywords/Search Tags: Rough sets, Attribute reduction, Knowledge extraction, Data mining