
Research On Knowledge Discovery Based On Rough Sets

Posted on: 2005-12-07, Degree: Master, Type: Thesis
Country: China, Candidate: B Tang, Full Text: PDF
GTID: 2168360122492618, Subject: Computer application technology
Abstract/Summary:
With the development of database technology and the arrival of the information era, large amounts of data are accumulating in many industries. Database volumes are growing rapidly, especially in aviation, meteorology, medicine, and agriculture. Analyzing these data to obtain knowledge is increasingly urgent, yet efficient intelligent methods are still lacking. Meanwhile, expert systems, a branch of artificial intelligence, have developed considerably in both research and application, but they also face a bottleneck in acquiring knowledge. A new research field, knowledge discovery, has therefore emerged. Because the information that contains knowledge is usually stored in databases, knowledge discovery in databases (KDD) has become the focus of knowledge discovery.

Rough set theory is a mathematical tool introduced by Pawlak for dealing with uncertain and vague information, and many problems in knowledge discovery can be resolved on its basis. The paper systematically introduces the definition, process, technologies and methods, functions, and open problems of knowledge discovery, as well as the ideas behind rough sets, their research and development, their basic concepts, and their application in knowledge discovery. It then studies the application of rough sets in knowledge discovery. The main results are as follows.

1. Experiments in the paper show that the attribute reduction algorithm introduced in reference [3] of chapter three does not perform satisfactorily. The paper first discusses the reason, then introduces another improvement to Jelonek's attribute reduction algorithm, based on sorting and on auxiliary space that stores the information obtained from attributes. This improvement reduces the time complexity of Jelonek's algorithm further than [3] does while producing the same result. Because the limitations of real computer systems are taken into account, slightly different strategies are adopted for decision tables of different sizes, and the time complexity of the algorithm differs accordingly. Given the wide application of rough sets, the efficient attribute reduction algorithm introduced in the paper has significant practical value.

2. The paper discusses the soundness of heuristic attribute reduction algorithms from two aspects: the two causes of redundant attributes in their results, and the fact that their results may not be minimal. For three representative heuristic algorithms, examples of both aspects are constructed using structured methods. A new attribute reduction algorithm based on the discernibility matrix is introduced. It needs no separate step to remove redundant attributes, avoids the influence of unimportant attributes and the interference between attributes, and is tested through experiments.

3. The paper introduces a new kind of inconsistency in ordered decision tables that is not mentioned in the current literature. It not only refines the treatment of ordered decision tables but also enriches the concept of inconsistency in rough sets. Sorting and classifying are two basic kinds of knowledge, and mining ordering rules is a novel idea. The algorithms for mining ordering rules are discussed in depth and improved in this paper.

4. Based on rough sets, we developed a knowledge discovery system as part of an IDSS used in a platform for electronic government affairs. The system has been appraised by the Science and Technology Bureau of Hefei, has been put into use to promote intelligent electronic government, and performs satisfactorily. The development of the knowledge discovery system is also part of a project of the National Natural Science Foundation (60273043).
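
For readers unfamiliar with attribute reduction on a discernibility matrix, the following is a minimal illustrative sketch in Python. It is not the algorithm proposed in the thesis, nor Jelonek's algorithm, neither of which is specified in this abstract. It shows a generic greedy heuristic followed by a redundancy-removal pass, which is the kind of extra step the abstract says the thesis's own algorithm does not need. The toy table `TABLE`, the attribute names, and both function names are assumptions made for illustration only.

```python
from itertools import combinations

# Hypothetical toy decision table: (condition attribute values, decision).
ATTRS = ["a", "b", "c"]
TABLE = [
    ({"a": 0, "b": 0, "c": 0}, "no"),
    ({"a": 0, "b": 1, "c": 0}, "yes"),
    ({"a": 1, "b": 0, "c": 1}, "yes"),
    ({"a": 1, "b": 1, "c": 1}, "yes"),
]


def discernibility_entries(table, attrs):
    """Return the non-empty entries of the decision-relative
    discernibility matrix: for each pair of objects with different
    decisions, the set of condition attributes on which they differ."""
    entries = []
    for (x, dx), (y, dy) in combinations(table, 2):
        if dx != dy:
            diff = frozenset(a for a in attrs if x[a] != y[a])
            if diff:  # an empty set would mean an inconsistent pair
                entries.append(diff)
    return entries


def greedy_reduct(table, attrs):
    """Greedy heuristic: repeatedly pick the attribute that occurs in
    the most uncovered entries, then drop attributes that turn out to
    be redundant. The result covers every entry, but a greedy choice
    is not guaranteed to give a reduct of minimum size."""
    entries = discernibility_entries(table, attrs)
    chosen = []
    remaining = list(entries)
    while remaining:
        best = max(attrs, key=lambda a: sum(a in e for e in remaining))
        chosen.append(best)
        remaining = [e for e in remaining if best not in e]
    # Redundancy-removal pass: drop an attribute if the others
    # still intersect every matrix entry.
    for a in list(chosen):
        rest = [b for b in chosen if b != a]
        if all(any(b in e for b in rest) for e in entries):
            chosen = rest
    return chosen


if __name__ == "__main__":
    print(discernibility_entries(TABLE, ATTRS))
    print(greedy_reduct(TABLE, ATTRS))
```

On this toy table the matrix has three non-empty entries, and the selected attribute set intersects each of them, so every pair of objects with different decisions remains discernible.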
Keywords/Search Tags:rough sets, discernibility matrix, proximity quality, attribute reduction, ordered decision tables, ordering rules, dominance relation, overfitting, inconsistency