Similarity Measure And Clustering Based On The Extended Rough Set Models

Posted on: 2018-05-25
Degree: Master
Type: Thesis
Country: China
Candidate: P Zhang
Full Text: PDF
GTID: 2348330569486482
Subject: Computer technology
Abstract/Summary:
With the rapid development of information technology, represented by computer, telecommunication, and network technology, data has become ubiquitous, and mining useful hidden information from massive data is an active research problem. Rough set theory is a mathematical tool for handling uncertainty in data mining. As a relatively new soft computing method, it has attracted growing attention in recent years, has become one of the active topics in artificial intelligence theory and applications, and has been widely used in many fields. Nevertheless, several problems remain open. Rough set theory does not specify how to construct a crisp set as the best approximation set of a target set; existing results do not include a method for measuring the fuzzy similarity between a target set and its approximation sets; and traditional rough set theory has limitations in clustering analysis. This thesis studies the following problems.

First, rough set theory does not explicitly give a method for precisely or approximately constructing a crisp set, from existing knowledge granules, as an approximation set of an uncertain target set. The approximation set of a rough set is therefore introduced. A fuzzy similarity between a target set and its approximation set is defined based on Euclidean distance, instead of the similarity based on the cardinality of a finite set. Several useful properties of the 0.5-approximation set of a rough set are then established, leading to the conclusion that the 0.5-approximation set is the best approximation set of a rough set.

Second, classical similarity measurement methods are analyzed. In data mining, similarity measurement underlies the solution of many problems. For example, in the study of approximation sets of rough sets, the similarity measure is the key to determining how well a method describes the target set. For other data mining problems, such as classification and clustering, the similarity measure is likewise fundamental. At present there are few universal similarity measures, so choosing a suitable measure for each problem is worthwhile.

Finally, the practical application of variable precision rough sets in data mining is studied. Because traditional fuzzy clustering algorithms are sensitive to noise points, their clustering accuracy is not high. A new fuzzy clustering algorithm based on variable precision rough sets is therefore proposed. Its main steps are as follows. First, the objects are divided into three disjoint regions according to rough set theory. Second, the objects in the boundary region are assigned to the positive or the negative region using thresholds determined by the variable precision rough set model, so that most of the noisy data falls into the negative region. Finally, the clustering results are obtained. Experimental results show that the algorithm is concise, efficient, and scalable, and that it helps promote the development and wider application of extended rough set models.
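
To illustrate the first contribution, the Python sketch below builds the lower, upper, and 0.5-approximation sets of a small target set and compares them under a Euclidean-distance similarity. The abstract does not state the exact similarity formula, so `euclidean_similarity` (one minus the root-mean-square distance between the rough membership function and the characteristic function of the crisp set) is an assumption. The function names and the toy universe are likewise hypothetical and are not taken from the thesis.

```python
from collections import defaultdict
from math import sqrt


def equivalence_classes(universe, key):
    """Group objects that are indiscernible under `key` (the knowledge granules)."""
    classes = defaultdict(set)
    for x in universe:
        classes[key(x)].add(x)
    return list(classes.values())


def rough_membership(classes, target):
    """Map each object x to |[x] & X| / |[x]|, its rough membership in the target X."""
    mu = {}
    for cls in classes:
        degree = len(cls & target) / len(cls)
        for x in cls:
            mu[x] = degree
    return mu


def lambda_approximation(mu, lam):
    """Crisp set of objects whose membership is at least `lam` (assumed >= convention)."""
    return {x for x, m in mu.items() if m >= lam}


def euclidean_similarity(mu, crisp):
    """ASSUMED measure: 1 - RMS distance between mu and the indicator of `crisp`."""
    squared = sum((m - (1.0 if x in crisp else 0.0)) ** 2 for x, m in mu.items())
    return 1.0 - sqrt(squared / len(mu))


# Toy data: 12 objects in three granules {0..3}, {4..7}, {8..11}.
universe = range(12)
classes = equivalence_classes(universe, key=lambda x: x // 4)
target = {0, 1, 2, 3, 4, 5, 6, 8}

mu = rough_membership(classes, target)
lower = lambda_approximation(mu, 1.0)        # granules fully inside the target
upper = {x for x, m in mu.items() if m > 0}  # granules that touch the target
half = lambda_approximation(mu, 0.5)         # the 0.5-approximation set

for name, crisp in [("lower", lower), ("upper", upper), ("0.5", half)]:
    print(name, sorted(crisp), round(euclidean_similarity(mu, crisp), 4))
```

On this toy universe the 0.5-approximation set scores higher than both the lower and the upper approximation, which is consistent with the conclusion stated in the abstract. Under the assumed measure this is expected, because rounding each membership degree to the nearer of 0 and 1 minimizes the squared distance term by term.
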
Keywords/Search Tags: rough sets, approximation set, variable precision rough sets, similarity measurement, fuzzy clustering