Research On Discretization Of Continuous Attributes

Posted on:2008-01-17

Degree:Master

Type:Thesis

Country:China

Candidate:J F Ji

Full Text:PDF

GTID:2178360212995647

Subject:Computer software and theory

Abstract/Summary:

The rapid spread and widespread use of computers has produced vast amounts of data and information. Many learning methods and algorithms are used to extract hidden, useful knowledge from databases. Many of these algorithms require discretized input attribute values, which has given rise to many methods for discretizing continuous attributes. The intervals may be determined directly by domain experts, or the input space may be divided according to some principle, with the discretization given by cut points.

Discretization methods are generally divided into supervised and unsupervised methods. They can also be divided into global and local methods, according to whether all continuous attributes are discretized at the same time or a single attribute is discretized at a time, and into static and dynamic methods, according to whether the intervals are divided before or during classification. Common discretization strategies include equal-width partitioning, adaptive methods, equal-frequency partitioning, and methods based on class information entropy. The results these methods produce are seldom direct or easy to understand.

In this paper, the common discretization algorithms and methods are introduced first. We then propose a new method for discretizing continuous attributes, based on extracting linguistic summaries and linguistic rules from the database. This process has the following advantages:

(1) The discretization result is easy to read. Knowledge hidden in a database is hard to find by reading the database directly, whereas the discretization result obtained with the proposed method is easy to understand.

(2) Every result carries a support degree, so the summaries and rules whose support degree is higher than a given threshold can be selected to suit different requirements.

(3) The extraction process has a higher degree of intelligence: it requires a given threshold for each linguistic term and returns summaries and rules in natural language.

During extraction, the membership functions are either defined by experts based on the distribution of the attribute values, or suitable membership functions for the linguistic terms are constructed, and an optimized discretization can be obtained with a genetic algorithm (GA). Given the membership function of each linguistic term, we compute the membership degree of every object and collect the objects whose membership degree is higher than the given threshold. The object sets for the other linguistic terms are obtained in the same way. We then compute the intersection of these object sets, which must satisfy certain conditions. The summaries and rules are the sentences that describe this intersection. Rules are extracted from the database by a similar process.

The Iris database is the running example throughout the discretization process. We use it to extract linguistic summaries and rules, and finally use the rules to classify some objects; the results are good and support the rules.
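The thresholding and intersection steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the triangular membership shape, the term parameters, the 0.5 thresholds, the toy table (made-up values, not the real Iris measurements), and the reading of "support degree" as the fraction of objects in the intersection are all assumptions chosen for illustration.

```python
from typing import Callable, Dict, List, Set


def triangular(a: float, b: float, c: float) -> Callable[[float], float]:
    """Triangular membership function: 0 outside (a, c), 1 at the peak b."""
    def mu(x: float) -> float:
        if x <= a or x >= c:
            return 0.0
        if x <= b:
            return (x - a) / (b - a)
        return (c - x) / (c - b)
    return mu


def objects_above(values: List[float], mu: Callable[[float], float],
                  threshold: float) -> Set[int]:
    """Indices of objects whose membership degree is higher than the threshold."""
    return {i for i, v in enumerate(values) if mu(v) > threshold}


# Made-up toy table (NOT the real Iris measurements): one entry per object.
table: Dict[str, List[float]] = {
    "petal_length": [1.4, 1.5, 4.5, 4.7, 5.9, 6.1, 4.0],
    "petal_width": [0.2, 0.3, 1.4, 1.5, 2.1, 2.3, 1.05],
}

# Hypothetical linguistic terms: (label, membership function, threshold).
terms = {
    "petal_length": ("medium", triangular(3.0, 4.5, 6.0), 0.5),
    "petal_width": ("medium", triangular(1.0, 1.5, 2.0), 0.5),
}

# One object set per linguistic term.
object_sets = {
    attr: objects_above(table[attr], mu, thr)
    for attr, (_, mu, thr) in terms.items()
}

# Objects that satisfy every term at once.
matched = set.intersection(*object_sets.values())

# Assumed definition: support = share of all objects in the intersection.
n_objects = len(table["petal_length"])
support = len(matched) / n_objects

summary = " and ".join(f"{attr} is {label}" for attr, (label, _, _) in terms.items())
print(f"'{summary}' holds for objects {sorted(matched)} (support {support:.2f})")
```

A summary would be kept only if its support is higher than a user-chosen threshold. The genetic-algorithm tuning of the membership functions mentioned in the abstract is outside the scope of this sketch.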

Keywords/Search Tags:

Information system, Continuous attributes, Attribute discretization, Language summary, Language rule

Related items

1	An Algorithm For Discretization Of Continuous Attributes Based On NBC Clustering In Rough Set Theory
2	Study On Comparison Of Discretization Algorithms Of Continuous Attributes
3	A Study On Rough Set Theory And Discretization Of Real Value Attributes
4	Research On Discretization Methods For Quantitative Attributes
5	VPRS Based Approaches For Discretization Of Continuous Attributes And Data Preprocessing
6	Application Of Rough Set And SVM In Discretization Of Continuous Attribute
7	Radar Target HRRP Recognition Based On Rough Set Theory
8	A Study For Discretization Of Real Value Attributes Base On Rough Se Theory