Knowledge Discovery with Attribute Partial-Ordered Theory Based on the Discretization of Continuous Attributes

Posted on: 2017-01-05
Degree: Master
Type: Thesis
Country: China
Candidate: Y R Kang
Full Text: PDF
GTID: 2348330536454063
Subject: Biomedical engineering
Abstract/Summary:
The rapid development and widespread use of computer technology and database systems have greatly improved the ability to produce and collect data with information technology. A variety of information sensing and acquisition devices have brought people into the era of big data. How to discover valuable knowledge and rules in massive data has therefore become a key problem for cutting-edge technology to solve. Data mining and machine learning, as important means of data processing, have become hot research topics. However, many knowledge discovery and data mining algorithms require discrete attribute values, whereas real-world data often has continuous attributes, so the continuous attributes must be discretized. The main contents are as follows:
1) Taking knowledge discovery on UCI data sets as the foundation, the attributes in each data set are discretized with existing methods for discretizing continuous attributes, and the result is expressed as a formal context in the sense of formal concept analysis.
2) The discretized results form a two-valued (binary) formal context. With the formal context and a hierarchically optimized method for generating attribute partial-order structure graphs as the core, attribute partial-order structure graphs are constructed for the different data sets and knowledge rules are extracted. The discretization schemes are evaluated by comparison against the feature distributions and class labels of the data sets.
3) Existing methods for discretizing continuous data mainly handle low-dimensional data. As the number of continuous attributes grows, the demands on algorithm efficiency increase. This thesis proposes a discretization method for high-dimensional data based on nonlinear dimensionality reduction with an improved locally linear embedding (LLE) algorithm, which projects the high-dimensional data into a low-dimensional space while effectively preserving the geometric structure of the original data. Applying dimensionality reduction followed by discretization to UCI data sets yields more accurate knowledge and simpler knowledge rules, which is important for knowledge rule extraction and visualization of large data.
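The step from continuous attributes to a two-valued formal context can be illustrated with a minimal sketch. The abstract does not say which discretization method the thesis uses, so equal-width binning stands in here purely as an assumption; the function names `equal_width_bins` and `binary_formal_context`, the column naming scheme, and the sample data are all hypothetical.

```python
from typing import List, Sequence, Tuple


def equal_width_bins(values: Sequence[float], n_bins: int) -> List[int]:
    """Assign each value to one of n_bins equal-width intervals (0-based)."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when all values are identical.
    width = (hi - lo) / n_bins or 1.0
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def binary_formal_context(
    rows: Sequence[Sequence[float]], names: Sequence[str], n_bins: int
) -> Tuple[List[str], List[List[int]]]:
    """Turn continuous attributes into a 0/1 object-by-attribute table.

    Each continuous attribute becomes n_bins binary attributes; an object
    has exactly one of them set to 1 (the interval its value falls in).
    """
    columns = list(zip(*rows))
    binned = [equal_width_bins(col, n_bins) for col in columns]
    header = [f"{name}_bin{b}" for name in names for b in range(n_bins)]
    context = []
    for i in range(len(rows)):
        row: List[int] = []
        for col_bins in binned:
            row.extend(1 if col_bins[i] == b else 0 for b in range(n_bins))
        context.append(row)
    return header, context


# Hypothetical sample: four objects described by two continuous attributes.
sample = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
header, context = binary_formal_context(sample, ["x", "y"], n_bins=2)
```

Each row of `context` is one object and each column one binary attribute, which is the kind of table formal concept analysis takes as input. A different discretization scheme would only change how the bin indices are computed.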
Keywords/Search Tags:attribute partial ordered theory, continuous attribute, discretization, knowledge discovery, high-dimensional data