
Data Mining Research And Applications Of The Classification Algorithm

Posted on: 2004-04-08
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Liu
GTID: 2208360092981631
Subject: Computer application technology
Abstract/Summary:
With the widespread use of databases and the growth of the Internet, the volume of accumulated data is increasing exponentially. People are no longer satisfied with traditional querying and statistical methods for these data; they want to discover deeper regularities that can support effective decisions in science and research. Data mining technology, which applies machine learning to large databases in order to extract useful information from large volumes of data, has developed in response.

Data mining (DM), or knowledge discovery in databases (KDD), is the discovery of useful information and potential knowledge that is hidden and previously unknown in large amounts of incomplete, noisy, fuzzy, and random data. The discovered knowledge can be used for information management, query optimization, decision making, process control, database maintenance, and so on. Data mining is therefore a highly valuable new area of database research, and an interdisciplinary subject that draws on the theory and techniques of databases, artificial intelligence, machine learning, statistics, and other fields.

Classification is a very important task in data mining and is now widely applied in commerce. The goal of classification is to learn a classification function or classification model that maps a data item to one of a set of predefined classes. Researchers in machine learning, expert systems, and neurobiology have proposed many classification methods. This thesis studies classification algorithms in data mining. Classification algorithms are divided into eager and lazy methods, and all of the research is organized around this division.

The main work of the thesis:

1. The basic techniques of classification in data mining are introduced. These include the classification procedure, the preprocessing of classification data, and the criteria for comparing and evaluating classification methods. Several typical classification algorithms are compared: the decision tree, k-nearest neighbor, and neural network algorithms. This leads to the emphasis of the thesis: classification is divided into eager and lazy methods, and the study of classification algorithms in data mining is based on this division.

2. Building on a study of the traditional decision tree, a lazy decision-tree algorithm derived from the idea of model-based lazy classification is studied. For the traditional decision tree, the concepts, advantages, and disadvantages are presented, and the state of its application and research is analyzed. A web application that uses the lazy decision-tree algorithm is developed for the web environment, and practical operation shows that this method achieves good results.

3. The neural network is studied in depth as a representative of eager classification, with the perceptron chosen as the model. First, the construction of the classical perceptron model and its learning algorithm are introduced. Then, based on the principle and geometric interpretation of the classical perceptron model, its limitation is examined: the perceptron learning algorithm can be used only when the data are linearly separable. To address this problem, extended perceptron models are studied.

4. The algebraic hypersurface neural network is one kind of extended perceptron model and is a focus of this thesis. First, the construction of this model and its geometric interpretation are introduced. Then its learning algorithm is implemented, and the test results and the innovations of the program are presented. Finally, directions for further work are proposed on the basis of the test conclusions. This model has the potential to solve nonlinearly separable problems and is especially suited to classifying high-dimensional data. The innovation of this research is an adaptive degree-raising computation method; the research shows that the success rate of model construction rises when the adaptive method is used. However, memory is a limiting factor for high-dimensional data, so further in-depth research will continue.
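
The limitation noted above, that the perceptron learning rule works only on linearly separable data, and the general idea of escaping it by raising the degree of the decision surface, can be illustrated with a small sketch. This is not the thesis's algorithm: the abstract does not describe the algebraic hypersurface model or its adaptive method in detail. The sketch assumes a plain polynomial feature expansion and a naive loop that raises the degree until the classical perceptron rule converges; all function names (`poly_features`, `train_perceptron`, `fit_with_degree_raising`, `predict`) and the XOR data set are hypothetical choices made for illustration.

```python
from itertools import combinations_with_replacement


def poly_features(x, degree):
    """Return a bias term followed by all monomials of x up to `degree`."""
    feats = [1.0]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(len(x)), d):
            term = 1.0
            for i in idx:
                term *= x[i]
            feats.append(term)
    return feats


def train_perceptron(samples, labels, max_epochs=1000):
    """Classical perceptron rule with labels +1/-1.

    Returns (weights, converged). Convergence means one full pass
    over the data produced no misclassification.
    """
    w = [0.0] * len(samples[0])
    for _ in range(max_epochs):
        errors = 0
        for x, y in zip(samples, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x))
            if y * activation <= 0:  # misclassified or on the boundary
                w = [wi + y * xi for wi, xi in zip(w, x)]
                errors += 1
        if errors == 0:
            return w, True
    return w, False


def fit_with_degree_raising(points, labels, max_degree=4):
    """Assumed stand-in for degree raising: try degree 1, 2, ... in turn."""
    for degree in range(1, max_degree + 1):
        expanded = [poly_features(p, degree) for p in points]
        w, converged = train_perceptron(expanded, labels)
        if converged:
            return degree, w
    return None, None


def predict(w, point, degree):
    """Classify a point with the polynomial decision surface defined by w."""
    s = sum(wi * fi for wi, fi in zip(w, poly_features(point, degree)))
    return 1 if s > 0 else -1


# XOR is the standard example of data that is not linearly separable.
xor_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
xor_labels = [-1, 1, 1, -1]
degree, weights = fit_with_degree_raising(xor_points, xor_labels)
```

On XOR the degree-1 model (an ordinary perceptron) cannot converge, while the degree-2 expansion makes the classes separable by a hyperplane in the expanded feature space. The number of monomials grows quickly with both input dimension and degree, which is consistent with the memory limitation the abstract reports for high-dimensional data.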
Keywords/Search Tags: Data Mining, Classification, Eager Classification, Lazy Classification, Decision Tree, Perceptron, Algebraic Hypersurface Neural Network