Research and Application of Parallel Decision Tree Classification Algorithms

Posted on: 2008-08-29 | Degree: Master | Type: Thesis
Country: China | Candidate: X F Fang | Full Text: PDF
GTID: 2208360215972140 | Subject: Computer software and theory
Abstract/Summary:
With the widespread application of computers in every field of society and the rapid development of information and network technology, the volume of generated data grows day by day. To manage and make use of the data held in databases and to discover the knowledge hidden in it, we need new and powerful means of organizing and analyzing data sources, finding new knowledge, and exploiting its potential. Data mining emerged in response to this need and has developed rapidly. Data mining, also called knowledge discovery in databases (KDD), is the process of extracting latent, useful information and knowledge from large amounts of incomplete, fuzzy, and random data. It is a fast-growing research and application area built on database technology that draws on results from several disciplines, such as logic, statistics, machine learning, fuzzy theory, and visual computing, in order to obtain usable information from databases. It has attracted increasing attention and broad interest in recent years and has been applied in finance, insurance, public utilities, government, education, telecommunications, banking software development, transportation, and other fields. Classification is an important technique in data mining. It is a two-step process: in the first step, a model is built that describes a predetermined set of data classes or concepts; in the second step, the model is used for classification. Decision tree classification is an effective classification method. Many decision tree classification algorithms have been proposed so far, each with its own strengths in execution speed, scalability, interpretability of output, and classification accuracy. However, these algorithms still have shortcomings.
Further optimizing decision tree algorithms will help not only to improve the theory but also to promote its adoption and application. This dissertation studies and analyzes data mining techniques, especially decision trees, systematically and in detail. The main contents are as follows:
(1) A brief description of data mining. This paper introduces its basic concepts, process, classification, key techniques, typical applications, and the challenges it faces.
(2) Research on decision tree algorithms. This paper introduces the general process of decision tree classification, describes and analyzes the construction and pruning steps of several typical decision tree algorithms, compares them, and explores future directions for decision tree research.
(3) Research on optimizing decision tree classification algorithms. This paper introduces the time and space scalability of decision tree algorithms and studies how to improve it. It discusses research on parallelizing decision trees in particular.
(4) A parallel decision tree classification scheme. This paper proposes a parallel decision tree classification scheme that not only improves parallelism but also reduces the system I/O requirement and the volume of communication between computers.
(5) Application of decision trees to market segmentation. This paper applies the proposed parallel decision tree to a concrete market segmentation example. Targeting the mobile phone market, it analyzes the data of a large number of customers with the parallel decision tree, performs market segmentation, completes the data mining process from data to rules, extracts the characteristics of different customer groups, and helps decision-makers make decisions. This work is an in-depth exploration of how the theory can be applied in commerce.
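The abstract does not describe the proposed scheme in detail, so the following is only a minimal sketch of one common way to parallelize decision tree construction, not the thesis's method. It assumes horizontally partitioned data, an ID3-style information gain criterion, and threads standing in for separate computers. Each partition computes local (attribute value, class label) counts, and only those counts are merged, which illustrates why such schemes can keep communication volume low. The function names and the toy mobile phone customer rows are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from math import log2


def local_counts(partition, attr_index):
    """Count (attribute value, class label) pairs within one data partition."""
    counts = Counter()
    for row in partition:
        counts[(row[attr_index], row[-1])] += 1  # last field is the class label
    return counts


def entropy(class_counts):
    """Shannon entropy (in bits) of a collection of class counts."""
    class_counts = list(class_counts)
    total = sum(class_counts)
    return -sum((c / total) * log2(c / total) for c in class_counts if c > 0)


def information_gain(merged):
    """Information gain of one attribute, from merged (value, label) counts."""
    by_label = Counter()
    by_value = {}
    for (value, label), n in merged.items():
        by_label[label] += n
        by_value.setdefault(value, Counter())[label] += n
    total = sum(by_label.values())
    remainder = sum(
        sum(c.values()) / total * entropy(c.values()) for c in by_value.values()
    )
    return entropy(by_label.values()) - remainder


def best_split(partitions, num_attrs):
    """Pick the attribute with the highest gain; only counts cross partitions."""
    gains = []
    with ThreadPoolExecutor() as pool:
        for attr in range(num_attrs):
            merged = Counter()
            # Each worker scans its own partition; results are small histograms.
            for part in pool.map(local_counts, partitions, [attr] * len(partitions)):
                merged.update(part)
            gains.append(information_gain(merged))
    best = max(range(num_attrs), key=lambda a: gains[a])
    return best, gains


# Hypothetical customer rows: (plan, age_group, spending_class)
partitions = [
    [("prepaid", "young", "low"), ("contract", "young", "high")],
    [("prepaid", "old", "low"), ("contract", "old", "high")],
]
best, gains = best_split(partitions, num_attrs=2)
```

In this toy data the plan attribute separates the spending classes perfectly while the age group carries no information, so the first attribute is chosen as the split. A full tree builder would apply `best_split` recursively to the rows routed to each branch.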
Keywords/Search Tags: data mining, classification, parallel decision tree, market segmentation