The Decision Tree Algorithm And Its Application On Employment Of Undergraduate Students

Posted on: 2010-11-09
Degree: Master
Type: Thesis
Country: China
Candidate: Y Jin
Full Text: PDF
GTID: 2178360302468638
Subject: Computer technology
Abstract/Summary:
Data mining can extract interesting and potentially useful knowledge from data sets and express it in an understandable form. Classification is an important branch of data mining: it builds models that describe data classes or concepts and uses them to predict the class of unlabeled objects. The decision tree is an inductive learning technique that learns from instances.

This thesis compares and analyzes the typical decision tree algorithms and describes the ID3 and C4.5 algorithms in detail. Decision tree algorithms up to and including ID3 require every value in the data set to be known exactly. Missing data reduce the performance of the algorithm: unless they are handled properly, a large number of errors accumulate and the time and complexity of subsequent computation increase. C4.5 can deal with missing data, although its handling is still not perfect.

The Bayesian method obtains a posterior probability from a prior probability in the light of new information. In this thesis, Bayesian theory is used to generate the prior and posterior distributions of the complete data set from accuracy tests on the entire data set, to obtain a predicted accuracy value by statistical inference, and to build a Bayesian model for filling in missing data. On this basis, an improved version of C4.5, the BC1.0 algorithm, is proposed. In tests on data sets from UCI, the filling accuracy of BC1.0 is better than that of C4.5 at every missing-data ratio; the efficiency and accuracy of BC1.0 are also analyzed and compared.

Building on these results, the thesis applies the BC1.0 algorithm to the employment of college students. The full mining process for analyzing student employment with decision tree technology is described through concrete applications. Data cleaning, data conversion, data reduction, and other pre-processing were carried out before mining, and the concrete application gave a better understanding of data pre-processing techniques. Mining the employment data of college students reveals potential rules and hidden patterns in student employment, which can guide employment decisions, support reform of the employment system, and promote the employment of college students.
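
The idea of filling a missing value from a prior updated by new information can be illustrated with a minimal Python sketch. This is not the BC1.0 algorithm, whose details the abstract does not give; it is a plain application of Bayes' rule under stated assumptions. The prior P(value) is taken from the observed frequencies of the attribute, the new information is the record's class label, and the missing value is replaced by the value with the highest posterior P(value | class). The column names `major` and `employed`, the `None` encoding for missing values, and the Laplace smoothing are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict

MISSING = None  # assumption: missing attribute values are encoded as None


def fit_posterior(rows, attr, label):
    """Build a function c -> {value: P(value | class=c)} from rows where attr is observed."""
    observed = [r for r in rows if r[attr] is not MISSING]
    prior = Counter(r[attr] for r in observed)   # counts behind the prior P(value)
    joint = defaultdict(Counter)                 # counts behind P(class | value)
    for r in observed:
        joint[r[attr]][r[label]] += 1
    n = len(observed)
    n_classes = len({r[label] for r in observed})

    def posterior(c):
        scores = {}
        for v, count_v in prior.items():
            p_v = count_v / n
            # Laplace-smoothed likelihood (an illustrative choice, not from the thesis)
            p_c_given_v = (joint[v][c] + 1) / (count_v + n_classes)
            scores[v] = p_v * p_c_given_v        # Bayes' rule, unnormalized
        total = sum(scores.values())
        return {v: s / total for v, s in scores.items()} if total else {}

    return posterior


def fill_missing(rows, attr, label):
    """Return copies of rows with missing attr replaced by the most probable value."""
    posterior = fit_posterior(rows, attr, label)
    filled = []
    for r in rows:
        if r[attr] is MISSING:
            dist = posterior(r[label])
            if dist:                             # leave the gap if nothing was observed
                r = {**r, attr: max(dist, key=dist.get)}
        filled.append(r)
    return filled


# Hypothetical employment records; the last one has a missing "major".
records = [
    {"major": "cs", "employed": "yes"},
    {"major": "cs", "employed": "yes"},
    {"major": "cs", "employed": "yes"},
    {"major": "art", "employed": "no"},
    {"major": "art", "employed": "no"},
    {"major": None, "employed": "no"},
]
completed = fill_missing(records, "major", "employed")
```

The completed records could then be passed to any decision tree learner. Filling before training is only one possible design; C4.5 itself handles missing values inside the tree-building procedure rather than by imputation beforehand.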
Keywords/Search Tags: data mining, decision tree algorithm, C4.5 algorithm, Bayesian model, university and college employment