Research And Application Of Decision Tree Algorithm In Employment Information Warehouse Of Graduate Student

Posted on:2011-04-14

Degree:Master

Type:Thesis

Country:China

Candidate:Q Wang

GTID:2178360305983009

Subject:Information management and information systems

Abstract/Summary:

Data mining emerged from the rapid development of information technology and the diversification of the means by which people access data. It is the process of extracting implicit, potentially useful information and knowledge from large amounts of data. Its main tasks include correlation analysis, cluster analysis, classification, prediction, time-series pattern mining, deviation analysis, and so on. Classification is an important research topic within data mining. Many methods are currently used for classification, such as decision trees, neural networks, the k-NN method, rough sets, and statistical models. Among them, the decision tree is the most commonly used classification method. It is fast to compute, easy to understand, and easy to convert into classification rules, so it is widely used in medical diagnosis, weather forecasting, credit auditing, business prediction, case detection, and other fields. Existing decision tree algorithms nevertheless have many deficiencies, such as the bias toward multi-valued attributes in attribute selection, the handling of missing attribute values, and the handling of continuous-valued attributes. How to further improve the performance of decision trees, raise their classification accuracy, and make them better suited to the requirements of data mining applications is therefore of both theoretical and practical significance. In this thesis, the author studies the above deficiencies further, explores optimizations of the decision tree classification algorithm, and examines how to use the decision tree method to classify and mine a data warehouse of graduate student information.
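The multi-valued bias mentioned above can be illustrated with a minimal sketch of the information gain criterion used by ID3. The toy records, the attribute names (`major`, `student_id`), and the class label `employed` are hypothetical and are not taken from the thesis. An identifier-like attribute with a distinct value for every record splits the data into pure single-record groups, so it receives the maximum possible gain even though it has no predictive value.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction obtained by partitioning labels on an attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the partitions.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy records (not from the thesis).
employed = ["yes", "yes", "no", "no", "yes", "no"]
major = ["cs", "cs", "cs", "math", "math", "math"]
student_id = ["s1", "s2", "s3", "s4", "s5", "s6"]

gain_major = information_gain(major, employed)
gain_id = information_gain(student_id, employed)
print(f"gain(major)      = {gain_major:.4f}")
print(f"gain(student_id) = {gain_id:.4f}")
```

In this toy data the unique identifier obtains a higher gain than `major`, so a plain information gain criterion would select it first.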
The author's main research work is as follows. First, the author describes the theoretical basis of data mining and classification techniques and the fundamentals of decision trees, and analyzes and compares several common decision tree algorithms: the classical ID3 algorithm; the C4.5 algorithm, which overcomes the attribute value bias of ID3; the CART algorithm, which uses the Gini coefficient as its attribute selection criterion; and the SLIQ algorithm, which offers good scalability and parallelism. Second, the author analyzes in detail several issues in existing decision tree algorithms, namely missing attribute values, attribute value bias, the handling of continuous-valued attributes, attribute reduction, and attribute selection criteria, and puts forward concrete solutions. Third, based on the characteristics of college graduate student information databases, the author extracts, transforms, and loads the heterogeneous data sources and builds a graduate employment data warehouse for classification mining. Fourth, the author improves the ID3 algorithm and puts forward a new decision tree algorithm based on user interest degree and simplified information entropy. In comparison, the new algorithm outperforms the traditional ID3 algorithm in overall performance. The improved algorithm is applied to the college graduate employment information database, where it provides decision support for the university career center and demonstrates its practical value.
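As a rough sketch of two of the selection criteria compared above, the following computes a C4.5-style gain ratio and a Gini impurity of the kind CART uses. It is not the thesis's improved algorithm, whose details are not given in this abstract. The toy records and attribute names are hypothetical, and the Gini helper scores a multiway partition for brevity, whereas CART itself builds binary splits.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(items).values())


def partition(values, labels):
    """Group class labels by the attribute value of each record."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    return groups


def gain_ratio(values, labels):
    """Information gain divided by split information (C4.5-style)."""
    total = len(labels)
    groups = partition(values, labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(values)  # entropy of the attribute itself
    return gain / split_info if split_info > 0 else 0.0


def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def gini_after_split(values, labels):
    """Weighted Gini impurity after partitioning on an attribute."""
    total = len(labels)
    return sum(len(g) / total * gini(g)
               for g in partition(values, labels).values())


# Hypothetical toy records (not from the thesis).
employed = ["yes", "yes", "yes", "no", "no", "yes"]
major = ["cs", "cs", "cs", "math", "math", "math"]
student_id = ["s1", "s2", "s3", "s4", "s5", "s6"]

print(f"gain_ratio(major)      = {gain_ratio(major, employed):.4f}")
print(f"gain_ratio(student_id) = {gain_ratio(student_id, employed):.4f}")
print(f"gini before split      = {gini(employed):.4f}")
print(f"gini after major split = {gini_after_split(major, employed):.4f}")
```

On this toy data the split information term penalizes the six-valued identifier, so the gain ratio ranks `major` above `student_id`, whereas raw information gain would rank them the other way round.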

Keywords/Search Tags:

Data mining, Decision tree, ID3 algorithm, Simplified information entropy, User interest degree

Related items

1	Research And Application On The Data Mining Algorithm Based On Decision Tree
2	Research Of Classification Algorithms Based On Decision Tree
3	Research On Technology Of Data Mining Algorithm Based On Decision Tree
4	Research And Design Of The Off-network User Analyzer Of Mobile Telecommunications On The Basis Of Decision Tree
5	The Research On The Algorithms Of Optimizing Decision Tree Classification
6	Improvement And Application Of Decision Tree With Covariance&Information Entropy
7	Research And Application On Decision Tree In Data Mining
8	Research On Attribute Reduction Algorithm Based On Decision Tree And Information Entropy
9	Research On ID3 Algorithm And Application In The Data Mining System Of The Government Policy-making