
Design of an Education Information Data Mining Model Based on Decision Trees (DT-EIDM)

Posted on: 2006-08-07
Degree: Master
Type: Thesis
Country: China
Candidate: T Wei
Full Text: PDF
GTID: 2208360182956380
Subject: Computer software and theory
Abstract/Summary:
Universities and colleges have accumulated a large amount of data on pedagogic management, but so far these data have not been used effectively. Data mining can extract valuable information from large volumes of data. It has been widely applied in a growing number of fields with good results, and the information it yields can support accurate decision making. To make effective use of the pedagogic management data accumulated by universities, classification rules were studied and the ID3 algorithm was improved according to the characteristics of education information. The Education Information Data Mining Model (DT-EIDM) was then designed and implemented, according to practical requirements, on the basis of the improved ID3 algorithm.

Decision tree learning algorithms play an important role in data mining. ID3, one of the decision tree learning algorithms, was studied first, and three shortcomings were identified:

(1) A single attribute is chosen at each node of the decision tree, and correlation between attributes is not sufficiently taken into account, which leads to some attributes being chosen more than once.
(2) While a decision tree is being built, some datasets become too small to be worth partitioning, so partitioning them further is meaningless.
(3) ID3 tends to choose attributes with many values.

Taking the characteristics of pedagogic management information into account, ID3 was improved to address these three shortcomings, and a new algorithm, called EIDT-DM, was designed on the basis of the improved ID3. EIDT-DM has the following improvements:

(1) The concept of correlation degree is introduced. All non-class attributes are analyzed according to their correlation degree, and a correlation degree threshold is set so that attributes whose correlation degree falls below the default threshold are deleted.
(2) A classification threshold is set to avoid partitioning the mined dataset repeatedly. Any dataset with fewer items than the classification threshold is not partitioned further. Instead, a leaf node is created and labelled with the value of the classification attribute that has the largest number of items in the dataset.
(3) A complex measure replaces information gain as the criterion for choosing attributes.

Based on the EIDT-DM algorithm, the Education Information Data Mining Model (DT-EIDM) was designed and implemented. Java was chosen as the development language, and Oracle9i was used to create the Education Information Mining Database that stores the data to be mined. Data from different database systems were transferred into the Mining Database in Oracle9i. Mining this database yields classification rules, such as correlations between curricula and the influence of students' basic information on career choice after graduation. Mined rules can be displayed in two ways: as IF-THEN rules or as a decision tree.

Wei Tao (Computer Software and Theory), directed by Prof. Zhou Guangsheng...
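The abstract does not define the correlation degree or the measure that replaces information gain, so those parts are not reproduced here. What can be illustrated is plain ID3 information gain, its bias toward many-valued attributes (shortcoming 3), and a minimum-size stopping rule in the spirit of the classification threshold (improvement 2). The following is a minimal sketch under stated assumptions, not the thesis implementation (which was written in Java): the function names, the `min_size` parameter, and the toy student records are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attr, target):
    """Entropy reduction from splitting rows on attr (the plain ID3 criterion)."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def majority(rows, target):
    """Most frequent class label among rows."""
    return Counter(r[target] for r in rows).most_common(1)[0][0]


def build_tree(rows, attrs, target, min_size=2):
    """ID3 with a minimum-size stopping rule.

    min_size is a hypothetical stand-in for the abstract's classification
    threshold: a subset with fewer rows becomes a majority-class leaf.
    """
    labels = {r[target] for r in rows}
    if len(labels) == 1:                      # pure node: nothing left to split
        return next(iter(labels))
    if len(rows) < min_size or not attrs:     # too small, or no attributes left
        return majority(rows, target)
    best = max(attrs, key=lambda a: information_gain(rows, a, target))
    rest = [a for a in attrs if a != best]
    return {best: {value: build_tree([r for r in rows if r[best] == value],
                                     rest, target, min_size)
                   for value in sorted({r[best] for r in rows})}}


# Invented toy records, not data from the thesis.
rows = [
    {"student_id": "s1", "major": "cs",   "internship": "yes", "career": "industry"},
    {"student_id": "s2", "major": "cs",   "internship": "no",  "career": "graduate_school"},
    {"student_id": "s3", "major": "math", "internship": "yes", "career": "industry"},
    {"student_id": "s4", "major": "math", "internship": "no",  "career": "graduate_school"},
    {"student_id": "s5", "major": "cs",   "internship": "yes", "career": "industry"},
    {"student_id": "s6", "major": "math", "internship": "no",  "career": "industry"},
]

for attr in ("student_id", "major", "internship"):
    print(attr, round(information_gain(rows, attr, "career"), 3))

print(build_tree(rows, ["major", "internship"], "career", min_size=4))
```

On this toy data the unique `student_id` column receives the largest information gain even though it cannot generalize, which is the bias named in shortcoming (3). With `min_size=4`, the three-row subset of students without an internship is not split further and becomes a majority-class leaf, mirroring the behaviour described for the classification threshold.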
Keywords/Search Tags: Data Mining, Decision Tree, ID3 Algorithm, Education Information Mining