The Design And Realization Of Education Information Mining Model

Posted on: 2008-11-03; Degree: Master; Type: Thesis
Country: China; Candidate: J C Zhang; Full Text: PDF
GTID: 2178360242972490; Subject: Communication and Information System
Abstract/Summary:
Universities and colleges accumulate large amounts of data in the course of pedagogic management. Data mining can be used to extract the valuable information hidden in such data, and it has been applied in a growing number of fields to support accurate decision making. To make effective use of the pedagogic management data accumulated by universities, this thesis studies classification rules and improves the ID3 algorithm according to the characteristics of education information. Based on the improved ID3 algorithm, an Education Information Data Mining Model (DT-IDM) is designed and implemented to meet practical requirements.

Decision tree learning algorithms play an important role in data mining. The ID3 algorithm has three shortcomings:
(1) Only a single attribute can be chosen at each node of a decision tree. Correlation between attributes is not taken into account sufficiently, which leads to some attributes being chosen repeatedly.
(2) While a decision tree is being built, some subsets become too small to be partitioned recursively, so further partitioning is meaningless.
(3) ID3 tends to choose attributes with many values.

Taking the characteristics of pedagogic management information into account, the ID3 algorithm is improved to address these three shortcomings, and a new algorithm called IDT-DM is designed on the basis of the improved ID3. IDT-DM makes the following improvements:
(1) A correlation degree is introduced. All non-class attributes are analyzed according to their correlation degree, and attributes whose correlation degree falls below a preset threshold are deleted.
(2) A classification threshold is set to avoid partitioning the mined dataset repeatedly. Any dataset with fewer items than the classification threshold is not partitioned further; a leaf node is created instead.
(3) A complex measure replaces information gain as the criterion for choosing attributes.

Based on the optimized IDT-DM algorithm, the Education Information Data Mining Model (DT-IDM) has been designed and implemented. Visual C++ was chosen as the development language, and SQL Server 2000 was used to create the Education Information Mining Database, which stores the data to be mined. Mining this database yields classification rules such as the correlation between curricula and the influence of students' basic information on career choice after graduation. The mined rules can be explained and evaluated clearly using tables and graphs.
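
The third ID3 shortcoming and the classification threshold described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's IDT-DM implementation: it uses plain ID3 information gain, because the abstract does not define the complex measure or the correlation degree, and it adds only a minimum-size stopping rule. The function names, the `min_rows` parameter, and the sample student records are all hypothetical.

```python
from collections import Counter
from math import log2

# Hypothetical student records, invented purely for illustration.
ROWS = [
    {"student_id": "s1", "major": "CS",   "gpa_band": "high", "career": "industry"},
    {"student_id": "s2", "major": "CS",   "gpa_band": "high", "career": "industry"},
    {"student_id": "s3", "major": "CS",   "gpa_band": "low",  "career": "industry"},
    {"student_id": "s4", "major": "Math", "gpa_band": "high", "career": "academia"},
    {"student_id": "s5", "major": "Math", "gpa_band": "low",  "career": "industry"},
    {"student_id": "s6", "major": "Math", "gpa_band": "high", "career": "academia"},
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attr, target):
    """Entropy reduction obtained by splitting rows on attr (plain ID3)."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attr], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def majority(rows, target):
    """Most frequent class label among rows."""
    return Counter(r[target] for r in rows).most_common(1)[0][0]


def build_tree(rows, attrs, target, min_rows):
    """ID3 with a minimum-size stopping rule.

    min_rows stands in for the classification threshold (an assumed name):
    a subset with fewer rows becomes a leaf labelled with its majority class.
    """
    pure = len({r[target] for r in rows}) == 1
    if pure or not attrs or len(rows) < min_rows:
        return majority(rows, target)
    best = max(attrs, key=lambda a: information_gain(rows, a, target))
    rest = [a for a in attrs if a != best]
    children = {}
    for value in sorted({r[best] for r in rows}):
        subset = [r for r in rows if r[best] == value]
        children[value] = build_tree(subset, rest, target, min_rows)
    return {best: children}


if __name__ == "__main__":
    for attr in ("student_id", "major", "gpa_band"):
        print(attr, round(information_gain(ROWS, attr, "career"), 3))
    print(build_tree(ROWS, ["major", "gpa_band"], "career", min_rows=2))
    print(build_tree(ROWS, ["major", "gpa_band"], "career", min_rows=4))
```

In this toy data, `student_id` takes a different value for every record, so it receives the largest information gain even though it cannot generalize. This is the bias toward many-valued attributes that the abstract lists as the third shortcoming. Raising `min_rows` from 2 to 4 turns the three-record Math branch into a single leaf instead of splitting it further.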
Keywords/Search Tags: Data Mining, Decision Tree, Classification technique, ID3 Algorithm, Database, Education information