
Research And Implementation Of Teaching Information Mining System Based On Decision Tree

Posted on: 2017-03-24 | Degree: Master | Type: Thesis
Country: China | Candidate: J He | Full Text: PDF
GTID: 2308330485992509 | Subject: Software engineering
Abstract/Summary:
In the information era, information technology has become an indispensable part of life. As its applications have deepened and spread, they have generated large volumes of data that must be stored and reused, ushering in a new "era of big data." Data mining technology emerged to deal with data at this scale: it seeks the connections between data items and the patterns hidden within the data.

Data mining started late in China but has progressed rapidly. Some large enterprises and high-tech companies have carried out in-depth studies and preliminary commercial applications, with considerable results. Against this background, primary schools and other educational institutions at the compulsory-education stage lag somewhat behind. They spend a great deal each year on updating hardware but rarely analyze their data, let alone mine it. If modern schools are to develop, they must embrace new technology. Relying on the traditional mode of education leaves them behind, and data mining can be a powerful engine of change in modern education.

First, this thesis starts from the background of big data and the development of data mining techniques. I study how three commonly used decision tree algorithms (ID3, C4.5, and CART) work, and compare their advantages and disadvantages to demonstrate the technical feasibility of applying them in teaching information systems. On the basis of the school's current situation, I consider where data mining techniques can be used to improve teaching quality.

Second, this thesis analyzes the characteristics of the school's data. Because the data volume is generally not large and the number of features is small, I extract a small sample and compare ID3 and C4.5 on it.
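The three algorithms named above differ mainly in the criterion they use to score a candidate split: ID3 uses information gain, C4.5 uses gain ratio, and CART uses the Gini index. The following is a minimal sketch of those textbook criteria, not the thesis's own code. The function names and the toy `attendance`/`passed` data are hypothetical, and the Gini function scores a multiway split for easy comparison, whereas CART itself makes binary splits.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def partition(values, labels):
    """Group the class labels by the attribute value of each sample."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    return list(groups.values())


def information_gain(values, labels):
    """ID3 criterion: entropy before the split minus weighted entropy after."""
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in partition(values, labels))
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """C4.5 criterion: information gain divided by the split information."""
    split_info = entropy(values)  # entropy of the attribute's own values
    return information_gain(values, labels) / split_info if split_info > 0 else 0.0


def gini_reduction(values, labels):
    """Gini-based score (multiway here; CART proper uses binary splits)."""
    n = len(labels)
    remainder = sum(len(g) / n * gini(g) for g in partition(values, labels))
    return gini(labels) - remainder


# Hypothetical toy data: one attribute and one class label per student.
attendance = ["high", "high", "low", "low"]
passed = ["yes", "yes", "no", "no"]

print(information_gain(attendance, passed))
print(gain_ratio(attendance, passed))
print(gini_reduction(attendance, passed))
```

On this toy data the attribute separates the classes perfectly, so all three criteria reach their maximum for a balanced two-class problem.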
Because the two algorithms produce similar results on this sample, I choose ID3, whose computational complexity is comparatively low, to build the school's decision tree model. ID3 has disadvantages of its own, however. It selects split features by information gain, which tends to favor features with many distinct values, and such features are not necessarily the best choice. In addition, the information gain calculation is complicated and becomes a considerable burden when the sample is large. To address these problems, this thesis makes a minor improvement to ID3: it targets the multi-value bias and uses equivalent infinitesimals and the notion of user interest to simplify the calculation and the choice of feature values. Tests on actual samples show that, where the improved algorithm selects the same features as the original ID3, it computes information gain faster, and the features it selects are in line with expectations.

Next, in light of the school's characteristics, I design the system on a B/S (browser/server) architecture, with MySQL as the database and Python as the programming language. The system is divided into login, data input, student management, class management, data preprocessing, and decision tree generation modules, and the functions and structure of each module are analyzed in detail.

Finally, I take as a sample three years of data for two classes of students who enrolled in 2012. The sample includes enrollment records, final grades in Grade 7 and Grade 8, high school entrance examination results in Grade 9, generalized class management data, Grade 9 teacher staffing, and measures for supporting slow learners and helping fast learners. I then present the calculation process of the improved ID3 decision tree algorithm on this sample.
The results show that the improved algorithm speeds up decision tree generation and avoids the bias problem in feature selection.
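To make the feature selection bias concrete, here is a minimal sketch of plain (unimproved) ID3 on a handful of invented student records. It is not the thesis's improved algorithm or its data: the `rows` table, the attribute names `student_id`, `grade7`, `grade8`, and the target `exam` are all hypothetical. Because every `student_id` value is unique, information gain rates it as a perfect split even though it would not generalize to new students.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, attr, target):
    """Entropy of the target minus its weighted entropy after splitting on attr."""
    n = len(rows)
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[target])
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - remainder


def id3(rows, attrs, target):
    """Build a decision tree as nested dicts; leaves are class labels."""
    labels = [row[target] for row in rows]
    if len(set(labels)) == 1:   # pure node: stop
        return labels[0]
    if not attrs:               # no attributes left: majority vote
        return Counter(labels).most_common(1)[0][0]
    # Plain ID3: pick the attribute with the highest information gain.
    best = max(attrs, key=lambda a: information_gain(rows, a, target))
    remaining = [a for a in attrs if a != best]
    tree = {best: {}}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        tree[best][value] = id3(subset, remaining, target)
    return tree


# Hypothetical records, invented for illustration only.
rows = [
    {"student_id": "s1", "grade7": "good", "grade8": "good", "exam": "pass"},
    {"student_id": "s2", "grade7": "good", "grade8": "poor", "exam": "pass"},
    {"student_id": "s3", "grade7": "poor", "grade8": "good", "exam": "pass"},
    {"student_id": "s4", "grade7": "poor", "grade8": "poor", "exam": "fail"},
    {"student_id": "s5", "grade7": "good", "grade8": "good", "exam": "pass"},
    {"student_id": "s6", "grade7": "poor", "grade8": "poor", "exam": "fail"},
]

# With the many-valued attribute available, it wins the root split.
biased_tree = id3(rows, ["student_id", "grade7", "grade8"], "exam")
print(list(biased_tree))  # ['student_id']

# Without it, the tree is built from the grade attributes instead.
tree = id3(rows, ["grade7", "grade8"], "exam")
print(tree)
```

Dropping the identifier by hand works for a toy table. Criteria such as C4.5's gain ratio, or a weighting like the one the thesis proposes, aim to reduce this bias without manual attribute selection.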
Keywords/Search Tags: data mining, ID3, decision tree