
Research And Application Of Improved Random Forest Algorithms

Posted on: 2020-10-07
Degree: Master
Type: Thesis
Country: China
Candidate: Q H Zhuang
Full Text: PDF
GTID: 2428330590962939
Subject: Engineering
Abstract/Summary:
In the teaching management of colleges and universities, students' course performance is an important basis for evaluating teaching quality, and many factors may affect it. Data mining tools can be used to predict and analyze students' academic performance, and the results can then be used to correct poor learning behavior in time and to assess teachers' teaching effectiveness.

To predict the course performance of college students and to analyze the characteristic factors that significantly affect it, this thesis analyzes the random forest algorithm and proposes an improved random forest algorithm, IRFC, which achieves high classification accuracy. To improve execution efficiency, the improved algorithm is also parallelized on a big data platform to shorten its execution time.

IRFC is a hybrid algorithm that combines simulated annealing with the random forest algorithm for feature selection, parameter optimization, and weight setting, so as to optimize the performance of the random forest comprehensively. Based on an analysis of the current state of random forest research, the binary encoding of the relevant parameters (number of features, tree size, tree decision weight) is taken as the optimization variable, and the out-of-bag (OOB) error is taken as the objective function.

The author collected historical student behavior records and subject scores (three years of data) to construct a data set for predicting students' performance from their behavior information. Experimental verification of the improved random forest algorithm has been completed on this data set and on the UCI heart disease data set. The results show that the proposed algorithm has higher generalization ability and a smaller OOB error. The thesis also analyzes the impact of parameters on the improved random forest algorithm using the student performance data set. The validation results confirm that simulated annealing, the feature selection process, weight optimization, and the other steps contribute to the effectiveness of the algorithm. In addition, the characteristic factors that significantly affect students' course performance can be identified, which is helpful for teaching reform.
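The abstract describes a simulated annealing search over a binary encoding of random forest parameters, with the OOB error as the objective. The sketch below illustrates that general idea only and is not the thesis's IRFC implementation. The bit layout in decode, the cooling schedule, and the function toy_oob_error are all assumptions made for illustration; toy_oob_error is a synthetic stand-in for a real OOB error, and tree decision weights are left out for brevity.

```python
import math
import random

N_FEATURES = 8          # hypothetical number of candidate features
TREE_BITS = 4           # hypothetical number of bits encoding the forest size
INFORMATIVE = {0, 2, 5} # hypothetical "useful" features for the toy objective


def decode(bits):
    """Split a bit list into a feature mask and a tree count (assumed layout)."""
    mask = bits[:N_FEATURES]
    tree_part = bits[N_FEATURES:N_FEATURES + TREE_BITS]
    # Map the 4-bit integer 0..15 to a forest size of 10..160 trees.
    n_trees = 10 * (1 + int("".join(map(str, tree_part)), 2))
    return mask, n_trees


def toy_oob_error(bits):
    """Synthetic stand-in for the OOB error: lower is better."""
    mask, n_trees = decode(bits)
    if not any(mask):
        return 1.0  # no feature selected: treat as worst case
    hits = sum(mask[i] for i in INFORMATIVE)
    noise = sum(mask) - hits
    return 0.40 - 0.10 * hits + 0.02 * noise + 1.0 / n_trees


def simulated_annealing(objective, n_bits, t_start=1.0, t_end=1e-3,
                        alpha=0.95, steps_per_temp=20, seed=0):
    """Minimize objective over bit lists with single-bit-flip moves."""
    rng = random.Random(seed)
    current = [rng.randint(0, 1) for _ in range(n_bits)]
    current_cost = objective(current)
    best, best_cost = current[:], current_cost
    t = t_start
    while t > t_end:
        for _ in range(steps_per_temp):
            candidate = current[:]
            candidate[rng.randrange(n_bits)] ^= 1  # flip one random bit
            cost = objective(candidate)
            delta = cost - current_cost
            # Always accept improvements; accept worse moves with
            # probability exp(-delta / t) (Metropolis criterion).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_cost = candidate, cost
                if cost < best_cost:
                    best, best_cost = candidate[:], cost
        t *= alpha  # geometric cooling
    return best, best_cost


best_bits, best_cost = simulated_annealing(toy_oob_error,
                                           N_FEATURES + TREE_BITS)
mask, n_trees = decode(best_bits)
print("feature mask:", mask, "trees:", n_trees, "cost:", round(best_cost, 4))
```

With a real data set, the objective would train a forest on the selected features and return its OOB error. In scikit-learn, for instance, RandomForestClassifier fitted with oob_score=True exposes oob_score_, so 1 - oob_score_ could play that role. That substitution is a suggestion and is not taken from the thesis.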
Keywords/Search Tags: Educational Data Mining, Improved Random Forest Algorithm, Simulated Annealing Algorithm, Performance Prediction, Student Achievement Dataset