Since MOOCs were widely popularized in 2012, the ways in which people acquire knowledge have expanded greatly. However, due to their openness, and despite the large number of users, the MOOC dropout rate has remained high, reaching up to 90%, which makes MOOC dropout prediction a difficult but valuable topic. Research on this topic aims to help teachers understand the real-time state of their courses so that they can adjust their teaching strategies when necessary. In addition, computational models of learner behavior provide MOOC platforms with valuable learner information. Precise dropout prediction helps monitor the state of MOOC users, and modeling MOOC users greatly reduces manpower costs and benefits the development of MOOC platforms.

Based on the KDD CUP 2015 competition data and the existing work produced during the competition, this thesis carries out further work on feature engineering, classifier training and prediction, ensemble learning, and the application of deep learning algorithms. A new set of powerful features is extracted and improvements to the gradient boosting decision tree are applied. After these improvements, the model's ROC AUC increases from 0.887 to 0.9014. The final performance of the model differs from the first-place result in KDD CUP 2015 by only 0.6%, placing it at roughly tenth among all 821 teams.

The main contributions of this thesis are the following:

(1) A study of the key aspects of feature engineering in MOOC dropout prediction. The thesis explores a wide range of features in detail and lists not only the effective features but also the ineffective ones, together with the corresponding reasons. In addition, a qualitative analysis of why the χ2 test and F-score fail to determine feature-selection thresholds is conducted.

(2) A new model fusion method called Ada-Gradient Boosting. This method removes the step of manually splitting the training data during model fusion, which avoids overfitting, and it also avoids the tedious step of manually collecting the base classifiers' predictions in order to train the ensemble model again. By leveraging the AdaBoost algorithm, the whole training process can be automated and the data utilization rate improved, and the best performance of the ensemble model is thus obtained (see the sketch after this abstract).

(3) A novel loss function for the forward stagewise additive model. While trying to explain the effect of the integrated Ada-Gradient Boosting model, and borrowing ideas from multi-task learning, the thesis puts forward a new combination loss function (Combination Loss), which yields the best performance of a single model (an illustrative form is given after this abstract).

(4) New features that keep the information in the user logs as complete as possible. To enable the application of deep learning algorithms to this topic, the thesis re-extracts quantitative features from the user logs while keeping the information loss as small as possible (see the feature-extraction sketch after this abstract).
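
To make contribution (2) concrete, the following is a minimal sketch of what Ada-Gradient Boosting could look like, assuming it amounts to AdaBoost-style sample re-weighting over gradient-boosted base learners; the exact procedure used in the thesis may differ, and the data here is synthetic.

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the KDD CUP 2015 feature matrix and dropout labels.
X, y = make_classification(n_samples=5000, n_features=30, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

# Gradient-boosted trees as the base learner inside an AdaBoost wrapper.
base = GradientBoostingClassifier(n_estimators=100, max_depth=3, learning_rate=0.1)
# AdaBoost re-weights the full training set between rounds, so no manual
# data segmentation or second-stage stacking model is required.
ensemble = AdaBoostClassifier(estimator=base, n_estimators=5, learning_rate=0.5)  # base_estimator= on scikit-learn < 1.2
ensemble.fit(X_train, y_train)

print("ROC AUC:", roc_auc_score(y_val, ensemble.predict_proba(X_val)[:, 1]))

Because the re-weighting happens over the full training set, every sample contributes to every round, which is consistent with the improved data utilization claimed in contribution (2).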
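
The exact definition of the Combination Loss in contribution (3) is not stated above; purely to illustrate the multi-task flavor of the idea, one hypothetical form mixes the exponential loss used by AdaBoost with the logistic loss used by gradient boosting for a forward stagewise additive model F(x), with labels y ∈ {−1, +1} and a hypothetical weighting coefficient α ∈ [0, 1]:

    L_comb(y, F(x)) = α · exp(−y·F(x)) + (1 − α) · log(1 + exp(−2·y·F(x)))

The thesis's actual loss may weight or combine the terms differently; this display only conveys the idea of optimizing two related objectives within a single additive model.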
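
Contribution (4) hinges on how quantitative features are re-extracted from the raw user logs. The snippet below is an illustrative sketch only, assuming the usual KDD CUP 2015 log schema (enrollment_id, time, source, event, object) and a hypothetical file name; the feature set actually used in the thesis is considerably richer.

import pandas as pd

# "log_train.csv" is a hypothetical path; columns assumed per the competition schema.
log = pd.read_csv("log_train.csv", parse_dates=["time"])

# Per-enrollment aggregate features: activity volume, active days, distinct objects.
features = (
    log.groupby("enrollment_id")
       .agg(n_events=("event", "size"),
            n_active_days=("time", lambda t: t.dt.date.nunique()),
            n_objects=("object", "nunique"),
            last_activity=("time", "max"))
       .reset_index()
)

# Per-event-type counts (video, problem, wiki, ...) as additional columns.
event_counts = pd.crosstab(log["enrollment_id"], log["event"]).add_prefix("cnt_").reset_index()
features = features.merge(event_counts, on="enrollment_id", how="left")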