| The study of undergraduates’ emotions plays an important role in campus management. Prolonged depression has serious effects on study, life, and health. Every year, freshmen are required to assess their mental health by completing questionnaires at enrollment. However, questionnaires suffer from low detection efficiency, inaccurate measurement, and poor timeliness. Although some studies have used the activity data that undergraduates generate on campus for educational data mining, no such research has addressed emotion so far. Therefore, this thesis proposes a method for analyzing the degree of students’ negative emotions based on non-intrusive data generated by college students. The method consists of three parts: constructing a college student emotion-activity data set, extracting and screening emotion-related features, and training a regression model. In this thesis, students’ emotional questionnaire results are combined with network logs, consumption data, and hot water usage data to construct the emotion-activity data set. To build a high-quality data set, the relevant data must be preprocessed. For the questionnaire data, this thesis develops a set of screening rules to ensure the quality of the survey; after screening, 11,610 valid questionnaires are obtained. For the network log data, the category of each visited website must be filled in from the URLs in the log, so this thesis trains an n-gram-based Naive Bayes classifier to classify URLs. The experimental results show that the classifier reaches an accuracy of 78%. Compared with the original log data, the completed website category information better reflects undergraduates’ online activities. To represent undergraduates’ behavior, this thesis extracts network features, consumption features and
hot water usage features from the emotion-activity data set. According to the correlation analysis, the most relevant features are the proportion of visits to medical and health websites within the prescribed time, the regularity of hot water use, and the earliest and average breakfast times. Finally, the selected activity features are combined with the emotional questionnaire scores to train the regression model. The regression tree model performs best on the emotion-activity data set: NDCG, an evaluation metric for recommendation models, reaches 0.808 for the emotional state of all students. For students in depression (the top 20% of the emotional state ranking), NDCG reaches 0.812. Overall, the experimental results show that the method can effectively detect students with low mood, helping the relevant staff to identify and focus on depressed students in a timely manner and providing a useful reference for psychological counseling. |
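
The abstract reports NDCG without stating which formulation was used. As a rough illustration only, the sketch below assumes the common variant with linear gain and a log2 rank discount; the function names `dcg` and `ndcg`, the cutoff parameter `k`, and the sample scores are hypothetical and do not come from the thesis.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain with linear gain and a log2 rank discount (assumed variant)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(true_scores, predicted_scores, k=None):
    """NDCG of the ranking induced by predicted_scores, judged against true_scores.

    true_scores:      questionnaire-based negative-emotion scores (higher = more negative)
    predicted_scores: regression model outputs for the same students
    k:                optional cutoff, e.g. the size of the top 20% group
    """
    # Rank students by predicted score, highest first.
    order = sorted(range(len(true_scores)),
                   key=lambda i: predicted_scores[i], reverse=True)
    ranked = [true_scores[i] for i in order]
    # The ideal ranking orders students by their true scores.
    ideal = sorted(true_scores, reverse=True)
    if k is not None:
        ranked, ideal = ranked[:k], ideal[:k]
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical scores for five students (not data from the thesis).
true_scores = [9, 2, 7, 4, 1]
predicted = [0.8, 0.3, 0.9, 0.4, 0.1]
overall = ndcg(true_scores, predicted)
top_group = ndcg(true_scores, predicted, k=max(1, len(true_scores) // 5))
```

Under this assumption, a value of 1.0 means the predicted ranking matches the questionnaire-based ranking exactly, and passing a cutoff `k` restricts the comparison to the head of the ranking, which is one plausible way to read the separate top 20% figure.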