
Analysis Of University Students' Behavior Based On Big Data

Posted on: 2021-03-25
Degree: Master
Type: Thesis
Country: China
Candidate: Alrobassy Hala Shawky Ahmed No
Full Text: PDF
GTID: 2428330623483977
Subject: Computer application technology
Abstract/Summary:
With the continuous development of information technology and the steady improvement of infrastructure, big data technology has been widely adopted in industries such as medicine, education, catering, logistics, automotive, finance, and entertainment. In universities, the use of information technology generates a large amount of data. Among these data, records of college students' daily life and learning behavior have attracted the attention of many researchers and university administrators, and they provide a key big data environment for building digital campuses. By classifying and analyzing these data, one can obtain the behavioral patterns and characteristics of students and give administrators a reference for managing students more effectively.

In this research, student data from Lanzhou University of Technology, comprising book borrowing records, campus card consumption, two years of grades, and records of students' majors, are selected as the data source. Using the RapidMiner data architecture framework, the data set is preprocessed and the different data sources are integrated to obtain a set of student behavior data for analysis. The major work carried out is as follows:

(1) The FP-growth algorithm is used to mine associations among students' academic performance, the number of books borrowed, different majors, and campus card consumption in order to predict student behavior. Statistical analysis was also performed with the Python Pandas package to check data balance and to detect and handle outliers.

(2) The K-means algorithm is used to cluster the student data. The relationships among students' academic performance, book loan data, and campus card consumption data, as well as the behavioral differences among students, are mined from the clustering results. The elbow method is used to determine the optimal number of clusters for K-means.

(3) To predict student performance for the next year from historical behavior data, Naïve Bayes (NB), Support Vector Machine (SVM), Random Forest (RF), and Neural Network (NN) classification algorithms were implemented with parameters tuned for optimal performance. To prevent overfitting, 10-fold cross-validation is applied to vary the data used for training and testing. Furthermore, an integrated classification model is proposed that combines the features of SVM, RF, NB, and NN to predict student performance from behavior. A key part of the model is the softmax function and a categorical cross-entropy loss function, which form part of the third layer of the neural network. Class weights are also used to balance the class labels.

(4) Finally, the performance of each model is evaluated using several evaluation metrics. The proposed model performed best, with a micro-average of the ROC curve of 92% and a macro-average of 86%. The accuracy was 75% for RF, 76% for SVM, 74% for NB, and 78% for NN, while the proposed integrated model achieved a performance score of 85%. The proposed classification algorithm classified students' performance across majors far better than the traditional algorithms we experimented with.
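The abstract names the elbow method for choosing the K-means cluster count but does not spell out the procedure. The following is a minimal sketch of one common reading, not the thesis's own code: run K-means for several values of k, record the within-cluster sum of squares, and pick the k where the curve bends most sharply. The toy points, the farthest-first initialization, and the helper names `farthest_first`, `kmeans_inertia`, and `elbow_k` are all assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Deterministic initialization (an assumption, chosen for repeatability):
    start at the first point, then repeatedly add the point that lies
    farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def kmeans_inertia(points, k, max_iterations=100):
    """Run Lloyd's algorithm and return the within-cluster sum of squares."""
    centroids = farthest_first(points, k)
    for _ in range(max_iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        updated = []
        for i, members in enumerate(clusters):
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[i])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)


def elbow_k(inertia_by_k):
    """Pick the k with the largest second difference of the inertia curve,
    one simple way to locate the 'elbow'."""
    ks = sorted(inertia_by_k)
    best_k, best_bend = None, float("-inf")
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        bend = (inertia_by_k[prev_k] - inertia_by_k[k]) - (
            inertia_by_k[k] - inertia_by_k[next_k]
        )
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k


# Synthetic 2-D points forming three tight groups (not data from the thesis).
points = [
    (1, 1), (1, 2), (2, 1), (2, 2),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (20, 1), (20, 2), (21, 1), (21, 2),
]
curve = {k: kmeans_inertia(points, k) for k in range(1, 5)}
chosen_k = elbow_k(curve)
```

On this toy data the inertia drops steeply up to three clusters and only slightly afterwards, which is the pattern the elbow method looks for. In practice the curve is usually inspected visually as well, since a second-difference rule can be misled by noisy data.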
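The 10-fold cross-validation mentioned in item (3) can be illustrated with a short sketch of how the sample indices are partitioned. This is an assumption-laden illustration rather than the thesis's procedure: the helper name `k_fold_indices` is hypothetical, the folds are contiguous, and no shuffling or stratification is applied, although both are common in practice.

```python
def k_fold_indices(n_samples, n_folds=10):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Each sample lands in the test set of exactly one fold. When n_samples is
    not divisible by n_folds, the first folds receive one extra sample.
    """
    base, remainder = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < remainder else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Example: 25 samples split into 10 folds.
folds = list(k_fold_indices(25, n_folds=10))
fold_sizes = [len(test) for _, test in folds]
```

A model would be trained on each `train` list and scored on the matching `test` list, and the ten scores averaged. Because every sample is held out exactly once, the averaged score is less dependent on any single train/test split.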
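The integrated model is described as relying on a softmax output, a categorical cross-entropy loss, and class weights to balance the labels. The sketch below shows the standard definitions of these three pieces under stated assumptions: the abstract does not say how the class weights were computed, so the "balanced" rule used here (total samples divided by the number of classes times the class count) is only one common convention, and the function names are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1.
    Subtracting the maximum first avoids overflow in exp()."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def balanced_class_weights(labels, num_classes):
    """Assumed weighting rule: n_samples / (n_classes * count_of_class),
    so that rarer classes receive larger weights."""
    counts = [labels.count(c) for c in range(num_classes)]
    return [len(labels) / (num_classes * n) if n else 0.0 for n in counts]


def weighted_cross_entropy(logit_rows, labels, class_weights):
    """Mean over samples of -w[y] * log(softmax(logits)[y])."""
    total = 0.0
    for logits, y in zip(logit_rows, labels):
        probs = softmax(logits)
        total += -class_weights[y] * math.log(probs[y])
    return total / len(labels)


# Toy example: three samples of class 0 and one of class 1 (not thesis data).
labels = [0, 0, 0, 1]
weights = balanced_class_weights(labels, num_classes=2)

# A model that outputs equal scores for both classes on every sample.
uninformed_logits = [[0.0, 0.0] for _ in labels]
loss = weighted_cross_entropy(uninformed_logits, labels, weights)
```

With this weighting, the single minority-class sample contributes as much to the loss as the three majority-class samples combined, which is the balancing effect the abstract refers to.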
Keywords/Search Tags:Big data, Association rules, Cluster analysis, Classification techniques, Student behavior