Application Research Of Data Mining Algorithm In College Student Activity Analysis

Posted on: 2021-12-28
Degree: Master
Type: Thesis
Country: China
Candidate: S Y Pang
Full Text: PDF
GTID: 2507306560453014
Subject: Master of Engineering
Abstract/Summary:
As smart campuses are built out, the extracurricular and club activities that college students take part in, known as the second-class, are now recorded in full. How to use this second-class data effectively, and how to analyze student activity behavior together with other student behavior data so that schools can make informed decisions and give students relevant guidance on activities, has gradually become a research hotspot. Based on the second-class data of a college, this thesis performs data preprocessing and clustering, and combines the results with first-class (regular coursework) data and employment destination data for a multi-dimensional analysis. The main work is as follows:

(1) To address two weaknesses of the K-means algorithm, namely that the number of clusters is not known in advance and that the initial cluster centers are chosen at random, the Mean Shift-K-means algorithm is proposed. The algorithm determines the optimal number of clusters using the BWP evaluation index; adds a kernel function to the Mean Shift vector so that each sample carries a weight coefficient, which eliminates the effect of distance; and determines the best initial cluster centers from weight and density. The algorithm is compared with six other improved algorithms on public data sets, and the results show that it achieves a better clustering effect.

(2) The proposed Mean Shift-K-means algorithm is used to analyze and mine college student activity data. First, the second-class data, first-class data, and employment destination data obtained from a university are preprocessed: noisy data is removed, redundant data is deleted, missing information is filled in, and the various data sources are integrated and de-identified. In the comparison experiment, the Silhouette Coefficient and the Calinski-Harabasz (CH) index are used as evaluation indicators; for both, a larger value indicates better clustering. The results show that the Mean Shift-K-means algorithm's Silhouette Coefficient is significantly better than those of the six other improved algorithms and that its CH value is much higher, indicating that the algorithm clusters better and is well suited to this data.

(3) The clustering results for college student activities are analyzed and presented from four aspects: overall second-class scores, scores in individual second-class subjects, first-class scores, and the employment of college students. The conclusions are that the second-class has a positive effect on the first-class; that students' employment orientation is closely related to the types of second-class activities they take part in; and that the total number of students taking part in second-class activities is relatively small, so students' enthusiasm should be raised.
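The seeding idea in task (1) can be illustrated with a minimal sketch. This is not the thesis's Mean Shift-K-means algorithm, whose exact kernel, weighting rule, and BWP-based choice of the cluster number are not given in this abstract. The sketch assumes a Gaussian kernel with a hypothetical bandwidth `h`, takes `k` as given, and uses an assumed rule that picks the densest point as a seed and then discounts nearby points. The function names and the toy data are illustrative only.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kernel_density(points, h):
    """Gaussian-kernel weight sum for each point (higher = denser region)."""
    return [sum(math.exp(-sq_dist(p, q) / (2 * h * h)) for q in points)
            for p in points]

def pick_centers(points, k, h):
    """Assumed seeding rule (not from the thesis): repeatedly take the
    densest point, then down-weight points close to it so that the next
    seed lands in a different dense region."""
    score = kernel_density(points, h)
    centers = []
    for _ in range(k):
        i = max(range(len(points)), key=lambda j: score[j])
        centers.append(points[i])
        score = [s * (1 - math.exp(-sq_dist(p, points[i]) / (2 * h * h)))
                 for s, p in zip(score, points)]
    return centers

def kmeans(points, centers, iters=100):
    """Plain Lloyd iterations starting from the given centers."""
    labels = []
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        labels = [min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centers.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

# Toy data (made up): two well-separated groups of three points.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.1, 4.8), (4.9, 5.2)]
centers, labels = kmeans(points, pick_centers(points, k=2, h=1.0))
```

Seeding from dense regions rather than at random makes the run deterministic, which is the motivation the abstract gives for replacing the random initialization of K-means.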
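The two evaluation indicators named in task (2) can be sketched from their standard textbook definitions. This is a minimal illustration, not the thesis's evaluation code. It assumes Euclidean distance and at least two clusters, and the function names, toy points, and label lists are hypothetical.

```python
import math

def group(points, labels):
    """Map each cluster label to the list of its points."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    return clusters

def mean_point(pts):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(v) / len(pts) for v in zip(*pts))

def silhouette(points, labels):
    """Mean silhouette coefficient: (b - a) / max(a, b) averaged over points,
    where a is the mean distance to the point's own cluster and b is the
    smallest mean distance to any other cluster."""
    clusters = group(points, labels)
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        total += (b - a) / max(a, b)
    return total / len(points)

def calinski_harabasz(points, labels):
    """CH index: between-cluster dispersion over within-cluster dispersion,
    each divided by its degrees of freedom (k - 1 and n - k)."""
    clusters = group(points, labels)
    n, k = len(points), len(clusters)
    overall = mean_point(points)
    between = sum(len(c) * math.dist(mean_point(c), overall) ** 2
                  for c in clusters.values())
    within = sum(math.dist(p, mean_point(c)) ** 2
                 for c in clusters.values() for p in c)
    return (between / (k - 1)) / (within / (n - k))

# Toy data (made up): a labeling that matches the two groups, and one that mixes them.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.1, 4.8), (4.9, 5.2)]
good = [0, 0, 0, 1, 1, 1]
bad = [0, 1, 0, 1, 0, 1]
print(silhouette(points, good), silhouette(points, bad))
print(calinski_harabasz(points, good), calinski_harabasz(points, bad))
```

Both indicators score the labeling that matches the two groups higher than the mixed one, consistent with the abstract's statement that larger values indicate better clustering.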
Keywords/Search Tags: the second-class data of colleges, MeanShift algorithm, K-means algorithm, activity recommendation, employment guidance