
Graph Convolutional Networks: An Application to Open Educational Resources

Posted on: 2021-04-30
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Xu
Full Text: PDF
GTID: 2427330611968000
Subject: Electronic and communication engineering
Abstract/Summary:
Distance education is a teaching mode that uses television, the Internet, and other communication media. It breaks through the boundaries of time and space, so students can take classes anytime and anywhere. Admission is not restricted by age or prior education, which gives students opportunities to improve their education. The massive open online course (MOOC) is a model of distance education that integrates teaching resources through the Internet. 2012 was the first year of MOOC development, when top American universities took the lead in establishing online education websites. This emerging form of distance education was quickly recognized by learners and universities, and the number of registered users grew rapidly. That growth is both an opportunity and a challenge. On the one hand, MOOC organizations can find a market position and a sustainable income model; on the other hand, a large number of users do not learn well online. According to a 2019 Harvard MOOC research report, 58% of students intend to obtain skills qualification certificates online, but only 6% of students who complete registration obtain a certificate. This indicates that many students fail their academic assessments or even give up their studies after registration. It has raised public doubts about teaching quality, which directly affects the social credibility of this emerging form of distance education. At the beginning of 2020, a novel coronavirus spread across the world, and many countries closed schools and switched to distance learning. Students are the pillars of the nation's future, and how to ensure that they achieve learning outcomes similar to those of traditional classroom education is a hot topic.

To address the low completion rate and the lack of an effective early warning mechanism for middle school students in distance education, this paper establishes a learning prediction model based on a graph convolutional neural network on a MOOC dataset. With this model, students at risk of failure or dropout on a distance education platform can be identified early.

Research on student performance prediction to date can be divided into three categories according to the model used: traditional machine learning models, multilayer perceptron models, and convolutional neural network (CNN) models. The data available for analysis on distance education platforms generally fall into three categories: personal information provided at registration, course information, and user behavior information from the virtual learning environment (VLE). The author points out that traditional machine learning models and multilayer perceptron models require a great deal of feature engineering on existing data sets. This process is time-consuming, and the results depend on how familiar the researchers are with the data set. As a consequence, such a model is valid only on a specific data set and does not generalize to others. To solve this problem, convolutional neural networks, which extract features automatically, have also been applied in this area. Wang proposed the ConRec Network, based on convolutional and recurrent neural networks, to predict student dropout, and achieved prediction performance comparable to that of models without automatic feature extraction. The disadvantages of this model are that each user can only be treated as a separate individual and that training takes a long time. In addition, a CNN cannot organically combine user data with a user relationship graph. For example, when a CNN is applied to an arbitrary graph (such as a social network) instead of a regular grid structure, the usual convolution operation is not applicable. Because the number of neighbors and the local topology differ from node to node, it is difficult to scan the graph with a fixed-size filter to extract features.

To solve the problem of convolution on graphs, this paper introduces existing graph convolution models and derives three different graph convolution formulas. At present, few studies have considered graph convolution for learning prediction. In this paper, student learning prediction is treated as a node classification problem on a graph with three classes: pass, fail, and dropout. Each student is a node on the graph. The student's personal information and learning behavior data are the node attributes, and an edge represents a kind of social relationship between students. The proposed graph convolution model focuses not only on improving the accuracy of node classification but also on practical application after prediction. Once high-risk students have been predicted, teachers can locate them precisely through the user relationship graph, which helps to find weaknesses in teaching quickly. Implementing this process involves two main research difficulties. The first is how to define user relationships correctly and compute the adjacency matrix. The second is how to organically combine user features with the features of the user relationship graph to achieve better prediction results.

To address these two difficulties, this paper applies the proposed method to the Open University Learning Analytics Dataset (OULAD), which contains 22 courses and millions of records generated by more than 32,000 students. The author's main contributions and innovations on this data set are threefold.

The first is a method for converting the discrete data and time series in the data set into vectors, together with visual analysis and filtering of user characteristics. In this step, the millions of time-series records of student behavior in the VLE are divided into blocks by time period. Within each time period, twenty different types of interaction behavior are counted for each user, such as the sum of clicks in that period. Based on this division into time periods, one-hot encoding is used to transform the discrete data into a normalized feature matrix suitable for the model.

The second is to introduce the basic concepts of graphs and to use a decision-tree-based data mining algorithm to define user relationships. The goal of using a decision tree here is not prediction but automatically obtaining classification rules for the nodes. Nodes that fall under the same classification rule are defined as the same class, and nodes within a class are connected by random connection. Connected nodes share some kind of social relationship (such as having the same degree, an equivalent length of study, ...). Once the node relationships are defined, the user adjacency matrix is computed to represent the structural information of the graph.

The third is to transform the prediction of student completion into a node classification problem on the graph and to propose a prediction model based on a graph convolution algorithm. Compared with traditional deep learning models, graph convolution models have different input requirements. The input consists of two parts: the data feature matrix and the adjacency matrix representing the user relationship graph. In addition, the proposed graph convolution algorithm uses Chebyshev polynomials to decompose matrices, converting complex eigenvalue calculations into matrix multiplications, which greatly reduces the computational complexity.

To test the performance of the proposed model and verify the effectiveness of the proposed method, five experiments were designed, and tools from scikit-learn were used to evaluate model performance. Experiment 1 compares the prediction performance of the proposed model with that of similar models. Compared with four existing non-graph-convolution prediction models, the graph convolution model improves significantly in accuracy, recall, and F1-score. Compared with graph convolution models of the same type, the proposed model is more accurate and has a clear advantage in time cost. Experiment 2 compares the influence of different user adjacency matrices on graph convolutional neural network models. The user adjacency matrix proposed in the paper gives the best prediction results in all graph convolution models, which shows that the proposed decision-tree-based data mining method defines appropriate user relationships. Experiments 3 and 4 analyze the parameters that affect the experimental results, and the model parameters were chosen to balance time overhead against prediction performance. Experiment 5 evaluates the results of the three-class node classification problem class by class. The proposed model achieves a prediction accuracy and recall of 98% for dropout students, and its prediction accuracy for the pass and fail classes exceeds 80%.

These five experiments verify the feasibility and effectiveness of the author's contributions. The proposed graph convolutional neural network model draws on graph theory to make up for the shortcomings of existing models and accurately predicts high-risk students on a distance education platform. This application of deep learning on graphs is of considerable significance for the development of distance education and the improvement of teaching quality.
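The feature-construction step described in the abstract (dividing VLE records into time periods, summing clicks per interaction type, and one-hot encoding discrete fields) might look roughly like the following NumPy sketch. The abstract does not give the period length, the column layout, or the normalization used, so `click_features`, `one_hot`, and all of their parameters are hypothetical names chosen for illustration.

```python
import numpy as np

def click_features(student_idx, day, activity_idx, clicks,
                   n_students, n_activities, period_len, n_periods):
    """Sum clicks per (student, time period, activity type), one row per student.

    Assumption: days are integer offsets from the course start; days outside
    [0, period_len * n_periods) are clipped into the first or last period.
    """
    period = np.clip(np.asarray(day) // period_len, 0, n_periods - 1)
    counts = np.zeros((n_students, n_periods, n_activities))
    # np.add.at accumulates correctly even when the same index triple repeats.
    np.add.at(counts,
              (np.asarray(student_idx), period, np.asarray(activity_idx)),
              clicks)
    # Flatten the (period, activity) grid into one feature vector per student.
    return counts.reshape(n_students, n_periods * n_activities)

def one_hot(codes, n_categories):
    """One-hot encode integer category codes (e.g. an encoded 'region' field)."""
    return np.eye(n_categories)[np.asarray(codes)]
```

The two outputs can then be concatenated column-wise (for example with `np.hstack`) to form a per-student feature matrix; any scaling applied afterwards is a separate design choice that the abstract does not specify.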
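The decision-tree-based definition of user relationships can be illustrated with a small sketch: fit a tree, treat every leaf as one classification rule, and randomly connect students that land in the same leaf. This is only one plausible reading of the abstract. The tree depth, the number of random edges per node, and the function name `leaf_adjacency` are assumptions, not details taken from the thesis.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def leaf_adjacency(features, labels, max_depth=3, edges_per_node=2, seed=0):
    """Build a symmetric 0/1 adjacency matrix from decision-tree leaves.

    Assumptions (hypothetical, not from the thesis): each leaf is one
    "classification rule", and every node is linked to up to
    `edges_per_node` randomly chosen nodes in the same leaf.
    """
    rng = np.random.default_rng(seed)
    tree = DecisionTreeClassifier(max_depth=max_depth, random_state=seed)
    tree.fit(features, labels)
    leaves = tree.apply(features)  # index of the leaf each sample falls into
    n = features.shape[0]
    adj = np.zeros((n, n), dtype=int)
    for leaf in np.unique(leaves):
        members = np.flatnonzero(leaves == leaf)
        if members.size < 2:
            continue  # a single-member class has nobody to connect to
        for i in members:
            others = members[members != i]
            k = min(edges_per_node, others.size)
            for j in rng.choice(others, size=k, replace=False):
                adj[i, j] = adj[j, i] = 1  # undirected edge
    return adj, leaves
```

The tree is used here only to partition the nodes, in line with the abstract's remark that its goal is to obtain classification rules rather than to predict.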
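To make the Chebyshev-polynomial idea concrete, here is a minimal NumPy sketch of one Chebyshev graph convolution layer in the commonly used formulation: the filter is a polynomial in a rescaled graph Laplacian, evaluated with the recurrence T_k = 2 L~ T_{k-1} - T_{k-2}, so no eigendecomposition is needed. The abstract does not state which of its three derived formulas the final model uses, so this should be read as a generic illustration. The approximation lambda_max ≈ 2 and the names `scaled_laplacian` and `cheb_conv` are assumptions.

```python
import numpy as np

def scaled_laplacian(adj):
    """Return L~ = L - I with L = I - D^{-1/2} A D^{-1/2}.

    Assumption: the largest eigenvalue of L is approximated by 2, so the
    usual rescaling 2L / lambda_max - I simplifies to L - I.
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5  # isolated nodes keep 0
    norm_adj = inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    identity = np.eye(adj.shape[0])
    return (identity - norm_adj) - identity

def cheb_conv(x, adj, weights):
    """Apply sum_k T_k(L~) X W_k using only matrix multiplications.

    x: (N, F_in) feature matrix; adj: (N, N) adjacency matrix;
    weights: list of K+1 arrays, each of shape (F_in, F_out).
    """
    l_tilde = scaled_laplacian(adj)
    t_prev, t_curr = x, l_tilde @ x          # T_0 X and T_1 X
    out = t_prev @ weights[0]
    if len(weights) > 1:
        out = out + t_curr @ weights[1]
    for w in weights[2:]:
        # Chebyshev recurrence: T_k X = 2 L~ (T_{k-1} X) - T_{k-2} X
        t_prev, t_curr = t_curr, 2.0 * (l_tilde @ t_curr) - t_prev
        out = out + t_curr @ w
    return out
```

Each additional polynomial order costs one more multiplication by L~, which is cheap when the adjacency matrix is sparse; a full model would stack such layers with nonlinearities and a softmax over the pass, fail, and dropout classes.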
Keywords/Search Tags: Graph convolutional neural network, Distance education, Learning prediction, Data mining, Adjacency matrix