College Big Data Analysis And Mining Based On MapReduce

Posted on: 2017-01-11
Degree: Master
Type: Thesis
Country: China
Candidate: Y N Si
Full Text: PDF
GTID: 2348330488467341
Subject: Computer software and theory
Abstract/Summary:
The construction of college networks has provided advanced information technology and a composite teaching environment for teaching and research. The digital campus built on these networks has been steadily improved, and colleges and universities are gradually becoming fully digitized in their environment, resources, and applications. Integrating the resulting data allows campus data to be managed in a scientific and standardized way. Building on the digital campus, cloud computing, the Internet of Things (IoT), and big data technology are being used to construct a smart campus that makes students' study and daily life more intelligent. In this process, the data generated by the various application systems is also expanding rapidly, and a big data environment has begun to take shape in colleges and universities. Campus data contains a wealth of information from which much knowledge can be extracted, but its scale is very large and keeps growing. New data storage and analysis tools are therefore essential to realize the full benefit of information technology.

In this thesis, we first use the mainstream big data processing platform Hadoop 2.0, with its storage techniques and methods, to analyze campus big data. We chose this version because it overcomes several defects of earlier versions, for example the poor scalability of the file system, low resource utilization, and support for only a single computing framework. We then propose a threshold for the minimum support in the Apriori algorithm to address its high cost and low efficiency, port the improved algorithm to the MapReduce computation model, and verify its effectiveness on the student achievement module. Next, to overcome excessive fragmentation and overfitting in C4.5, we propose a cross-chunked C4.5 algorithm based on MapReduce and verify its feasibility on the student scholarship classification module. Finally, we analyze and verify the soundness of the proposed algorithms using several evaluation criteria. Experimental results show that the proposed methods are effective, offer a new research approach for big data mining in colleges, and provide a technical reference for building the smart campus.
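The abstract does not reproduce the thesis's implementation or state how its minimum-support threshold is chosen. As a rough illustration of the general idea only, the sketch below runs standard Apriori as repeated map and reduce passes in plain Python, with no Hadoop involved. The function names (`map_phase`, `reduce_phase`, `next_candidates`, `apriori`), the fixed `min_support` fraction, and the grade labels in the usage note are all hypothetical and are not taken from the thesis.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(transactions, candidates):
    """Mapper: emit (candidate, 1) for every candidate contained in a transaction."""
    for transaction in transactions:
        items = set(transaction)
        for candidate in candidates:
            if candidate <= items:
                yield candidate, 1


def reduce_phase(pairs, min_count):
    """Reducer: sum the counts per candidate and keep those meeting the threshold."""
    counts = defaultdict(int)
    for candidate, one in pairs:
        counts[candidate] += one
    return {c: n for c, n in counts.items() if n >= min_count}


def next_candidates(frequent, k):
    """Build k-itemsets from items that are still frequent, keeping only
    those whose every (k-1)-subset is itself frequent (Apriori pruning)."""
    items = sorted({item for itemset in frequent for item in itemset})
    result = set()
    for combo in combinations(items, k):
        if all(frozenset(sub) in frequent for sub in combinations(combo, k - 1)):
            result.add(frozenset(combo))
    return result


def apriori(transactions, min_support):
    """Return {itemset: count} for all itemsets whose support >= min_support.

    min_support is a fraction of the number of transactions (an assumption
    of this sketch; the thesis's threshold rule is not given in the abstract).
    """
    min_count = min_support * len(transactions)
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent_all = {}
    k = 1
    while candidates:
        # One map/reduce round per itemset size k.
        frequent = reduce_phase(map_phase(transactions, candidates), min_count)
        frequent_all.update(frequent)
        k += 1
        candidates = next_candidates(frequent, k)
    return frequent_all
```

For example, calling `apriori([["math_A", "physics_A", "cs_A"], ["math_A", "physics_A"], ["math_A", "cs_A"], ["physics_A", "cs_B"]], 0.5)` on four made-up student records keeps the pair `{"math_A", "physics_A"}` and drops `{"physics_A", "cs_A"}`, which appears in only one record. In a real Hadoop job each round would be a separate MapReduce job with the candidate set distributed to the mappers.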
Keywords/Search Tags:Big Data, Hadoop2.0, Apriori, MapReduce, C4.5