Research And Construction On Performance Management System Based On Hadoop

Posted on:2016-08-24

Degree:Master

Type:Thesis

Country:China

Candidate:M J Tian

Full Text:PDF

GTID:2348330488474250

Subject:Communication and Information System

Abstract/Summary:

Test results are an important basis for assessing students' mastery of knowledge and for evaluating teaching quality, and they contain a great deal of information that can be mined. The management and analysis of performance data has therefore always been an important part of the examination process. Traditionally, each school keeps its own students' records. The shortcoming of this approach is that a school can see only its own situation and can carry out only simple analysis and query operations. It would be of great benefit to establish a management system based on nationwide student academic performance data.

The performance management system based on Hadoop is such a platform. It is aimed mainly at primary and middle school students facing entrance examination pressure, and it provides two kinds of query. First, through the traditional score ranking and the average score, students can learn their level and position in the country and make a more informed choice of school. Second, cluster analysis of the examination results can reveal associations between subjects, so that students can clearly understand their own strengths and weaknesses and improve accordingly. The relationships between courses found through cluster analysis can also be used to arrange the curriculum reasonably. In addition, to protect personal privacy, the system shows only the performance data and hides the related student information.

In this thesis, we first analyze the characteristics of the Hadoop platform and the HBase database, examine the underlying principles and operating mechanisms of HDFS and MapReduce, and contrast the HBase distributed database with traditional databases. The system is then designed and implemented according to the requirements of the performance management platform.
The thesis then describes the design of the system framework. Viewed vertically, the system architecture is divided from top to bottom into a user layer, an analysis layer, a computing layer and a storage layer. Viewed horizontally, the functional modules are user registration and login, score ranking query, average score query, and cluster analysis. The function of each layer is introduced in detail, and the data tables for each functional module are designed and implemented in code.

In the research on clustering, this thesis first surveys clustering algorithms, then focuses on the K-Means algorithm and the improved PSO-K-Means algorithm, and implements both on Hadoop. The clustering results produced by the two algorithms are compared. The thesis uses K-Means clustering for in-depth analysis of the performance data: the selection of the initial points of K-Means is improved, and the results are then analyzed with the improved algorithm. Finally, the thesis describes the construction of the Hadoop experimental environment, implements the function of each module to obtain the final results, and analyzes Hadoop's processing of large-scale data.
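
As an illustration of how K-Means fits the map/reduce style mentioned above, the following is a minimal single-machine sketch in Python. It is not the thesis's implementation: the function names (`assign`, `recompute`, `kmeans`), the two-subject score tuples and the starting centroids are all hypothetical, and neither the improved initial-point selection nor the PSO-K-Means variant is shown, since the abstract does not describe them. The "map" step assigns each score vector to its nearest centroid, and the "reduce" step averages the vectors in each group.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def assign(point, centroids):
    """Map step: return (index of nearest centroid, point)."""
    nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
    return nearest, point


def recompute(points):
    """Reduce step: average all points assigned to one centroid."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def kmeans(points, centroids, max_iter=20):
    """Plain K-Means: repeat assign/recompute until the centroids stop moving."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        groups = defaultdict(list)
        for p in points:
            idx, pt = assign(p, centroids)
            groups[idx].append(pt)
        # A centroid with no assigned points is left where it is.
        new_centroids = [
            recompute(groups[i]) if i in groups else centroids[i]
            for i in range(len(centroids))
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


# Hypothetical (subject A, subject B) scores for four students.
scores = [(90, 85), (88, 92), (45, 50), (50, 40)]
print(kmeans(scores, [(100, 100), (0, 0)]))
```

On a Hadoop cluster each iteration would typically run as one MapReduce job, with the current centroids distributed to every mapper; the loop above only mirrors that structure on a single machine.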

Keywords/Search Tags:

Hadoop, HBase, performance management, K-Means, PSO-K-Means

Related items

1	Parallel Clustering Algorithm’s Study And Application Based On HBASE
2	Performance Improvement Of K-means Algorithm And Its Application In Movie Recommender System
3	K-Means Algorithm Design And Implementation Based On Hadoop And Mahout
4	A Research And Implementation With Improved K-Means Clustering Algorithm To Image Retrieval System Based On Hadoop Platform
5	Research On Machine Learning Clustering Algorithms In The Hadoop Development Environment
6	Research And Application Of Clustering In Telecom Customer Differentiated Reminder Based On Hadoop
7	The Research On Parallel Computing Technology In Precise Agricultural Climate Division
8	Research On Hot Topics Discovery In Microblog Based On Distributed K-means Algorithms
9	Research On K-Means Clustering Algorithm Based On Hadoop Cloud Computing Platform
10	Design And Development Of E-commerce System Based On Cloud Computing Technology