Big Data Visualization Analysis Web System Based On Hadoop And Django

Posted on: 2017-02-20  Degree: Master  Type: Thesis
Country: China  Candidate: Y Wu
GTID: 2308330503978318  Subject: Control Engineering
Abstract/Summary:
The development of the Internet has led to exponential growth in data, and humanity has entered the era of big data. Finding valuable information in massive data sets has become very important. One of the main tasks in the data visualization field is to turn abstract and complex data into easy-to-understand information. Data visualization is now a hot topic in the big data field and matters for both research and applications. In this thesis, the open source Django Web framework was used to build a data visualization system on top of the Hadoop platform in an actual project. A system solution was proposed and its modules were designed. The system was then optimized in several respects, including the front end, the Django application, and the PostgreSQL database, to solve problems found during alpha testing. The Web system's data is obtained from the Hadoop computing platform, which schedules thousands of data processing tasks, so the scheduling algorithm directly affects the efficiency of data processing.
The scheduling algorithms were therefore studied. First, the principles, advantages, and disadvantages of three common job scheduling algorithms were analyzed in depth: the FIFO scheduling algorithm, the fair scheduling algorithm, and the capacity scheduling algorithm. Then the delay scheduling algorithm, which follows the idea of moving computation to the data, was analyzed, and a new scheduling algorithm based on it that takes node load into account was proposed. The simulation results showed that the improved delay scheduling algorithm increases the efficiency of job scheduling. Finally, an analysis of the system architecture showed that the responsibilities of the front end and the back end were not clearly separated. The front-end and back-end code interfere with each other, and this coupling will grow more complex as the system is extended. Following Taobao's solution, Node.js was used as middleware to decouple the front end from the back end, which improves development efficiency and reduces system maintenance costs.
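The abstract does not give the details of the improved algorithm, so the following is only a minimal Python sketch of what a load-aware variant of delay scheduling might look like. The names Node, Job, pick_job, max_skips and load_limit, and the rule that an overloaded node receives no new task, are assumptions made for illustration and are not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    load: float  # hypothetical load metric in the range 0.0-1.0


@dataclass
class Job:
    name: str
    data_nodes: set  # names of the nodes that hold this job's input blocks
    skipped: int = 0  # how many times the job has been passed over


def pick_job(jobs, node, max_skips=3, load_limit=0.8):
    """Choose a job for a node that offers a free slot, or return None."""
    # Assumed load-aware step: a heavily loaded node gets no new task.
    if node.load > load_limit:
        return None
    for job in jobs:  # jobs are visited in queue order
        if node.name in job.data_nodes:
            job.skipped = 0
            return job  # data-local launch
        if job.skipped >= max_skips:
            return job  # waited long enough, accept a non-local launch
        job.skipped += 1  # delay this job and try the next one
    return None
```

In this sketch a job gives up locality only after it has been skipped max_skips times, which is the core of delay scheduling, while the load check keeps busy nodes from receiving further tasks even when they hold the data.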
Keywords/Search Tags: Hadoop, Django, Big data, visualization, job scheduling algorithm