
Design and Implementation of a Distributed Stream Computing Node Management System

Posted on: 2014-01-12
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Song
GTID: 2248330398472378
Subject: Computer Science and Technology
Abstract/Summary:
The rapid growth of information technology has driven a rapid increase in data traffic. This traffic includes not only structured data in databases but also unstructured data from email, sensors, online video, and other sources. Traditional computing models cannot meet the requirements of processing massive real-time data traffic. How to process massive unstructured real-time data has therefore become a research focus, and this has accelerated the development of distributed stream computing frameworks.

IBM's InfoSphere Streams and Yahoo!'s S4 (Distributed Stream Computing Platform) [1] are typical stream computing frameworks. InfoSphere Streams is a mature stream computing product. However, it is neither open source nor free software, so it cannot be freely studied and improved. S4 is a general-purpose open-source distributed stream computing framework from Yahoo!. It is an Apache subproject and is developing rapidly. S4 is a general-purpose, scalable, partially fault-tolerant distributed stream computing platform with plugin support, on which programmers can conveniently develop stream applications. Despite these advantages, S4 has shortcomings in node management: administrators cannot add or remove S4 nodes dynamically, and they cannot manage and monitor S4 nodes through a GUI.

This paper improves the node management of S4. First, it introduces the state of research and development of stream computing and S4, focusing on the drawbacks of S4's node management. Second, it sets out the overall requirements for node management in a distributed stream computing framework, namely adding and removing nodes dynamically and managing nodes through a web interface. Third, it designs and implements the scheme and architecture. To meet the requirement of adding and removing nodes dynamically, it proposes and implements a two-procedure mapping node management scheme. To meet the requirement of web-based node management, it proposes and implements a layered architecture consisting of a collection layer, an analysis and integration layer, a presentation layer, and a user layer. Fourth, it carries out functional and performance tests on the node management of the resulting distributed stream computing framework and confirms its node management advantages. Finally, it gives a brief summary and points out directions for further research.
Keywords/Search Tags: stream computing, distribution, massive data, node management