Research And Implement Of The Data Subscription System In Space Big Data Platform

Posted on: 2016-12-10 | Degree: Master | Type: Thesis
Country: China | Candidate: F Shao | Full Text: PDF
GTID: 2348330488974400 | Subject: Computer software and theory
Abstract/Summary:
With the rapid development of information technology, new techniques and innovations continue to improve the quality of daily life. Every field now produces vast amounts of data, and Internet and Big Data technologies have been adopted to handle the resulting data-processing problems. The aerospace industry, a national priority area, has become more intelligent in recent years, and the volume of its data, of many different types, is rising dramatically. Traditional storage and computation models can no longer meet the storage, analysis, and visualization needs of massive space data. Inspired by Internet and Big Data applications, the industry has adopted these technologies for data analysis and processing. Aerospace data comes in many types, each processed differently, so Big Data platforms have been built to provide efficient, secure data storage and management together with rapid data analysis. However, the platforms of the various aerospace research agencies are independent of one another, so data cannot be shared efficiently, and sharing Big Data securely between platforms remains a problem. In this thesis, a data subscription system is used to solve the problem of data sharing between aerospace Big Data platforms and to strengthen aerospace scientists' ability to collaborate on data analysis and processing. While researching the data subscription system, we studied the business requirements of data sharing and the characteristics of aerospace data in depth, and we solved the problem of unified authentication and authorization across different aerospace Big Data platforms, so that the platforms form a unified network. In addition, a task scheduling system based on Cron Expressions handles the periodic scheduling of large numbers of meta-information synchronization, data file synchronization, and identity synchronization tasks.
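As an illustration of how a Cron Expression can drive periodic synchronization tasks, the following is a minimal sketch of a cron matcher. It assumes the classic five-field Unix form (minute, hour, day of month, month, day of week, with 0 meaning Sunday); the abstract does not say which cron dialect the thesis uses. The names `expand_field` and `cron_matches` are hypothetical, and the sketch requires all five fields to match, which is simpler than the day-of-month/day-of-week rule of some cron implementations.

```python
from datetime import datetime

# Allowed (min, max) for minute, hour, day of month, month, day of week.
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def expand_field(field, lo, hi):
    """Expand one cron field (e.g. '*/15', '1-5', '0,30') into a set of ints."""
    allowed = set()
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/")
            step = int(step_text)
        if part == "*":
            start, end = lo, hi
        elif "-" in part:
            first, last = part.split("-")
            start, end = int(first), int(last)
        else:
            start = end = int(part)
        if step < 1 or not (lo <= start <= end <= hi):
            raise ValueError(f"invalid cron field: {field!r}")
        allowed.update(range(start, end + 1, step))
    return allowed


def cron_matches(expr, when):
    """Return True if the datetime `when` satisfies the five-field expression."""
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError("expected five space-separated fields")
    # Python numbers Monday as 0; cron numbers Sunday as 0.
    cron_weekday = (when.weekday() + 1) % 7
    values = [when.minute, when.hour, when.day, when.month, cron_weekday]
    return all(
        value in expand_field(field, lo, hi)
        for field, value, (lo, hi) in zip(fields, values, FIELD_RANGES)
    )
```

A scheduler built on such a matcher would wake once per minute and run every registered task, for example a meta-information sync, whose expression matches the current time: `cron_matches("*/15 2 * * *", datetime.now())` is true at 02:00, 02:15, 02:30, and 02:45 each day.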
By adapting the HDFS distributed file system and designing the corresponding data meta-information, we implemented a highly concurrent data synchronization center that provides reliable data transmission between aerospace Big Data platforms. The data sharing system also meets the requirements of security, high speed, and efficiency. During implementation, we proposed a solution to each problem the data subscription system faced. Using SSO (single sign-on) technology to separate authentication from user authorization, we implemented a unified authentication center, which solves the problem of mutual access between Big Data platforms while preserving the security and independence of each platform. Based on the characteristics of aerospace data, we designed a policy-based mode that separates real-time synchronization of meta-information from synchronization of file data, to ensure a good client experience. Through the subscription system, users can send requests directly to the back-end data of the subscribed platform. The data synchronization center is built on TCP sockets and uses a client/server (C/S) model to provide adaptive concurrency, encryption, and fault-tolerant data synchronization. For subscription data that is analyzed with MapReduce distributed computation, we synchronize the raw binary files directly and process them on the subscriber side, which reduces network traffic. File synchronization under a time-based repeat policy, real-time meta-information synchronization, and the platform's scheduled tasks share common characteristics, so we used the Cron Expression syntax for timed tasks to implement a unified task scheduling system, which also supports MapReduce task scheduling based on platform state.
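To make the idea of fault-tolerant transfer over a TCP socket concrete, here is a minimal sketch of a length-prefixed frame with an integrity check. The wire format (an 8-byte length followed by a SHA-256 digest) and the names `send_block`, `recv_exact`, and `recv_block` are assumptions for illustration only; the abstract does not describe the actual protocol of the thesis. Encryption is left out of the sketch; one common option would be to wrap the socket with TLS.

```python
import hashlib
import socket
import struct

# Hypothetical frame header: payload length (8 bytes, big-endian)
# followed by the SHA-256 digest of the payload (32 bytes).
HEADER = struct.Struct("!Q32s")


def send_block(sock, payload):
    """Send one framed block: header (length + digest), then the payload."""
    digest = hashlib.sha256(payload).digest()
    sock.sendall(HEADER.pack(len(payload), digest) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes; TCP may deliver a frame in several pieces."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before the full frame arrived")
        buf.extend(chunk)
    return bytes(buf)


def recv_block(sock):
    """Receive one framed block and verify its digest before returning it."""
    length, digest = HEADER.unpack(recv_exact(sock, HEADER.size))
    payload = recv_exact(sock, length)
    if hashlib.sha256(payload).digest() != digest:
        raise ValueError("checksum mismatch; the block should be re-requested")
    return payload
```

In a client/server arrangement, the subscriber side would call `recv_block` in a loop and ask the publisher to resend any block that raises `ValueError`, so a corrupted or truncated transfer does not silently produce a damaged file.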
The system consists of a client and the Big Data platforms, with RESTful services publishing the control APIs of the subscription system. We verified the soundness of the scheduling center's architecture design through efficiency tests. The task scheduling center can schedule large numbers of concurrent tasks in a balanced manner. Meta-information and data files are synchronized in accordance with policy. The data subscription system solves the problem of data sharing between aerospace Big Data platforms. The current data subscription system has been deployed successfully as an online service in multiple aerospace research institutions.
Keywords/Search Tags: SSO, Task Scheduling, Hadoop, Redis, TCP