
The Design And Implementation Of The Offline Bigdata Middleware System Based On Thrift Frame

Posted on: 2017-05-01
Degree: Master
Type: Thesis
Country: China
Candidate: S C Wang
Full Text: PDF
GTID: 2308330503469559
Subject: Software engineering
Abstract/Summary:
In recent years, big data has become a household term. Baidu, China's leading search engine company, holds massive data resources. Drawing on Baidu's massive user data and its in-depth vertical search, thousands of kinds of tag data can be mined to give a multidimensional view of users across space and time.
This project uses PHP, which is widely used at Baidu, as the development language, together with the Thrift framework, which supports cross-language service calls. Building on this capability and on Baidu's offline business data, the author developed a data system that integrates data query, data computation, data storage, resource scheduling, statistical reporting, and other functions.
Starting from the current state and trends of big data platforms, the thesis analyzes the need for establishing such a platform, summarizes best practices for platform development suited to the current business, and discusses the technical feasibility of using the Thrift framework. It also introduces the tools commonly used in data development: Hadoop, Hive, HDFS, and Query Engine. To meet business needs, the system is designed with dual permission management for its users, and its development and layout follow the business requirements.
The system is developed with Baidu ODP, a development environment framework built specifically for PHP. The front end uses Smarty, with JavaScript on some pages. The system connects to the metadata platform and the user profile system to give users a convenient, full-service experience.
Based on users' functional requirements and workflow, the design is divided into four modules: a query module, a calculation module, a data statistics module, and a management module. The query module provides data queries and task queries. The calculation module is the core of the system and provides task creation, data computation, storage, and more complex functions such as machine scheduling. The data statistics module lets users create and display data reports. The management module handles user permissions, alarm monitoring, and related functions.
Finally, the thesis implements each module and designs test cases. The test results show that the offline data middleware meets users' requirements and the requirements for online use, and it is now in use within the company.
Keywords/Search Tags:Big data, Thrift framework, Hadoop, Data statistics, Data platform