
Implementation Of The Massive Telecom Data Distributed Storage And Query System Based On Hadoop

Posted on: 2017-02-20
Degree: Master
Type: Thesis
Country: China
Candidate: Q An
Full Text: PDF
GTID: 2348330518993363
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the rapid rise of the mobile Internet, feature-rich smartphones have spread quickly across China, greatly changing how users connect to the Internet and how they behave online. At the same time, the data generated by users of telecommunications services is growing exponentially, and many industries are quietly undergoing profound change. Because traditional relational databases can neither store data at this scale nor search it efficiently, platforms that can store and process big data efficiently are needed. Many cloud computing platforms have emerged as a result, and among distributed technologies Hadoop is the most widely adopted.

Hadoop, an Apache project, is a software framework for the distributed processing of large amounts of data in a reliable, efficient and scalable way. It has two core components: HDFS, a distributed file system, and MapReduce, a parallel computing framework. HDFS is the foundation of data storage management in distributed computing; it is accessed through streaming data access and handles very large files easily. HBase, a distributed column-oriented database, is suited to storing large volumes of data and answers real-time queries by a single row key. It can also use MapReduce to bulk-load data or to run a distributed full scan over all of the data. A growing number of Internet operators are investing substantial manpower and resources in developing Hadoop and HBase, and have achieved notable results.

Building on Hadoop's HDFS and on HBase's query capability, this thesis designs and implements an application that manages data storage and searches massive amounts of data quickly. The thesis has three parts. The first studies the technologies related to Hadoop, mainly HDFS, the old and new versions of the MapReduce framework, and the HBase distributed database. The second designs and implements a web-based HDFS file browser that makes it convenient for users to manage HDFS files. The last implements multi-condition queries on HBase by combining the Solr full-text search server with HBase.
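
The multi-condition query idea in the last part can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes the common pattern in which a search index resolves several field conditions to a set of row keys, and the row store is then read by row key only. The classes `RowStore` and `FieldIndex`, the function `multi_condition_query`, and the sample call records are all hypothetical in-memory stand-ins for HBase and Solr; no real HBase or Solr API is used.

```python
from collections import defaultdict


class RowStore:
    """Hypothetical stand-in for an HBase table: rows are read by row key only."""

    def __init__(self):
        self._rows = {}

    def put(self, rowkey, columns):
        self._rows[rowkey] = dict(columns)

    def get(self, rowkey):
        return self._rows.get(rowkey)


class FieldIndex:
    """Hypothetical stand-in for a Solr index: (field, value) -> set of row keys."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, rowkey, columns):
        for field, value in columns.items():
            self._postings[(field, value)].add(rowkey)

    def search(self, conditions):
        # Intersect the posting sets of every condition (logical AND).
        sets = [self._postings.get((f, v), set()) for f, v in conditions.items()]
        if not sets:
            return []
        return sorted(set.intersection(*sets))


def multi_condition_query(index, store, conditions):
    """Step 1: the index yields matching row keys. Step 2: fetch rows by key."""
    return [(rowkey, store.get(rowkey)) for rowkey in index.search(conditions)]


# Made-up sample records, keyed by "<phone>_<date>" purely for illustration.
records = {
    "13800000001_20170101": {"city": "Beijing", "type": "voice", "network": "4G"},
    "13800000002_20170101": {"city": "Beijing", "type": "sms", "network": "4G"},
    "13800000003_20170102": {"city": "Shanghai", "type": "voice", "network": "4G"},
}

store, index = RowStore(), FieldIndex()
for rowkey, columns in records.items():
    store.put(rowkey, columns)   # the full row goes to the row store
    index.add(rowkey, columns)   # the searchable fields go to the index

hits = multi_condition_query(index, store, {"city": "Beijing", "network": "4G"})
hit_keys = [rowkey for rowkey, _ in hits]
```

In this sketch the index stores only row keys, so the row store stays the single copy of the data and every final read is a single-row-key lookup, which is the access pattern the abstract describes HBase as suited to.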
Keywords/Search Tags:Multi-condition Query, Hadoop, HDFS, HBase, Solr