
Research And Implementation Of Big Data Analysis Platform Based On P2P Scalable Architecture

Posted on: 2013-06-22
Degree: Master
Type: Thesis
Country: China
Candidate: A Zhuo
Full Text: PDF
GTID: 2248330392458482
Subject: Software engineering
Abstract/Summary:
With the rapid development of Internet applications and the socialization of information, data volumes are growing explosively, and traditional relational databases run into performance and scalability bottlenecks when analyzing data at this scale. New and effective data analysis platforms are therefore needed. Big data technology is not yet mature and has not converged on a standard, but industry has widely adopted Hadoop as its big data processing platform, which has in turn driven academic research on Hadoop-related technologies. Alongside Hadoop, NoSQL technologies have also developed rapidly, producing several well-known open source projects such as HBase and Cassandra.

First, the thesis surveys existing big data processing and analysis platforms. Second, based on a comprehensive comparative analysis of those platforms, it introduces the architecture of Tsinghua Kloud, the core platform of LaUDMS (LaSQL Unstructured Data Management System). Third, it presents the research and implementation of Kloud's big data analysis platform and then describes the main work of the thesis. The components of Hive are studied in depth, and Hive is integrated with Cassandra. To integrate the Hive components fully into Cassandra, an object-oriented data model is defined on Cassandra's schema-less tables to store the Hive metadata. To make conditional queries on Cassandra more efficient, a secondary index on schema-less tables is designed, implemented, and integrated into Hive's distributed index plug-in framework to optimize the performance of Hive analysis. Finally, the thesis analyzes user access logs on the big data analysis platform described here and achieves good performance and availability.
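To make the idea of a secondary index on a schema-less table concrete, the following is a minimal in-memory sketch in Python. It is not the thesis's implementation and uses neither the Cassandra nor the Hive API; the class `SchemaLessTable`, its methods, and the sample access-log rows are all hypothetical names chosen for illustration. It only shows why a conditional query can be answered from an index that maps column values to row keys instead of by scanning every row.

```python
from collections import defaultdict


class SchemaLessTable:
    """Toy stand-in for a schema-less table (hypothetical, illustration only).

    Each row is a free-form dict of columns; rows need not share columns.
    """

    def __init__(self, indexed_columns):
        self.rows = {}  # row key -> {column name: value}
        # Secondary index: column name -> value -> set of row keys
        self.index = {col: defaultdict(set) for col in indexed_columns}

    def put(self, row_key, columns):
        """Insert or update columns of a row, keeping the index in sync."""
        row = self.rows.setdefault(row_key, {})
        for col, value in columns.items():
            if col in self.index:
                if col in row:
                    # Remove the stale entry for the previous value
                    self.index[col][row[col]].discard(row_key)
                self.index[col][value].add(row_key)
            row[col] = value

    def scan_where(self, col, value):
        """Conditional query without an index: inspect every row."""
        return sorted(k for k, r in self.rows.items() if r.get(col) == value)

    def lookup_where(self, col, value):
        """Conditional query with the secondary index: one dictionary lookup."""
        return sorted(self.index[col].get(value, ()))


# Hypothetical access-log rows; note that the rows have different columns.
log = SchemaLessTable(indexed_columns=["status"])
log.put("req-1", {"url": "/home", "status": 200})
log.put("req-2", {"url": "/missing", "status": 404, "referrer": "/home"})
log.put("req-3", {"url": "/home", "status": 200})

print(log.lookup_where("status", 200))
```

Both `scan_where` and `lookup_where` return the same row keys; the difference is that the indexed lookup does not touch rows that cannot match. A distributed index of the kind the abstract describes would additionally have to partition and maintain these entries across nodes, which this sketch does not attempt.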
Keywords/Search Tags: big data, MapReduce, schema-less table, secondary index, data model