
Research Of Testing Method Of HBase Driven By Its Architecture And Business

Posted on: 2014-02-19
Degree: Master
Type: Thesis
Country: China
Candidate: S D Huang
GTID: 2248330398455179
Subject: Computer software and theory
Abstract/Summary:
The development and wide application of Internet technology have produced massive amounts of data, and relational databases can no longer satisfy new business needs. To cope with the storage and management of this data, companies and organizations have launched NoSQL storage systems. HBase is a well-known one that has been widely used in various industries. In this context, this thesis studies performance testing of HBase.

First, the thesis introduces the rise and development of NoSQL databases and reviews the general process and current state of database performance testing. It finds that traditional testing methods do not consider how the system's architecture affects performance. Yet the architectures of current NoSQL databases have new features, such as dynamic scheduling, multi-replica data storage, and configurability, that have an important effect on performance, as does user behavior in a specific business. Based on this, the thesis proposes a new general performance testing model for NoSQL database systems that comprises performance testing driven by architecture and performance testing driven by business. The thesis then explores testing methods for HBase guided by this model.

For architecture-driven performance testing, we study HBase's implementation of the column-oriented data model, the data write and read paths, the partitioning of data tables, and the data replication mechanism, and we analyze performance elements such as a single field versus multiple fields in the same column family, a single column family versus multiple column families, the region size, and the replication factor.
Corresponding test programs are then designed around these performance elements to measure their impact on HBase's performance.

For business-driven performance testing, we mainly explore performance testing of HBase in the search engine business. Because user visits to a search engine vary periodically, the thesis uses a potential cycle model based on time series to model user visits and applies the model to control the number of concurrent threads during HBase performance testing. Abnormal data is removed using wavelet analysis. To reflect the real performance of HBase in the search engine business, a test program is designed around the characteristics of user behavior; it is implemented and verified on top of the YCSB test suite.

Finally, the thesis summarizes the work and gives an outlook on future work.
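
The idea of driving the number of concurrent test threads from a periodic model of user visits can be illustrated with a minimal sketch. It is not the thesis's actual model: it assumes a single cycle of known length (for example 24 hourly samples per day), fits a mean plus one sinusoid by least squares, and scales the prediction linearly to a thread count. The function names, the linear scaling, and the `max_threads` cap are all hypothetical, and the wavelet-based outlier removal is not covered.

```python
import math


def fit_single_cycle(visits, period):
    """Least-squares fit of mean + one sinusoid with a known integer period.

    Assumption: the series covers a whole number of periods, so the cosine
    and sine terms are orthogonal and the fit reduces to simple projections.
    """
    n = len(visits)
    if n == 0 or n % period != 0:
        raise ValueError("series length must be a positive multiple of period")
    mean = sum(visits) / n
    w = 2 * math.pi / period
    a = 2 / n * sum((v - mean) * math.cos(w * t) for t, v in enumerate(visits))
    b = 2 / n * sum((v - mean) * math.sin(w * t) for t, v in enumerate(visits))
    return mean, a, b


def predict_visits(model, period, t):
    """Evaluate the fitted cycle at time index t."""
    mean, a, b = model
    w = 2 * math.pi / period
    return mean + a * math.cos(w * t) + b * math.sin(w * t)


def threads_for(predicted, peak, max_threads):
    """Map predicted visits to a thread count in [1, max_threads] (hypothetical linear scaling)."""
    return max(1, min(max_threads, round(max_threads * predicted / peak)))


# Synthetic two-day series with a daily cycle: 100 + 50*cos(2*pi*t/24).
visits = [100 + 50 * math.cos(2 * math.pi * t / 24) for t in range(48)]
model = fit_single_cycle(visits, period=24)
peak = max(predict_visits(model, 24, t) for t in range(24))
schedule = [threads_for(predict_visits(model, 24, t), peak, 32) for t in range(24)]
```

On this synthetic series the fit recovers a mean of about 100 and a cosine amplitude of about 50, and `schedule` holds one thread count per hour that a load driver could apply when starting each test interval.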
Keywords/Search Tags: Testing Model, Data Partition, Time Series