Design And Development Of HLS-Ⅱ Data Archiving And Retrieval System Based On HBase

Posted on: 2021-02-15  Degree: Master  Type: Thesis
Country: China  Candidate: S C Xin  Full Text: PDF
GTID: 2392330602999035  Subject: Nuclear Science and Technology
Abstract/Summary:
HLS-Ⅱ is a dedicated synchrotron radiation source whose characteristic spectrum lies in the vacuum ultraviolet and soft X-ray regions. A large amount of historical data is generated during its operation, and the retrieval speed of these data is critical for performance analysis and fault diagnosis. Currently, the historical data are stored in an Oracle relational database. As the volume of stored data grows, the storage and processing of massive data has become the bottleneck for query speed and data analysis performance. The existing HLS-Ⅱ data query system is an Oracle-based web application. When historical data are queried over a large time span, the large data volume leads to long response times and a poor user experience. After a survey of historical data archiving and retrieval systems at accelerator facilities in China and abroad, the HBase-based Data Archiving and Retrieval System (HDARS) was designed and developed to meet the storage needs of massive historical data and to improve the retrieval speed of historical data.

In terms of data storage, HDARS builds a Hadoop data storage platform with a capacity of hundreds of terabytes and uses the HBase distributed database to store historical data. Compared with Oracle storage, this improves the retrieval performance for massive data as well as the reliability and scalability of data storage.

In terms of query speed, HDARS improves performance in both data archiving and data retrieval. For data archiving, Archiver Appliance is used as the archiving software, and an HBase storage plug-in was developed to migrate data from Archiver Appliance to HBase. To address the insufficient query speed over large time spans, a data extraction algorithm was designed to extract characteristic data from the raw data at different time granularities. The raw data and the characteristic data are stored in the raw data table and the redundant data table in HBase, respectively. For data retrieval, when historical data are queried over a large time span, the retrieval logic calculates an appropriate time granularity from the time range of the query, then retrieves and returns the characteristic data of that granularity from the redundant table.

In addition, current popular web technology is used to develop the HDARS web application, which provides historical data visualization, system access authentication and authorization management, and Archiver Appliance management. In the design of HDARS, the problem of data retrieval speed over large time spans is solved at the cost of a small amount of redundant storage space. HDARS has been under test since January 2020; its performance has been stable, and it responds to query requests over any time range within 1 second, which fully meets users' performance requirements for HLS-Ⅱ historical data queries.
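The extraction and granularity-selection ideas described in the abstract can be illustrated with a minimal sketch. The abstract does not specify which characteristic values are extracted, which granularities are used, or how the granularity is chosen, so everything here is an assumption: the characteristic data are taken to be the per-bucket minimum, maximum and mean, and the names `GRANULARITIES`, `MAX_POINTS`, `extract_features` and `choose_granularity` are hypothetical. HBase access is omitted entirely.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical granularities in seconds; the abstract does not list the actual values.
GRANULARITIES = (60, 600, 3600, 86400)
# Assumed cap on the number of points returned by a single query.
MAX_POINTS = 2000


def extract_features(samples, granularity):
    """Reduce (timestamp, value) samples to one row per time bucket.

    Each row is (bucket_start, min, max, mean). Choosing min/max/mean as the
    "characteristic data" is an assumption made for illustration.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        # Align each integer timestamp to the start of its bucket.
        buckets[ts - ts % granularity].append(value)
    return [
        (start, min(vals), max(vals), mean(vals))
        for start, vals in sorted(buckets.items())
    ]


def choose_granularity(start, end, raw_period=1):
    """Pick the table to read for the query range [start, end).

    Returns None when the raw data are small enough to return directly,
    otherwise the finest granularity that keeps the result within MAX_POINTS.
    """
    span = end - start
    if span / raw_period <= MAX_POINTS:
        return None
    for granularity in GRANULARITIES:
        if span / granularity <= MAX_POINTS:
            return granularity
    # Fall back to the coarsest granularity for very long ranges.
    return GRANULARITIES[-1]
```

Under these assumptions, a query spanning one day would read the 60-second rows and a query spanning thirty days would read the 3600-second rows, so the number of returned points stays bounded regardless of the time range.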
Keywords/Search Tags: Data archiving, Data retrieval, HBase, Data extraction algorithm, Redundant storage, HLS-Ⅱ