
Research And Implementation Of Health Data Management System Based On Hadoop

Posted on: 2018-03-14
Degree: Master
Type: Thesis
Country: China
Candidate: L Wang
Full Text: PDF
GTID: 2348330518499387
Subject: Engineering
Abstract/Summary:
As community residents' living standards and health awareness continue to improve, data management platforms built on residents' health information have gradually become a new model of Internet-driven healthcare informatization services. The emergence of cloud computing and big data not only solves the problem of storing massive data from medical institutions, but also provides powerful data mining and computing capabilities, which reflects the great advantages of medical data informatization. In this paper, considering the shortcomings of traditional relational databases in storage, computation, and retrieval, and in order to solve the problems of acquiring and storing health data, we draw on cloud storage theory and the current trend of big data applications to establish a Hadoop-based management system for residents' health data. According to the basic needs of residents' health data, we also propose a scheme for storing and retrieving unstructured data based on HBase+Solr. The resulting system provides convenient healthcare information services for data storage and access, which will further improve the quality of community healthcare services.

This paper first discusses the key position of the big data platform in the healthcare system and studies the key technologies of Hadoop and HBase in depth. Building on this theoretical basis and the actual requirements of the system, we analyze and design the overall Hadoop-based system architecture. The system functions are then divided into modules, and each submodule is designed in detail. The paper adopts a column-oriented database (HBase) to store large amounts of unstructured health data and implements standardized modeling of health data by selecting a health metadata set. Meanwhile, the system solves the data transmission and storage problems that arise when synchronizing residents' health data by making full use of the Hadoop cloud storage framework. To address HBase's weakness in non-rowkey and multi-condition queries, a secondary index based on a Solr cluster is used to improve the efficiency of the data retrieval service. The system also uses MapReduce to perform statistical analysis of residents' physiological data. During the system testing phase, the reliability and performance of the Hadoop cluster are simulated and tested. Finally, the feasibility of the system is verified through comparative analysis of the test results.
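
The secondary-index idea mentioned in the abstract can be illustrated with a small in-memory sketch. This is not the system described in the thesis and does not use the HBase or Solr APIs: `PrimaryStore` stands in for a store that can only be read by row key, `SecondaryIndex` stands in for a search index that maps field values back to row keys, and the field names (`community`, `gender`, `systolic`) and row-key format are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


class PrimaryStore:
    """Stand-in for a row-key store: records can only be fetched by row key."""

    def __init__(self):
        self._rows = {}

    def put(self, row_key, record):
        self._rows[row_key] = dict(record)

    def get(self, row_key):
        return self._rows[row_key]


class SecondaryIndex:
    """Stand-in for a search index: maps (field, value) pairs to row keys."""

    def __init__(self, indexed_fields):
        self._fields = tuple(indexed_fields)
        self._postings = defaultdict(set)

    def add(self, row_key, record):
        # Index only the configured fields; other fields stay in the store.
        for field in self._fields:
            if field in record:
                self._postings[(field, record[field])].add(row_key)

    def search(self, **conditions):
        """Return sorted row keys that match every field=value condition."""
        if not conditions:
            return []
        key_sets = [self._postings.get(item, set()) for item in conditions.items()]
        return sorted(set.intersection(*key_sets))


def query(store, index, **conditions):
    """Two-step lookup: resolve row keys in the index, then fetch the rows."""
    return [store.get(row_key) for row_key in index.search(**conditions)]


# Hypothetical sample data (not taken from the thesis).
store = PrimaryStore()
index = SecondaryIndex(indexed_fields=("community", "gender"))
samples = {
    "r001_20170101": {"community": "A", "gender": "F", "systolic": 118},
    "r002_20170101": {"community": "A", "gender": "M", "systolic": 135},
    "r003_20170102": {"community": "B", "gender": "F", "systolic": 127},
}
for row_key, record in samples.items():
    store.put(row_key, record)
    index.add(row_key, record)

matches = query(store, index, community="A", gender="F")
```

In this sketch, a multi-condition query never scans the primary store: the index intersects the row-key sets for each condition, and only the matching rows are fetched by key. That is the general division of labor the abstract attributes to the HBase+Solr combination.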
Keywords/Search Tags: Cloud computing, Health data management, Hadoop, Secondary index, MapReduce