
The Key Technology Research Of Interactive Statistics And Analysis Of Army Medical Service Big Data

Posted on: 2017-01-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W W Fan
GTID: 1224330488955805
Subject: Military Preventive Medicine
Abstract/Summary:
In recent years, as the informatization of army health statistics has steadily improved, a health statistics website has been built to provide query services for leadership, marking considerable progress in data usage. However, shortcomings remain: the statistical indicators are incomplete, results are not timely, and interactive queries respond slowly, so the system cannot provide sufficient decision support. Automatic extraction of army medical service data has now been preliminarily achieved. With tens of billions of structured records extracted every year, army health statistics has entered the era of big data. With the present process and software, statistics and auditing take several days, which is too long to meet the needs of policy support. Therefore, the former Health Department, GLD of the PLA launched the "Military Health Statistics Innovation Project" as the focus of army health informatization construction under the "Twelfth Five-Year" plan, in which big data statistical methods and technologies are an important support. Interactive statistics and analysis of army medical service big data can provide fine-grained, day-level statistical models based on massive amounts of original medical data; provide data support for medical decisions; track the distribution and utilization of medical resources in a timely manner; improve public health services and the capacity to handle public health emergencies; and strengthen the guidance, management and supervision of medical services.
It can also provide general methodological guidance for the national health statistics system and regional health platforms, provide technical support for the army medical data service platform, and thereby promote a shift in medical management from an extensive pattern to a fine-grained one.

Using literature research, comparative analysis, expert consultation, system analysis, investigation, empirical study and other research methods, this dissertation analyzed the current state of development in health statistics; defined and summarized the theories and concepts related to the sources, categories and characteristics of army medical service big data; and constructed a framework of army health statistics indicators. According to the functional and performance requirements for the statistics, analysis and use of army medical service big data, it summarized the interactive statistical data processing method and procedure for this large-sample, distributed, homogeneous, structured, complexly correlated big data, which is extracted from more than 200 army hospitals. It put forward a parallel computing solution based on Spark and selected the key technologies for the data pre-processing, distributed storage, interactive statistics and multi-dimensional visualization modules. It completed the architecture design of the platform for interactive statistics and analysis of army medical service big data and implemented a prototype on the Spark in-memory computing platform. On this basis, the functionality and performance of the prototype were compared and validated using 6 different test data sets and a Spark cluster of 8 nodes.

1. The service requirements analysis
Based on health service requirements, this dissertation analyzed the functionality and performance indicators of the platform for statistics and analysis of medical big data.
First, it summarized the concepts related to army medical service data statistics and gave an overview of research progress and the present situation in China and abroad, characterizing medical big data as "Large Sample and Complex Correlation Data". Second, it systematically analyzed the sources, categories and characteristics of medical service big data. Third, by classifying the existing military health statistics indicators, it constructed a five-level framework of military health statistics indicators composed of business fields, business subjects, statistical purposes, dimensions and analysis indicators. Finally, it proposed the functional and performance requirements of the interactive statistics platform.

2. The key technologies selection for interactive statistics
This dissertation analyzed the general procedure of interactive statistics on medical service big data and compared the candidate key technologies for distributed storage, NoSQL databases, computing platforms and data visualization frameworks. Considering the characteristics of medical service big data and the requirements of statistical analysis, it chose the following technologies: Sqoop as the ETL tool for medical service data, with support for incremental updates; Hadoop and HBase to provide distributed storage for the medical service data and the calculation results; Spark to provide interactive and efficient parallel computing; and web2py to provide multi-dimensional visualization.

3. The architecture design of the medical service big data interactive statistics platform
According to the construction goal of the platform, the overall architecture is divided into three basic modules: external data access and storage; data analysis; and multiple interactive query and data display. The corresponding architecture and algorithms, covering data pre-processing, efficient storage and parallel computing, were designed for each module.

4. The implementation and verification of the prototype system
The results of the previous sections were applied to guide the design of the system prototype and the deployment of the development environment. Taking outpatient service data as an example, an instance of interactive statistics and analysis of medical service big data was run on the Spark computing cluster to verify the functions of the system. On this basis, the performance of the prototype system was compared and validated using 6 different test data sets and a Spark cluster of 8 nodes. The representative calculation tests included simple grouping, summation and multi-table join queries.

Using big data technologies, namely the ETL tool Sqoop with support for incremental updates, the distributed file system HDFS, the distributed database HBase, the in-memory computing framework Spark and the visualization platform web2py, the prototype of the military medical service big data interactive statistics and analysis platform can perform interactive query statistics on more than one hundred million medical service records. Task processing efficiency increases nearly linearly as hardware resource nodes are added.

This study is applied research on the use of big data processing technology for interactive statistics and analysis of medical service data. It can provide first-hand practice and reference for the construction of the army health statistical information platform and for further mining and use of medical service big data.
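
The three representative calculation tests named above (simple grouping, summation and a multi-table join) can be illustrated with a minimal sketch in plain Python. This is not the dissertation's Spark implementation: the record layout, the hospital identifiers, the region table and the function names are all hypothetical, chosen only to show the shape of each query on a handful of outpatient-style records.

```python
from collections import defaultdict

# Hypothetical outpatient visit records: (hospital_id, visit_date, fee).
# The field layout is an assumption for illustration only.
visits = [
    ("H001", "2016-03-01", 120.0),
    ("H001", "2016-03-01", 80.0),
    ("H002", "2016-03-01", 200.0),
    ("H001", "2016-03-02", 50.0),
]

# Hypothetical hospital dimension table: hospital_id -> region.
hospitals = {"H001": "North", "H002": "South"}


def count_by_day(records):
    """Simple grouping: number of visits per day."""
    counts = defaultdict(int)
    for _hospital_id, day, _fee in records:
        counts[day] += 1
    return dict(counts)


def fee_by_hospital(records):
    """Summation: total fee per hospital."""
    totals = defaultdict(float)
    for hospital_id, _day, fee in records:
        totals[hospital_id] += fee
    return dict(totals)


def fee_by_region(records, hospital_table):
    """Join visits to the hospital table, then sum fees per region."""
    totals = defaultdict(float)
    for hospital_id, _day, fee in records:
        region = hospital_table.get(hospital_id)
        if region is None:
            # Inner-join semantics: skip visits with no matching hospital.
            continue
        totals[region] += fee
    return dict(totals)
```

On a cluster, the same three query shapes would be expressed as distributed group-by, aggregate and join operations; the sketch only fixes what each test computes, not how the prototype executes it.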
Keywords/Search Tags: Medical Services, Big Data, Interactive, Statistics, Visualization, Spark