Design And Implementation Of Communicable Disease Surveillance Data Warehouse

Posted on: 2014-06-19
Degree: Master
Type: Thesis
Country: China
Candidate: X S Zhou
Full Text: PDF
GTID: 2268330398489952
Subject: Biomedical engineering
Abstract/Summary:
With the continuous development of disease surveillance, prevention and control, the volume of data has grown explosively. Data storage, management and analysis based on a typical operational database can no longer meet the demand for analyzing massive data. To meet the timeliness and accuracy requirements of disease surveillance, prevention and control, and to solve problems caused by information asymmetry such as "information silos" and "data redundancy", computers are expected to handle massive daily data efficiently and, at the same time, to be more involved in data analysis and decision support. Currently, the military infectious disease surveillance reporting system runs in online transaction processing (OLTP) mode, and its database design is not optimized for data querying and analysis. The server for query and analysis and the server for business processing use the same database, which degrades the business performance of the system when complex analytical queries are run. Meanwhile, the fixed analysis mode and the inefficiency and inflexibility of developing new analyses and functions cannot meet the demands of disease control agencies and health service management authorities for real-time, flexible data analysis and decision support.

Data warehouse technology, an emerging data storage and organization technique for data analysis and decision support, has gradually become an effective solution for the efficient management and in-depth analysis of massive data. In this paper, we study how data warehouse technology can solve the problems of analysis and decision support for military infectious disease surveillance report data, and we design and implement a military infectious disease surveillance data warehouse based on specific business needs and the available data sources.

First, the paper carries out the requirements analysis for the military infectious disease surveillance data warehouse.
Through an in-depth analysis of the current military infectious disease surveillance reporting system, we summarize its advantages and disadvantages in terms of system construction, data transmission, and query and analysis. On this basis, we specify in detail the functional, performance and other requirements of the proposed data warehouse. The functional requirements include daily business reporting, online analytical processing, data ETL processing and system management, covering the basic needs of all types of users for infectious disease surveillance data analysis and decision support applications. The performance requirements mainly specify data consistency, timing characteristics and system security in order to ensure the normal operation of the system. We also specify the operating environment, data transmission and maintenance management of the system.

Second, we complete the design of the infectious disease surveillance data warehouse system, mainly including the technical architecture and the multidimensional data model. Through an in-depth analysis and comparison of the key technologies of data warehouse development, we complete the technology selection. We choose a three-layer architecture consisting of the data source layer, the data coordination layer and the data warehouse layer. The physical architecture consists of the data source server, the operational data store server, the data warehouse server, the BI application server and the client PC. Following the bottom-up modeling approach, we design the data warehouse bus matrix, in which three dimensions (disease, organization and region) are shared by two subjects (the epidemic situation and the epidemic report audit). The system adopts the Dimensional Fact Model for conceptual modeling and the star schema for logical modeling.
In addition, the basic granularity of the epidemic situation subject is a certain patient contracting a certain infectious disease at a certain time, with dimensions including diagnosis time, start time, patient, post and case information. The basic granularity of the fact table for the epidemic report audit is a certain disease control staff member submitting a report at a certain time, with dimensions including report time, report information, reporter information and audit information. On the basis of the above design, we complete the design of the specific fact tables and dimension tables.

Third, taking a regional reporting system for epidemic and public health emergency information as the data source, we implement the infectious disease surveillance data warehouse, including the multidimensional data model, data preparation and the OLAP system, with Oracle BIEE and Oracle Warehouse Builder as the software platform. We use Oracle Warehouse Builder to build the multidimensional data model of the data warehouse, including source system analysis and data-driven construction of the model. In the data preparation stage, we further examine and standardize the data sources, using PL/SQL scripts to clean the data automatically. Using Oracle Warehouse Builder as the ETL tool, we complete the two stages of the ETL process: from the raw data to the unified operational data, and from the unified operational data to the multidimensional model data. We use Oracle BIEE as the main tool to develop the OLAP system, which covers daily business reporting, online analytical processing and system management functions.

This paper is the first work to establish a military infectious disease surveillance data warehouse, which is deployed in the data center of the CDC of the army and provides services to authorized users. The system contains surveillance data on 83 official infectious diseases.
To date, the system has stored approximately 900,000 reported records and adds approximately 1,000 records to the data warehouse every day. The system overcomes the limitations of a typical operational database in the storage, management and analysis of infectious disease surveillance data, and is an efficient solution for the in-depth management and analysis of such data. Moreover, it provides the technical basis for further applications such as data mining. The system is of considerable reference value for early warning of infectious diseases, decision support and related research.

However, problems such as relatively insufficient data sources and limited subject analysis still exist in our data warehouse design. In future work, we aim to integrate data sources more extensively, extend the subject analysis, and iteratively develop new data marts for new business requirements.
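
As an illustration only, the bus matrix described in the abstract (the disease, organization and region dimensions shared by the epidemic situation and epidemic report audit subjects) can be sketched as a small star schema. This is a minimal sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for the Oracle platform named above, and every table name, column name and sample row is hypothetical rather than taken from the thesis.

```python
import sqlite3


def build_star_schema(conn):
    """Create three shared dimension tables and two fact tables (hypothetical names)."""
    conn.executescript("""
        CREATE TABLE dim_disease (disease_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE dim_organization (organization_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

        -- Grain: one patient contracting one disease at one time.
        CREATE TABLE fact_epidemic_case (
            case_id INTEGER PRIMARY KEY,
            disease_id INTEGER NOT NULL REFERENCES dim_disease(disease_id),
            organization_id INTEGER NOT NULL REFERENCES dim_organization(organization_id),
            region_id INTEGER NOT NULL REFERENCES dim_region(region_id),
            diagnosed_date TEXT NOT NULL
        );

        -- Grain: one staff member submitting one report at one time.
        CREATE TABLE fact_report_audit (
            report_id INTEGER PRIMARY KEY,
            disease_id INTEGER NOT NULL REFERENCES dim_disease(disease_id),
            organization_id INTEGER NOT NULL REFERENCES dim_organization(organization_id),
            region_id INTEGER NOT NULL REFERENCES dim_region(region_id),
            report_date TEXT NOT NULL,
            audited INTEGER NOT NULL  -- 1 if the report passed audit, else 0
        );
    """)


def load_sample_rows(conn):
    """Insert made-up placeholder rows; these are not real surveillance data."""
    conn.executemany("INSERT INTO dim_disease VALUES (?, ?)",
                     [(1, "Disease A"), (2, "Disease B")])
    conn.executemany("INSERT INTO dim_organization VALUES (?, ?)", [(1, "Unit X")])
    conn.executemany("INSERT INTO dim_region VALUES (?, ?)",
                     [(1, "Region 1"), (2, "Region 2")])
    conn.executemany("INSERT INTO fact_epidemic_case VALUES (?, ?, ?, ?, ?)",
                     [(1, 1, 1, 1, "2013-01-05"),
                      (2, 1, 1, 1, "2013-01-06"),
                      (3, 2, 1, 2, "2013-01-07")])
    conn.executemany("INSERT INTO fact_report_audit VALUES (?, ?, ?, ?, ?, ?)",
                     [(1, 1, 1, 1, "2013-01-05", 1),
                      (2, 2, 1, 2, "2013-01-07", 0)])
    conn.commit()


def cases_by_region_and_disease(conn):
    """Roll case counts up to the (region, disease) level."""
    return conn.execute("""
        SELECT r.name, d.name, COUNT(*)
        FROM fact_epidemic_case AS f
        JOIN dim_region AS r ON r.region_id = f.region_id
        JOIN dim_disease AS d ON d.disease_id = f.disease_id
        GROUP BY r.name, d.name
        ORDER BY r.name, d.name
    """).fetchall()


def audited_reports_by_region(conn):
    """Return (region, audited reports, total reports) using the same region dimension."""
    return conn.execute("""
        SELECT r.name, SUM(f.audited), COUNT(*)
        FROM fact_report_audit AS f
        JOIN dim_region AS r ON r.region_id = f.region_id
        GROUP BY r.name
        ORDER BY r.name
    """).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_star_schema(connection)
    load_sample_rows(connection)
    print(cases_by_region_and_disease(connection))
    print(audited_reports_by_region(connection))
```

Because both fact tables reference the same dimension tables, a roll-up by region or disease is written the same way for either subject, which is the practical benefit of sharing dimensions through a bus matrix.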
Keywords/Search Tags: military infectious disease surveillance, data warehouse, online analytical processing, multidimensional data model