Construction Of Data Warehouse And Implementation Of Analysis System For Health Examination

Posted on: 2016-07-18
Degree: Master
Type: Thesis
Country: China
Candidate: Z Sun
GTID: 2428330542954585
Subject: Computer software and theory
Abstract/Summary:
As health examination services expand and the number of users grows, health examination systems accumulate large amounts of valuable data. Agencies commonly face the problem of how to use these data effectively to support the decisions of doctors and managers. To address this problem, this thesis designs and implements a health examination data analysis system.

First, we use data warehouse technology to provide an independent environment for health examination analysis. The health examination data warehouse solves the problems of data storage and integration. We discuss in detail the facts, dimensions, and grain of the dimensional model for the health examination domain. After dimensional modeling, the health examination data in the warehouse are reorganized into a structure suited to analysis. The ETL system, programmed in high-level scripting languages such as Shell and PL/SQL, performs the daily data loading and updating and is designed for maximum convenience and flexibility.

Second, we introduce OLAP technology to support multidimensional analysis of health examination data, so that doctors and managers can analyze key performance indicators from several perspectives. Using MSTR simplifies the development of multidimensional analysis reports. The ROLAP server of MSTR reads the fact tables and dimension tables in the relational data warehouse and transforms them into a unified multidimensional model. By configuring MSTR, we customize the aggregation results of the virtual cubes in the multidimensional model and then provide multidimensional analysis report services for health examination data to doctors and managers.

Finally, we discuss methods of health risk appraisal. By applying classification techniques from data mining, we explore the relationships between users' medical test results and examination conclusions and establish risk forecasting models. We select three common classification models, the decision tree, naive Bayes, and the support vector machine, and run experiments on a real health examination data set. The accuracy of all three classifiers exceeds 80 percent, which shows that classification is a feasible method for health risk assessment. In addition, we discuss the imbalanced data set problem encountered in the experiments and choose over-sampling to balance the training data. In the comparison experiment, we preprocess the training data with the SMOTE algorithm. For all three classifiers, classification performance on the minority classes, which are of greater interest, improves significantly, while performance on the majority classes changes little. This shows that over-sampling is a feasible preprocessing method for health examination data sets.
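
The over-sampling step mentioned in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its k nearest minority-class neighbours. This is not the thesis's implementation. The function name `smote`, its parameters, and the toy feature values are assumptions made purely for illustration, and only the Python standard library is used.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (illustrative sketch)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest neighbours of `base` among the other minority samples
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: two numeric test results per minority-class record.
minority = [(6.1, 140.0), (6.4, 150.0), (7.0, 155.0), (6.8, 148.0)]
majority_count = 12  # assumed size of the majority class in the training set

new_points = smote(minority, majority_count - len(minority), k=3)
balanced_minority = minority + new_points
print(len(balanced_minority))
```

In a setup like the one the abstract describes, such a step would be applied to the training data only, so that the test data keep the original class distribution.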
Keywords: health examination data, data warehouse, multidimensional data analysis, classification, imbalanced datasets