
Medical Health Data Analysis System Design And Implementation Based On Spark

Posted on: 2018-08-14  Degree: Master  Type: Thesis
Country: China  Candidate: T Xiao  Full Text: PDF
GTID: 2348330518995363  Subject: Cryptography
Abstract/Summary:
As social pressure increases and the pace of life accelerates, more and more people face health problems of varying severity. Obtaining a person's health status quickly and accurately allows reasonable and effective health advice to be acted on earlier and, to a certain extent, reduces the probability of disease. In recent years, the application of big data has greatly promoted the intelligent development of medicine and healthcare, and the prediction of health and disease has become an important part of smart healthcare. At present, health prediction relies mainly on the EHR (Electronic Health Record), so the source of data features is relatively narrow. As a result, predictions address only whether a disease will occur, and people cannot obtain an accurate picture of their overall health status. In addition, existing approaches are inefficient when faced with massive data sets. To address these shortcomings of current medical health prediction systems, this thesis adds mobile health data and data on people's living habits to the EHR-based features and analyzes the influence of these features on physical health. Because some health features are continuous in time, this thesis introduces the concept of a time window: a health feature is divided into multiple attributes according to the chosen time window, and a logistic regression prediction model is designed on top of these attributes, which achieves higher accuracy. To let people obtain their health status more accurately, this thesis introduces the concept of sub-health and uses the random forest algorithm to divide health status into multiple levels. Finally, building on the distributed file system HDFS, this thesis uses the log collection system Flume and the message queue Kafka to design and implement a medical health data prediction and analysis system with Spark at its core. The system preprocesses and formats the collected health data and stores it in HDFS as model training data. After the algorithms, implemented with Spark MLlib, have run, the final results are stored in a MySQL database, and the prediction results are presented to users through a web interface.
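The time-window idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how readings inside each window are combined, so the per-window mean used here is an assumption, and the function name `window_features`, the attribute naming scheme, and the sample step counts are all hypothetical. The sketch only shows how one time-ordered health feature might be expanded into several attributes that a model such as logistic regression could then consume.

```python
from statistics import mean
from typing import Dict, Sequence


def window_features(name: str, readings: Sequence[float], window: int) -> Dict[str, float]:
    """Split one time-ordered health feature into one attribute per time window.

    Assumption: each window is summarized by its mean; the thesis abstract
    does not specify the aggregation actually used.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    features: Dict[str, float] = {}
    for start in range(0, len(readings), window):
        chunk = readings[start:start + window]  # the last window may be shorter
        index = start // window                 # 0-based window number
        features[f"{name}_w{index}"] = mean(chunk)
    return features


# Hypothetical data: six daily step counts grouped into 3-day windows.
daily_steps = [4000, 5000, 6000, 9000, 10000, 11000]
print(window_features("steps", daily_steps, window=3))
```

Each returned key (for example `steps_w0`, `steps_w1`) would become a separate column in the training data, so the model can weight recent and older behavior differently instead of seeing a single averaged value.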
Keywords/Search Tags: Medical health, Machine learning, Health record, Health prediction