Health care data analytics using Hadoop

Posted on: 2017-08-17
Degree: M.S
Type: Thesis
University: San Diego State University
Candidate: Yaramala, Deepthi
Full Text: PDF
GTID: 2458390008955190
Subject: Computer Science
Abstract/Summary:
In today's world, due to technological advancements, the amount of data being generated is growing rapidly. Enterprises worldwide will need to perform data analytics on these huge data sets to make business decisions and stay competitive.

Storing data sets and performing data analytics was traditionally accomplished using an RDBMS (Relational Database Management System). However, an RDBMS would be inefficient and time-consuming when performing data analytics on huge data sets. Hadoop came into existence recently and overcomes the limitations of existing RDBMS by providing simplified tools for efficient data storage and faster processing times for data analytics.

The purpose of this work is to study different Hadoop functionalities in detail and perform data analytics on a health care data set using Hadoop. A health care data set comprising 1.5 million patient records is considered for the data analysis. Different use cases have been considered, and analytics have been performed using the MapReduce, Hive, and Pig functionalities of Hadoop.
Keywords/Search Tags: Data, Analytics, Using, Hadoop