Medical Insurance Data Mining Based On The Hadoop Platform

Posted on: 2013-06-10  Degree: Master  Type: Thesis
Country: China  Candidate: Y Liang  Full Text: PDF
GTID: 2268330425497380  Subject: Computer system architecture
Abstract/Summary:
About 357 thousand people participate in medical insurance in Qinhuangdao. The income and expenditure of the health insurance fund generate 2 million data records, along with tens of millions of billing records, every year, and the data in the database is still growing. Connections are likely to exist within such a large amount of data, so data mining has practical significance.

Given the volume of data, stand-alone data mining runs into problems such as poor time efficiency and downtime, so we consider parallel mining on the Hadoop platform. Hadoop is an open-source platform on which ordinary PCs can be assembled into a cluster. Its strength in processing huge amounts of data has attracted many researchers.

The purpose of this study is to use the Hadoop platform for data mining on medical insurance data. We use the MapReduce programming model to achieve parallel mining and to provide a reference for researchers doing parallel data mining on the Hadoop platform.

First, this thesis gives a detailed introduction to data mining technology and the Hadoop platform, focusing on the data mining process and the mining algorithms used in this project: clustering and association rule algorithms. By studying the database of the medical insurance system in depth, we identify the subject fields and then the data tables that correspond to them, after which data mining can begin. Finally, we compare and analyze the time efficiency and the experimental results, and draw the experimental conclusions.
Keywords/Search Tags:Hadoop, Data mining, Health insurance, Clustering algorithm, Association rules algorithm
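
The abstract describes using MapReduce to parallelize association rule mining but does not give the thesis's actual implementation. As an illustration only, the following minimal Python sketch simulates the map, shuffle, and reduce phases in a single process to count how often pairs of items occur together, which is the support-counting step that association rule algorithms rely on. The function names, the toy billing records, and the `min_support` threshold are all hypothetical; a real Hadoop job would run the map and reduce functions on separate cluster nodes.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(transaction):
    """Emit (item_pair, 1) for every pair of distinct items in one record."""
    items = sorted(set(transaction))  # sort so each pair has one canonical form
    for pair in combinations(items, 2):
        yield pair, 1


def shuffle(mapped):
    """Group emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the counts for one item pair."""
    return key, sum(values)


def count_pairs(transactions, min_support):
    """Return item pairs that co-occur in at least min_support records."""
    mapped = (kv for t in transactions for kv in map_phase(t))
    groups = shuffle(mapped)
    counts = dict(reduce_phase(k, v) for k, v in groups.items())
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical toy billing records (not data from the thesis).
records = [
    ["consultation", "blood_test", "drug_a"],
    ["consultation", "drug_a"],
    ["consultation", "blood_test"],
    ["blood_test", "drug_b"],
]

frequent = count_pairs(records, min_support=2)
print(frequent)
```

Because each record is mapped independently and each key is reduced independently, both phases can in principle be distributed across machines, which is the property the abstract appeals to when it proposes Hadoop over stand-alone mining.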