
The Application Research On Data Cleaning Key Technology In Medical Insurance Management System

Posted on: 2015-12-08; Degree: Master; Type: Thesis
Country: China; Candidate: Z J Cai; Full Text: PDF
GTID: 2298330434960856; Subject: Computer application technology
Abstract/Summary:
In recent years, as great importance has been attached to health care, medical insurance management has improved greatly and has achieved the basic goal of "universal access to basic health insurance". As a medical insurance management system runs, its data gradually becomes very large. Because the system is operated by people, some errors inevitably occur. In integrated systems, the data comes from a number of different data sources, so it contains inconsistencies, abnormal values and similar problems. In addition, because of flaws in the insurance management system, duplicate insurance enrollment is widespread. For these reasons, the "dirty data" in the database gradually increases and affects the normal operation of the system. Cleaning up the "dirty data" is therefore very important for the medical insurance management system.
This thesis introduces the concepts of data quality and data quality metrics, and explains the concept, principles and basic steps of data cleaning. It then introduces and analyzes a variety of data cleaning strategies.
Firstly, this thesis analyzes the data of the medical insurance management system and selects the data cleaning techniques. It discusses how the medical insurance management system came about and introduces some of its major functional modules. It then discusses the data problems of the system, analyzes the data and, taking the actual needs of the system into consideration, analyzes several duplicate detection algorithms from data cleaning technology. According to the main characteristics of the "dirty data" in the system, it chooses SNM as the core technology for cleaning up duplicate records.
Secondly, based on the analysis in Chapter 3 of the data in the medical insurance management system and of its data cleaning requirements, this thesis improves the method for calculating record similarity and improves the SNM method. Through experimental analysis of the improved algorithm, it then summarizes the results and proposes methods for handling similar duplicate records, incomplete records and abnormal records.
Finally, a data cleaning function module is designed and embedded into a medical insurance management system for application, in order to resolve the data quality problems that exist in the system.
Keywords/Search Tags:Data cleaning, Medical insurance management, Data analysis, Duplicate records, SNM
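
The abstract names SNM as the core technique for duplicate records. Assuming SNM here means the sorted-neighborhood method commonly used in duplicate detection, the basic idea can be sketched as follows: sort the records by a key, slide a fixed-size window over the sorted order, and compare only records that fall in the same window. This is a minimal sketch of the basic method, not the improved algorithm of the thesis, whose details are not given in the abstract. The field names, weights, window size, threshold and the use of `difflib.SequenceMatcher` as the similarity measure are all illustrative assumptions.

```python
from difflib import SequenceMatcher


def field_similarity(a: str, b: str) -> float:
    """Similarity of two field values in [0, 1] (assumed measure: difflib ratio)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def record_similarity(r1: dict, r2: dict, weights: dict) -> float:
    """Weighted average of per-field similarities; the weights are hypothetical."""
    total = sum(weights.values())
    score = sum(
        w * field_similarity(str(r1[f]), str(r2[f])) for f, w in weights.items()
    )
    return score / total


def sorted_neighborhood(records, key, weights, window=3, threshold=0.85):
    """Return index pairs of likely duplicates found by the basic SNM.

    Records are sorted by `key`; each record is compared only with the
    next `window - 1` records in sorted order.
    """
    order = sorted(range(len(records)), key=lambda i: key(records[i]))
    pairs = []
    for pos, i in enumerate(order):
        for j in order[pos + 1 : pos + window]:
            if record_similarity(records[i], records[j], weights) >= threshold:
                pairs.append((min(i, j), max(i, j)))
    return pairs


# Hypothetical insured-person records; the field names are assumptions.
records = [
    {"name": "Zhang Wei", "id_no": "110101198001011234", "region": "Beijing"},
    {"name": "Li Na", "id_no": "310101199002022345", "region": "Shanghai"},
    {"name": "Zhang Wei", "id_no": "110101198001011234", "region": "Beijing"},
]
weights = {"name": 1.0, "id_no": 2.0, "region": 1.0}

pairs = sorted_neighborhood(
    records, key=lambda r: r["id_no"], weights=weights, window=2
)
print(pairs)
```

The window size trades recall for cost: a larger window catches duplicates whose sort keys differ more, at the price of more comparisons. The choice of sort key matters for the same reason, since duplicates with an error in the key field may sort far apart.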