
Record Linkage Model In Data Matching

Posted on: 2014-07-24
Degree: Master
Type: Thesis
Country: China
Candidate: J S Xie
GTID: 2268330425989575
Subject: Statistics
Abstract/Summary:
With the rapid development of computer and internet technology and the deepening of informatization, governments and enterprises increasingly depend on data analysis in decision-making. However, because database updates lag and maintenance costs are high, it is difficult to establish an effective linking mechanism between diverse databases. Other countries use record linkage methods to handle redundant data and to integrate information across multiple files: record linkage compares identifying numbers, names, addresses, and other information in different files (or within the same file) to decide which records represent the same entity. The method can effectively improve data quality and has been applied to problems in tax and immigration departments, social welfare, business data analysis, medical drug monitoring, and disease surveillance.

This thesis gives a comprehensive introduction to record linkage and analyzes school records and household registration records with a record linkage model. It first states the theoretical basis of record linkage, which consists of three parts: the preprocessing of records as practiced in record linkage overseas; deterministic record linkage; and probabilistic record linkage. The standardization and modularization of records during preprocessing can serve as a reference for processing domestic Chinese records. The thesis then describes the foundation of the record linkage model in detail: the construction of the Fellegi-Sunter model and a summary of the parameter estimation methods for the model. The next part applies the Fellegi-Sunter model to school records and household registration records, estimating the parameters with the EM algorithm. This method can effectively solve the problems that arise in the linking process and yields satisfactory estimates.

The last chapter sums up the shortcomings of the thesis and the research, and discusses the prospects for applying record linkage in China.
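
The abstract names the Fellegi-Sunter model with EM parameter estimation but gives no implementation details. As a rough illustration only, the sketch below fits a generic two-class Fellegi-Sunter mixture by EM, assuming binary agree/disagree comparisons and conditional independence between fields. The function names (`simulate_pairs`, `fs_em`, `match_weight`), the starting values, and the simulated data are all hypothetical and are not taken from the thesis.

```python
import math
import random
from collections import Counter

EPS = 1e-6  # keeps probabilities away from 0 and 1 (assumption, for numerical safety)


def _clamp(x):
    return min(max(x, EPS), 1.0 - EPS)


def simulate_pairs(n, p, m, u, seed=0):
    """Hypothetical helper: draw n binary comparison vectors.
    p = share of true matches, m[j] = P(field j agrees | match),
    u[j] = P(field j agrees | non-match)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        probs = m if rng.random() < p else u
        pairs.append(tuple(int(rng.random() < q) for q in probs))
    return pairs


def fs_em(gammas, p=0.1, iters=100):
    """EM for a two-class mixture over binary comparison vectors,
    assuming fields are conditionally independent given match status."""
    counts = Counter(gammas)          # identical patterns are pooled
    n = sum(counts.values())
    k = len(next(iter(counts)))
    m = [0.9] * k                     # starting values are assumptions
    u = [0.1] * k
    for _ in range(iters):
        # E-step: posterior probability that each pattern is a match
        post = {}
        for gamma in counts:
            pm, pu = p, 1.0 - p
            for j, a in enumerate(gamma):
                pm *= m[j] if a else 1.0 - m[j]
                pu *= u[j] if a else 1.0 - u[j]
            post[gamma] = pm / (pm + pu)
        # M-step: re-estimate p, m, u from the weighted pattern counts
        total = sum(c * post[g] for g, c in counts.items())
        total = min(max(total, EPS), n - EPS)
        p = total / n
        for j in range(k):
            agree_m = sum(c * post[g] * g[j] for g, c in counts.items())
            agree_u = sum(c * (1.0 - post[g]) * g[j] for g, c in counts.items())
            m[j] = _clamp(agree_m / total)
            u[j] = _clamp(agree_u / (n - total))
    return p, m, u


def match_weight(gamma, m, u):
    """Fellegi-Sunter composite weight: sum of log2 likelihood ratios."""
    return sum(
        math.log2(m[j] / u[j]) if a else math.log2((1.0 - m[j]) / (1.0 - u[j]))
        for j, a in enumerate(gamma)
    )
```

In use, one would compare candidate record pairs field by field (for example name, address, identifying number), pass the resulting 0/1 tuples to `fs_em`, and then rank pairs by `match_weight`, with high weights suggesting links and low weights suggesting non-links. The thresholds that separate links, possible links, and non-links are a separate choice that this sketch does not cover.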
Keywords/Search Tags: Record linkage, Blocking, Fellegi-Sunter model, EM algorithm