The Research And Implementation Of Massive Reproductive Health Data Integration System Based On Lucene

Posted on: 2012-03-03 | Degree: Master | Type: Thesis
Country: China | Candidate: T Y Liu
GTID: 2178330335460882 | Subject: Circuits and Systems
Abstract/Summary:
The modern population and reproductive health public service platform serves the national population of child-bearing age as well as the institutions and enterprises that provide related services and products. Building on this large population base, the platform provides a market driving force for the modern reproductive health services industry. The platform's data include relevant school-age population data, institutional data, and corporate and product data. As the scope of application expands, the data volume will grow rapidly. Moreover, the database systems behind each enterprise's information system differ and lack a uniform data description standard, so the systems cannot communicate with each other or share data, and the information island phenomenon is pronounced. For such a large database application system, how to provide a unified mass data integration platform and how to integrate and access the data effectively have become urgent problems. This paper addresses these problems as they arise in the platform.
Starting from these needs and problems, this paper proposes a massive reproductive health data integration system model based on Lucene. Drawing on database view technology, the model introduces the concept of a user view, which shields the heterogeneity of the data sources and provides transparent access. Query efficiency is improved by using an efficient categorization algorithm to segment the data, indexing it with the open-source tool Lucene, and applying a parallel union query technique, which addresses the problem of mass data access efficiency. Simple user requests can be answered directly from the index files or the cache, which greatly reduces how often remote data sources must be contacted.
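The user view, parallel union query, and cache described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the names `make_source`, `parallel_union_query`, and the field mappings are hypothetical, and a plain substring match stands in for a Lucene index lookup. Each source renames its own columns to the unified view, all sources are queried concurrently, and repeated requests are answered from the cache.

```python
from concurrent.futures import ThreadPoolExecutor


def make_source(rows, field_map):
    """Wrap one data source (hypothetical helper).

    field_map renames source-specific columns to the unified user view,
    so callers never see the source's own schema.
    """
    def search(term):
        hits = []
        for row in rows:
            unified = {field_map[k]: v for k, v in row.items() if k in field_map}
            # Substring match is a stand-in for an index lookup.
            if term.lower() in unified.get("name", "").lower():
                hits.append(unified)
        return hits
    return search


def parallel_union_query(sources, term, cache):
    """Query every source concurrently and return the union of the results."""
    if term in cache:
        return cache[term]  # answered without contacting any source
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        # pool.map keeps the results in the same order as `sources`.
        partials = list(pool.map(lambda search: search(term), sources))
    merged = [row for rows in partials for row in rows]
    cache[term] = merged
    return merged


# Two sources with different schemas, mapped to the same unified fields.
institutions = make_source(
    [{"inst_name": "Beijing Clinic", "city": "Beijing"}],
    {"inst_name": "name", "city": "location"},
)
products = make_source(
    [{"title": "Clinic Scheduler", "addr": "Shanghai"},
     {"title": "Health Monitor", "addr": "Wuhan"}],
    {"title": "name", "addr": "location"},
)

cache = {}
results = parallel_union_query([institutions, products], "clinic", cache)
```

In this sketch the caller sees only `name` and `location`, regardless of which source a row came from, and a second call with the same term returns the cached list.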
For data sources that are queried frequently, the model keeps their connections in a database connection pool, which reduces how often connections are opened and closed, lowering overhead and improving mass data query efficiency. The model proposed in this paper combines the schema integration method with the data replication method to form an application system that offers massive heterogeneous data integration, transparent and efficient data access, and plug-and-play data sources. The model follows the ideas of service-oriented architecture (SOA), in line with the development trend of information technology. The mass data integration model discussed in this paper is broadly applicable to similar systems.
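The connection pooling idea mentioned in the abstract can be sketched as follows. This is an illustrative assumption, not the thesis code: the `ConnectionPool` class is hypothetical, and an in-memory SQLite database stands in for a remote data source. Connections are opened once and handed out repeatedly instead of being opened and closed for every query.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Keep a fixed set of open connections and lend them out (hypothetical)."""

    def __init__(self, database, size=2):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # check_same_thread=False lets a pooled connection be used by
            # whichever thread borrows it.
            self._idle.put(sqlite3.connect(database, check_same_thread=False))

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks while every connection is in use
        try:
            yield conn
        finally:
            self._idle.put(conn)  # return it to the pool instead of closing

    def close(self):
        while not self._idle.empty():
            self._idle.get_nowait().close()


# A pool of size 1 makes the reuse visible: both blocks get the same
# connection, so the in-memory table created first is still there.
pool = ConnectionPool(":memory:", size=1)
with pool.connection() as conn:
    conn.execute("CREATE TABLE institution (name TEXT)")
    conn.execute("INSERT INTO institution VALUES ('Beijing Clinic')")
with pool.connection() as conn:
    rows = conn.execute("SELECT name FROM institution").fetchall()
```

A real deployment would also need to handle broken connections and timeouts, which this sketch leaves out.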
Keywords/Search Tags:data integration, massive, Lucene, parallel union query, user view