
Research On Data Sharing Among Heterogeneous Databases

Posted on: 2009-03-28
Degree: Master
Type: Thesis
Country: China
Candidate: J F Wu
Full Text: PDF
GTID: 2178360278457213
Subject: Computer Science and Technology
Abstract/Summary:
Data sharing lets more people make full use of data sources and reduces duplicated work and the associated costs of data acquisition, data gathering and so on. In practice, however, the systems in which the data is stored differ in hardware, operating system, network protocol, database type and semantic representation, so they have become isolated "information islands", which seriously obstructs data sharing and circulation.

Sharing data among heterogeneous databases is therefore essentially a problem of eliminating the heterogeneity between databases by technical means. Heterogeneity outside the databases is comparatively easy to handle; the key to the problem lies in eliminating the heterogeneity inside the databases themselves. In view of the shortcomings of current solutions, this thesis focuses on differences between data types, semantic heterogeneity of attributes, and schema conflicts between databases, and discusses each separately.

The main problems addressed in this thesis are as follows:

Current approaches to attribute matching use features extracted from the database dictionary and from statistics of attribute values, which cannot reflect the actual meaning of an attribute. To address this limitation, the thesis proposes an improved attribute feature extraction method that extracts attribute features more appropriately. Experiments show that the method improves the efficiency of attribute matching without losing matching accuracy.

To address the shortcomings of neural networks in attribute matching, the thesis abstracts attribute matching as a nearest-neighbor search in a vector space. After analyzing why the existing k-means and SVM-kNN algorithms, which were improved from the traditional kNN algorithm, are not suitable for attribute matching, it proposes FKNMatchAD-kNN, an algorithm for data sharing based on preprocessing of the vectors. Comparative experiments against a neural network algorithm verify improvements in efficiency, in the elimination of noisy data, and in the precision of attribute matching.

Current approaches to eliminating schema conflicts between databases lack generality. Building on semantic mapping rules, the thesis proposes QTMR, a rule-based query translation algorithm. Experiments show that the algorithm can efficiently eliminate schema conflicts in a distributed environment.

Finally, taking full account of the requirements of data sharing in a distributed environment, the thesis proposes a data sharing design and describes in detail the composition of its main functional modules, its workflow, and the solutions to its key technical problems.
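
As an illustration of the nearest-neighbor view of attribute matching described above, the following is a minimal Python sketch. It is plain 1-nearest-neighbor matching with min-max scaling and a distance cutoff, not the thesis's FKNMatchAD-kNN algorithm, whose details the abstract does not give. The feature layout (type code, declared length, null fraction, mean value length), the attribute names, the function names and the `max_distance` threshold are all assumptions made for the example.

```python
import math


def min_max_normalise(vectors):
    """Scale each feature column to [0, 1] so no single feature dominates the distance."""
    columns = list(zip(*vectors))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(vec, lows, highs)]
        for vec in vectors
    ]


def match_attributes(source, target, max_distance=0.5):
    """Map each source attribute to its nearest target attribute.

    `source` and `target` map attribute names to feature vectors of equal
    length. A source attribute whose nearest neighbor is farther than
    `max_distance` (an assumed cutoff) is mapped to None.
    """
    # Normalise both schemas together so their features share one scale.
    scaled = min_max_normalise(list(source.values()) + list(target.values()))
    src_vecs = dict(zip(source, scaled[:len(source)]))
    tgt_vecs = dict(zip(target, scaled[len(source):]))

    matches = {}
    for s_name, s_vec in src_vecs.items():
        # Euclidean nearest neighbor among all target attributes.
        best_name, best_dist = min(
            ((t_name, math.dist(s_vec, t_vec)) for t_name, t_vec in tgt_vecs.items()),
            key=lambda pair: pair[1],
        )
        matches[s_name] = best_name if best_dist <= max_distance else None
    return matches


# Hypothetical feature vectors:
# [type code, declared length, null fraction, mean value length]
source = {
    "emp_name": [1, 50, 0.0, 12.0],
    "salary": [2, 8, 0.1, 6.0],
}
target = {
    "staff_name": [1, 60, 0.0, 11.0],
    "pay": [2, 8, 0.05, 6.0],
    "photo": [3, 4000, 0.9, 2000.0],
}

print(match_attributes(source, target))
# {'emp_name': 'staff_name', 'salary': 'pay'}
```

Normalising the two schemas together keeps a wide-range feature such as declared length from swamping the others, and the cutoff gives a simple way to leave an attribute unmatched rather than force a poor match.
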
Keywords/Search Tags:Heterogeneous databases, Data sharing, Attribute matching, Query translation, QTMR, FKNMatchAD-kNN