Schema Matching In Super-Peer Based P2P Database System

Posted on: 2011-12-01
Degree: Master
Type: Thesis
Country: China
Candidate: J Feng
Full Text: PDF
GTID: 2178360305951073
Subject: Computer software and theory
Abstract/Summary:
Schema matching has become an active topic in many fields, such as data integration, data warehousing, and data mining. Its main purpose is to establish correspondences between the attributes of heterogeneous schemas, and the key problem is finding the semantic correspondence between attributes. Schemas usually differ in structure, definitions, and naming conventions, so finding these correspondences requires combining several schema matchers, for example a structure matcher, an instance matcher, and a constraint-based matcher. Moreover, how a schema matcher is constructed may differ from one application domain to another. No existing theory or system is fully automatic; all require some degree of manual intervention. The application background should therefore be considered thoroughly before a schema matcher is constructed. Although implementing a matching capability is complex, it is still highly valuable, because it greatly reduces manual labor and increases efficiency.

This thesis focuses on the functionality and implementation of schema matching for data sharing in a peer-to-peer (P2P) system. Unlike a traditional network, a P2P network has the following characteristics: every peer is functionally equal, both providing resources to other peers and sending requests to them, and any peer can join or leave the network freely. P2P systems do not share a single structure; different topologies suit different applications. After considering our application thoroughly, we chose the super-peer structure as the base. The thesis concentrates on two parts:
1) Construct a new super-peer based P2P topology and study in depth the role of schema matching within it.
2) Based on the proposed structure, implement an instance-based schema matching method that suits this environment well.

For the first part, we propose a new topology: a P2P structure divided by domain theme, with double super-peers. Schema matching in a P2P setting differs from ordinary schema matching in two respects. First, any P2P system is scalable, meaning that peers can join or disconnect from the network at any time, so we must consider how to handle the schema relationships of newly joined or departed nodes in such a dynamic environment. Second, taking information queries as an example, we must ask which structure gives the best efficiency, in other words how to organize the relationship between super nodes and normal nodes to obtain better query performance. These problems are discussed in detail in the thesis.

For the second part, we first study in depth the role of schema matching in our P2P network, which is one of the most important issues in the thesis. Most current related research employs schema matching, but nearly all of it concentrates on improving query routing and query algorithms. We put forward a hypothesis model: if two attributes have a corresponding semantic relationship, they have comparable relative importance in their two schemas. We apply the RF neural network as a multi-decision tree and the RBF algorithm as a classifier to extract features and locate each attribute's unique position. We then construct a minimum Euclidean distance function to identify the best corresponding pairs. The method fits our setting well: 1) the multi-decision tree produces an accurate matcher; 2) the peers in the system can handle large volumes of data and its variability; 3) the domain metadata table accommodates newly joined or departed peers. Finally, our approach is validated on UCI datasets, and the results show good accuracy.
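The final matching step mentioned in the abstract, choosing the attribute pair with the minimum Euclidean distance, can be illustrated with a short Python sketch. This is not the thesis's implementation: the function name `match_attributes`, the attribute names, and the two-dimensional feature vectors are all hypothetical, and the sketch simply assumes that each attribute has already been reduced to a numeric feature vector by some earlier feature-extraction step.

```python
import math


def match_attributes(features_a, features_b):
    """Pair each attribute of schema A with its nearest attribute in schema B.

    Both arguments map an attribute name to a numeric feature vector
    (hypothetical here). Returns {name_a: (name_b, distance)}.
    """
    matches = {}
    for name_a, vec_a in features_a.items():
        # Pick the schema-B attribute with the smallest Euclidean distance.
        best = min(features_b, key=lambda name_b: math.dist(vec_a, features_b[name_b]))
        matches[name_a] = (best, math.dist(vec_a, features_b[best]))
    return matches


# Made-up feature vectors, for illustration only.
schema_a = {"cust_name": (0.62, 0.10), "cust_age": (0.25, 0.80)}
schema_b = {"age": (0.27, 0.78), "full_name": (0.60, 0.12)}

result = match_attributes(schema_a, schema_b)
for name_a, (name_b, distance) in result.items():
    print(f"{name_a} -> {name_b} (distance {distance:.3f})")
```

This greedy nearest-neighbor choice does not force a one-to-one mapping, so two attributes of one schema could select the same partner; a real matcher would need some way to resolve such conflicts.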
Keywords/Search Tags: P2P system, Super-peer structure, Schema matching, Scalable, Semantic correspondence