
The Database Schema Research And Matching Method

Posted on: 2013-04-07
Degree: Master
Type: Thesis
Country: China
Candidate: F J Li
Full Text: PDF
GTID: 2248330374486154
Subject: Computer system architecture
Abstract/Summary:
As information technology has developed, XML has become the preferred carrier for data and information exchange on the Web. Pattern (schema) matching, one of the most important pattern operations, plays an important role in data warehousing, electronic commerce, data integration and many other areas. This thesis gives a comprehensive analysis of the state of pattern matching research in China and abroad, studies the complex pattern matching process from the two angles of pattern discovery and pattern matching, and improves the Twig algorithm used for pattern discovery. XML is used as the document carrier so that matching is more portable and flexible and results are easier to exchange between different systems. The main work of the thesis is as follows:
1. Existing database schema matching methods are surveyed, and their implementation, scope of application and characteristics are summarized, together with the basis for computing similarity in pattern matching. The quality of a pattern matching algorithm depends largely on the final similarity computation. The thesis presents a new system, CMExt, which extracts pattern data from different databases, uses Kettle for data cleansing, and reads the cleansed data into memory. To raise the accuracy of pattern matching, it improves the similarity evaluation module of the existing CM system.
2. Existing matching techniques and their results are analyzed and compared. Based on the characteristics and structure of the XML document tree data model, the thesis adopts the current mainstream approaches: linguistic matching based on data names and data fields, and structure matching based on the context of the model. To improve the accuracy of the similarity and the efficiency of matching over unstructured data patterns, the several similarity measures are integrated.
3. The thesis analyzes and compares the classic TwigStack algorithm and its improved variants and identifies their defects. Building on TwigStack, it proposes a new algorithm, TwigStackExt, which addresses two problems: the inefficient processing of queries that contain parent-child relationships, and the intermediate results produced during query processing when a branch node has parent-child edges.
4. Experiments are carried out on the algorithms for structured and unstructured data patterns and on the TwigStackExt algorithm, and they verify the validity of the proposed algorithms.
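The integration of linguistic and structural similarity described in item 2 can be illustrated with a minimal Python sketch. It is not the CMExt implementation: the function names, the use of `difflib.SequenceMatcher` for name similarity, the Jaccard overlap of ancestor names for context similarity, and the weight `w_name=0.6` are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from typing import Sequence, Tuple

# An element is modeled (hypothetically) as its name plus the names of its ancestors.
Element = Tuple[str, Sequence[str]]


def name_similarity(a: str, b: str) -> float:
    """Linguistic similarity: case-insensitive string ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def context_similarity(path_a: Sequence[str], path_b: Sequence[str]) -> float:
    """Structural similarity: Jaccard overlap of the two sets of ancestor names."""
    set_a = {p.lower() for p in path_a}
    set_b = {p.lower() for p in path_b}
    if not set_a and not set_b:
        return 1.0  # two root elements share the same (empty) context
    return len(set_a & set_b) / len(set_a | set_b)


def combined_similarity(elem_a: Element, elem_b: Element, w_name: float = 0.6) -> float:
    """Weighted sum of name and context similarity (the weight is an assumed value)."""
    name_a, ctx_a = elem_a
    name_b, ctx_b = elem_b
    return (w_name * name_similarity(name_a, name_b)
            + (1.0 - w_name) * context_similarity(ctx_a, ctx_b))


if __name__ == "__main__":
    source = ("custName", ["order", "customer"])
    target = ("customerName", ["order", "customer"])
    print(round(combined_similarity(source, target), 3))
```

A weighted sum is only one way to integrate several similarity values; taking the maximum or learning the weights from known matches are alternatives, and the abstract does not state which one the thesis uses.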
Keywords/Search Tags: Pattern discovery, Twig pattern, Pattern matching, Similarity, Data integration