Study Of Query Process For Heterogeneous Data Source Base On XML

Posted on: 2005-04-12
Degree: Master
Type: Thesis
Country: China
Candidate: G J Xiao
Full Text: PDF
GTID: 2168360152469062
Subject: Systems Engineering
Abstract/Summary:
As network applications and decision-support systems steadily multiply, more and more applications need to access heterogeneous data sources, and data integration has become an active research topic. Query processing is both the most difficult and the most important part of heterogeneous data integration. Researchers abroad have already proposed several XML-based query languages, such as XML-QL, XQuery, and XPath, but these languages share a common shortcoming: their grammar is complicated, so queries can be written only by specialists. The languages are also still at an exploratory stage: their semantics differ, and there is no unified standard yet. In view of this, this thesis proposes a simple query language, XSQL, and gives its detailed definition. XSQL extends SQL using XML technology, so anyone familiar with SQL can use it without special training.

Based on XSQL, this thesis offers an effective solution to heterogeneous data querying. Users pose query conditions against a global view. The application server fills these conditions into an XSQL query template and generates a query tree. The query processor splits the tree into sub-queries, each supported by a particular heterogeneous data source, and sends them to the wrappers of those data sources. Each wrapper submits its sub-query to the underlying data source, presents the returned result as an XML document conforming to the common data model, and passes it back to the query processor for integration.

The integrated data may have quality problems: two or more records may represent exactly the same item, the schema may be poorly designed, or data may have been entered incorrectly. The data must therefore be cleaned after the query processor integrates it. Based on an analysis of the sliding-window algorithm and the priority sorted-neighborhood algorithm, this thesis designs an improved window algorithm that improves the efficiency of data cleaning.

Finally, the thesis introduces the graduate education integrated information system and XWDIS, a prototype heterogeneous data source system designed and implemented with the above methods and technologies. Using XSQL as its query language, XWDIS can query many different heterogeneous data sources transparently. This work has substantial reference value for both theory and future practical applications.
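
For readers unfamiliar with window-based duplicate detection, the following is a minimal sketch of the basic sorted-neighborhood idea that the abstract names as a starting point. It is not the improved window algorithm of the thesis, whose details the abstract does not give. The function name, the record fields, the fixed window size, and the similarity threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def sorted_neighborhood(records, key, window=3, threshold=0.9):
    """Return index pairs of likely duplicates (basic sorted-neighborhood pass).

    `key` maps a record to a sort-key string; `window` and `threshold`
    are illustrative defaults, not values taken from the thesis.
    """
    # Sort record indices by key so that similar records become neighbors.
    order = sorted(range(len(records)), key=lambda i: key(records[i]))
    pairs = set()
    for pos, i in enumerate(order):
        # Compare each record only with the next (window - 1) records.
        for j in order[pos + 1 : pos + window]:
            similarity = SequenceMatcher(None, key(records[i]), key(records[j])).ratio()
            if similarity >= threshold:
                pairs.add((min(i, j), max(i, j)))
    return pairs


if __name__ == "__main__":
    # Hypothetical records, as might arrive from two different data sources.
    sample = [
        {"name": "Li Wei", "dept": "Systems Engineering"},
        {"name": "Zhang Min", "dept": "Physics"},
        {"name": "Li Wei", "dept": "Systems Enginering"},  # typo variant
    ]
    make_key = lambda r: f"{r['name']}|{r['dept']}".lower()
    print(sorted_neighborhood(sample, make_key))
```

Sorting first and comparing only within a small window keeps the number of comparisons roughly proportional to the number of records times the window size, rather than comparing every pair. The trade-off is that duplicates whose keys sort far apart are missed, which is the kind of limitation that refinements of the window scheme aim to reduce.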
Keywords/Search Tags: heterogeneous data source, extensible markup language, query process, data cleaning