
Research On Classification And Integration In Fuzzy XML Data

Posted on: 2018-02-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z Zhao
Full Text: PDF
GTID: 1368330572965499
Subject: Computer application technology
Abstract/Summary:
In recent years, with the development and application of database technology, overcoming the barriers between different data sources and querying freely across them has become an important goal of database management. Data classification and integration aims to classify and integrate data from different sources automatically, so that the data can be managed efficiently, queried in a unified way, and mined for valuable information in the query results. At the same time, with the rapid development of Internet technology, XML has become the de facto standard for Web data representation and exchange, and it now plays a vital role in a wide range of Web applications. However, in real-world applications, many fields contain a large amount of fuzzy information, which can be represented as fuzzy XML data on the Web. With the emergence of large amounts of fuzzy XML data, classifying and integrating such data has become a key requirement for Web data management. There are a number of practical studies on the classification and integration of XML data from multiple data sources, but few on classification and integration methods for fuzzy XML data. Because of the fuzzy characteristics of fuzzy XML data, the existing methods cannot be applied to it directly. The classification and integration of fuzzy XML data is therefore an important research topic in the field of Web data management.

This thesis investigates the classification and integration of fuzzy XML data. Three issues are discussed: the similarity of fuzzy XML data, the classification of fuzzy XML data, and the integration of fuzzy XML data. The main contributions of the thesis are summarized as follows.

(1) To deal with the similarity problem of fuzzy XML data, a similarity comparison algorithm is proposed to compare the similarity of fuzzy DTDs, the similarity of fuzzy XML documents, and the similarity between a fuzzy DTD and a fuzzy XML document. The correctness and effectiveness of the proposed method are verified by experiments. To compare fuzzy DTDs effectively, a new tree representation model named FXDT is proposed to capture the fuzzy feature information in a fuzzy DTD. The corresponding node feature similarities are calculated from these node characteristics, and an approach based on the extreme learning machine is proposed to combine the individual similarities into a node semantic similarity. To compare fuzzy XML documents, a new fuzzy XML document representation model named FXTM is proposed to capture the structural and semantic information of a fuzzy XML document, together with a method for calculating the various node similarities. The structural similarity of fuzzy XML documents is then compared using an improved method based on tree edit distance. To compare a fuzzy DTD with a fuzzy XML document, transformation rules for the disjunctive constraints and cardinality constraints of the fuzzy DTD model are proposed, and a fuzzy DTD tree transformation algorithm based on these rules converts a fuzzy DTD into the corresponding tree.

(2) To deal with the classification problem of fuzzy XML data, a KPCA-KELM fuzzy XML document classification framework is proposed, based on the kernel extreme learning machine. First, after fuzzy XML documents are represented with the fuzzy XML document tree model FXTM, an improved vector space model named MS-VSM is proposed to represent the semantics and structure of a fuzzy XML document, which improves the ability to express fuzzy XML semantic and structural information. Second, a KPCA-KELM algorithm is proposed, in which fuzzy XML documents are classified by the kernel extreme learning machine after feature extraction with KPCA. Finally, the performance of the KPCA-KELM algorithm is compared with that of existing algorithms in the literature. The experimental results show that the KPCA-KELM algorithm has clear performance advantages.

(3) To deal with the integration problem of fuzzy XML data, an integration framework for fuzzy XML data is proposed. First, the fuzzy XML document tree model FXTM is used to represent fuzzy XML documents. Second, based on tree edit distance, an efficient algorithm is proposed to identify the structural and semantic similarity between fuzzy XML documents represented in the FXTM model, which realizes entity recognition at the subtree level. Third, an integration strategy is proposed to integrate fuzzy XML documents from different data sources, and the corresponding integration algorithm, named IFXD, is given. The effectiveness and efficiency of the proposed algorithm are demonstrated by a series of experiments.
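Both the document similarity in (1) and the subtree-level entity recognition in (3) build on tree edit distance. The abstract does not describe the thesis's improved algorithm or the FXTM model in detail, so the following is only a minimal sketch of the plain ordered-tree edit distance recurrence. The node layout `(label, poss, children)`, the helper names, the unit insert/delete costs, and the relabel cost that uses the gap between membership degrees are all assumptions made for illustration, not the author's definitions.

```python
from functools import lru_cache

# Hypothetical node layout (not taken from the thesis):
# (label, poss, children), where `poss` is a membership degree in [0, 1]
# and `children` is a tuple of nodes in document order.
def node(label, poss=1.0, *children):
    return (label, float(poss), tuple(children))

def size(forest):
    """Number of nodes in a forest (a tuple of trees)."""
    return sum(1 + size(tree[2]) for tree in forest)

def relabel_cost(a, b):
    # Assumed cost model: 1 if the labels differ, otherwise the gap
    # between the two membership degrees.
    return 1.0 if a[0] != b[0] else abs(a[1] - b[1])

@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Edit distance between two ordered forests, with unit insert/delete cost."""
    if not f and not g:
        return 0.0
    if not f:
        return float(size(g))  # insert every node of g
    if not g:
        return float(size(f))  # delete every node of f
    v, w = f[-1], g[-1]        # rightmost trees of each forest
    # Delete the root of v: its children take its place in the forest.
    delete = forest_distance(f[:-1] + v[2], g) + 1.0
    # Insert the root of w: symmetric to deletion.
    insert = forest_distance(f, g[:-1] + w[2]) + 1.0
    # Map root v to root w, then solve children and remaining forests separately.
    match = (forest_distance(v[2], w[2])
             + forest_distance(f[:-1], g[:-1])
             + relabel_cost(v, w))
    return min(delete, insert, match)

def tree_distance(t1, t2):
    return forest_distance((t1,), (t2,))

def tree_similarity(t1, t2):
    # Assumed normalization: distance divided by the combined node count.
    return 1.0 - tree_distance(t1, t2) / (size((t1,)) + size((t2,)))

doc_a = node("book", 1.0, node("title", 1.0), node("author", 0.8))
doc_b = node("book", 1.0, node("title", 1.0))
print(tree_distance(doc_a, doc_b), tree_similarity(doc_a, doc_b))
```

The memoized recursion is easy to check by hand on small trees but is not efficient for large documents; a production implementation would more likely use a dynamic-programming formulation such as Zhang-Shasha.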
Keywords/Search Tags: Fuzzy XML, Classification, Integration, Structural, Semantic, Similarity, Heterogeneous