
Research On Key Technologies Of Building Web Dataspace

Posted on: 2017-03-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z T Liu
Full Text: PDF
GTID: 1318330536968175
Subject: Computer application technology
Abstract/Summary:
With the rapid development of Internet technologies, the Web has become a vast repository of information and an integral part of daily life, e-government, e-commerce, and other areas. Many methods already exist for making effective use of Web data resources, such as Web data mining, Deep Web data integration, and building the Semantic Web by reconstructing the Web with semantic technologies. The dataspace is an abstraction that summarizes the new characteristics of data and new data management technologies; in essence, it addresses the problem of data integration. A dataspace is the collection of all data owned by an organization or a person. A Web dataspace system is a Web data integration system that is sustainable and improvable and that gradually achieves semantic integration of the Web through Pay-as-you-go construction, so that users can access all the Web data they are interested in. The goal of building a Web dataspace system is to give individuals and organizations a way to use Web data effectively. This dissertation studies the system framework, data model, data source selection, schema integration, and access control. The specific research results are as follows:

Firstly, a system framework and construction principles for the Web dataspace are proposed. Building on the dataspace concept of data integration and the characteristics of Web data, key characteristics of the Web dataspace and principles for building a Web dataspace system are presented: a Web dataspace system should be able to manage all data on the Web, follow the Pay-as-you-go principle of data management, make full use of existing technologies, use collaborative approaches, and offer convenient data sharing methods. A system framework for the Web dataspace is proposed and the functions of each part are described in detail. Finally, problems in realizing the evolution of the Web dataspace through explicit and implicit feedback are discussed in detail.

Secondly, a data model for the Web dataspace system is designed on the basis of the RDF model. A data view is first designed with RDF, which helps unify all data on the Web. When the data view is used, the data model must be instantiated for specific types of data, including Web pages, files and folders, the Deep Web, data streams, and linked data. The model can represent all data on the Web, providing unified representation of and access to unstructured, semi-structured, and structured data within a single model.

Thirdly, a Web data source selection method is proposed that jointly considers the relevance of the query to a data source, the data quality of the source, and the cost of querying it. The method has two stages: the first stage selects data sources based on query relevance and data quality; the second stage chooses, from the sources selected in the first stage, those that satisfy the user's request for k query records under a minimum-query-cost model. In the algorithm for the minimum-query-cost model, a maximum entropy model is used to calculate the overlap among data sources.

Fourthly, a schema integration and mapping method for the Web dataspace is proposed. First, an integration framework for Web dataspace schemas is given. Second, intermediate schemas are integrated automatically using a combination of methods based on the K-medoids algorithm. Finally, a method for mapping and matching user queries against the Top-k schemas is put forward, which improves the accuracy and recall of user queries. A Pay-as-you-go approach to improving query precision is also provided.

Finally, a fine-grained, context-aware access control model is proposed. Based on the XACML model, it controls access to data in the global linked-data dataspace using current semantic technologies: OWL describes the operations and the environment, and SWRL performs the semantic reasoning. Defining semantic scopes in the model greatly reduces the number of access rules that must be defined. With these semantic technologies, access control can take the data context of the dataspace into account.
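The two-stage data source selection described in the abstract can be illustrated with a minimal Python sketch. This is not the dissertation's algorithm: the Source fields, the weighted score with alpha and threshold, the greedy cost-per-record rule, and the single uniform overlap fraction are all assumptions made for illustration. In particular, the abstract says overlap is calculated with a maximum entropy model, which this sketch replaces with a fixed, caller-supplied number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """Hypothetical description of one Web data source."""
    name: str
    relevance: float  # relevance of the query to this source, in [0, 1]
    quality: float    # data quality of the source, in [0, 1]
    cost: float       # cost of querying the source (arbitrary units)
    records: int      # expected number of matching records


def stage_one(sources, alpha=0.5, threshold=0.5):
    """Stage 1: keep sources whose weighted relevance/quality score is high enough."""
    return [
        s for s in sources
        if alpha * s.relevance + (1 - alpha) * s.quality >= threshold
    ]


def stage_two(candidates, k, overlap=0.0):
    """Stage 2: greedily add the cheapest source per new record until k is reached.

    `overlap` is an assumed uniform fraction (0 <= overlap < 1) of a source's
    records that duplicate records already covered; it stands in for a real
    overlap estimate.
    """
    remaining = [s for s in candidates if s.records > 0]
    chosen, covered = [], 0.0
    while covered < k and remaining:
        # The first source contributes all its records; later ones are discounted.
        discount = (1 - overlap) if chosen else 1.0
        best = min(remaining, key=lambda s: s.cost / (s.records * discount))
        covered += best.records * discount
        chosen.append(best)
        remaining.remove(best)
    return chosen, covered
```

A caller would run stage_one over all known sources and pass the result to stage_two together with the requested k. A greedy rule is used here only because it is short; it does not guarantee the true minimum total cost.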
Keywords/Search Tags: Web data integration, dataspace, Pay-as-you-go, architecture, data model, Web data source selection, schema integration, semantic access control