
Research On The Relevance Of Data In DataSpace

Posted on: 2014-02-18
Degree: Master
Type: Thesis
Country: China
Candidate: L Liu
Full Text: PDF
GTID: 2268330422452535
Subject: Computer application technology
Abstract/Summary:
With the development of Web technology and the popularity of laptops, people are surrounded by massive amounts of data even as they enjoy the convenience that science and technology bring. They often cannot judge the authenticity or usefulness of that data, or whether it has any connection to them. DataSpace is a new approach that aims to manage the large volume of heterogeneous data related to a user, and to provide search and query functions that are easy to operate.

Unlike a single traditional structured relational database, a DataSpace is multi-source and heterogeneous, which makes research on it both more important and more difficult. Because the volume of data is large, many data objects are related to one another. Studying the relationships among heterogeneous data is therefore significant, and it also helps users manage and access their data. Mining the relevance of heterogeneous data means finding the connections between different data objects; it not only gives users a content-rich search service but also lays the foundation for advanced integration in the DataSpace. Relevance should therefore be considered in terms of both data structure and data content. For a personal DataSpace, the influence of user activity on heterogeneous data should also be considered.

In this thesis, we study the relevance of data on the basis of DataSpace. The primary work includes:

(1) We propose a novel algorithm that provides easy, precise, and rapid access for mining heterogeneous data. The inherent associations of data sources and the relevance of user activity are both considered when computing the relevant set of data objects. Experimental results on real data sets demonstrate the promise and effectiveness of the proposed algorithm.

(2) In multi-source heterogeneous data, many entity references may denote the same entity, or the same name may point to different entities in the real world. To address this, we propose a two-stage unified Entity Resolution model and algorithms.

(3) We propose a data query and indexing system that provides keyword search through a simple operation, returns content-rich results, and supports data browsing.

Because of the complexity, disorder, and randomness of the DataSpace, data relevance becomes complicated, and research on data relevance is a key issue for DataSpace. The research on data relevance presented here therefore provides a foundation for further data processing.
Keywords/Search Tags: Data Space, Heterogeneous Data, Entity Resolution, Search