A Prefetching Method For Dataspace

Posted on: 2015-06-27 | Degree: Master | Type: Thesis
Country: China | Candidate: D Liu | Full Text: PDF
GTID: 2348330518970440 | Subject: Computer software and theory
Abstract/Summary:
Data management today must handle data that is massive, heterogeneous, distributed, and shared, among other new characteristics. Traditional data management systems can no longer meet users' needs. Michael Franklin et al. therefore proposed a new approach to data management: the dataspace. Providing efficient query and search services to users is one of the major challenges a dataspace faces, and network latency and query processing delay are important factors affecting search and query performance. Prefetching is an effective solution to this delay problem. Its main idea is to analyze the user's data access characteristics and load the data the user is most likely to access into a buffer before the user requests it, thereby reducing access delay. This thesis therefore introduces prefetching to improve the search and query efficiency of dataspaces.

However, a prefetching method for dataspaces must take the following issues into account.
(1) Heterogeneous data sources. When a dataspace user submits a query, the results returned by the system may contain a variety of data types, so a dataspace prefetching method cannot target only a single type of data.
(2) Comprehensiveness of data prefetching. The method should consider the prefetching strategy not only after the user submits a query but also before.
(3) Accurate identification of query intention. Each query submitted by a dataspace user may carry several different intentions. How accurately the system identifies the user's current query intention, and how quickly it returns the data the user most needs, has a great impact on query efficiency and user satisfaction.

Considering the above problems, the thesis studies dataspace prefetching methods from two aspects.
(1) When the user has not yet submitted any query, we propose an initial prefetching method based on dynamic popularity. First, the dataspace log is used to cluster users with similar interests, and the dynamic popularity of the query words in each cluster is calculated. Then, combining the TF-IDF algorithm with the extended inverted index of the dataspace, the dynamic popularity of every entity object is calculated. Finally, the cluster the user belongs to is determined, and the top-ranked entities in that cluster are chosen as the initial prefetching objects.
(2) When the user has submitted a query, we propose a prefetching method based on the user's query intention. By identifying the user's current query intention, the method predicts the data the user is most likely to access. The main steps are intention feature extraction, search log clustering, intention extraction, identification of the user's query intention, and data prefetching.

The experimental results show that both proposed prefetching methods significantly improve dataspace query efficiency. Furthermore, combining the two methods yields substantially better query performance than either method alone.
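
The initial prefetching method in aspect (1) can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything concrete here is an assumption made for illustration: dynamic popularity is modeled as an exponentially time-decayed count of query words, the extended inverted index is reduced to a plain mapping from word to per-entity term frequency, user clusters are taken as already computed, and all function and parameter names (`word_popularity`, `entity_popularity`, `initial_prefetch`, `half_life`) are hypothetical.

```python
import math
from collections import defaultdict


def word_popularity(log, now, half_life=7.0):
    """Time-decayed frequency of each query word within one user cluster.

    log: iterable of (timestamp_in_days, query_word) pairs for the cluster.
    The exponential half-life decay is an assumption of this sketch, not
    the definition of dynamic popularity used in the thesis.
    """
    pop = defaultdict(float)
    for ts, word in log:
        pop[word] += 0.5 ** ((now - ts) / half_life)
    return dict(pop)


def entity_popularity(pop, inverted_index, n_entities):
    """Score each entity as the sum of popularity(word) * TF-IDF(word, entity).

    inverted_index: {word: {entity_id: term_frequency}}, a simplified
    stand-in for the extended inverted index mentioned in the abstract.
    """
    scores = defaultdict(float)
    for word, weight in pop.items():
        postings = inverted_index.get(word, {})
        if not postings:
            continue  # word does not occur in any indexed entity
        idf = math.log(n_entities / len(postings))
        for entity, tf in postings.items():
            scores[entity] += weight * tf * idf
    return dict(scores)


def initial_prefetch(user, user_cluster, cluster_logs, inverted_index,
                     n_entities, now, k=2):
    """Return the k top-ranked entities for the cluster the user belongs to."""
    cluster = user_cluster[user]
    pop = word_popularity(cluster_logs[cluster], now)
    scores = entity_popularity(pop, inverted_index, n_entities)
    # Sort by descending score; break ties by entity id for determinism.
    return sorted(scores, key=lambda e: (-scores[e], e))[:k]
```

Calling `initial_prefetch` with a user id, a user-to-cluster mapping, the per-cluster query logs, and the index returns the entity ids that would be loaded into the buffer before the user issues a first query. Any clustering step that assigns users to clusters could be plugged in ahead of this sketch.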
Keywords/Search Tags: Dataspace, Prefetching, Dynamic popularity, Query Intention