Research On Data Model And Index Technology In The Cloud

Posted on: 2014-02-11
Degree: Master
Type: Thesis
Country: China
Candidate: C J Sun
Full Text: PDF
GTID: 2248330395484135
Subject: Computer application technology
Abstract/Summary:
With the rapid development of computer and Internet technologies, the amount of data has expanded rapidly. Traditional data models and index technologies can no longer satisfy the requirements of massive data management, which poses a major challenge for traditional data management. As a new computing platform, cloud computing has attracted wide attention from academia and the business community, and research on data models and index technology based on the characteristics and requirements of the cloud computing environment has become an important field. The main contributions of this thesis are as follows:

(1) The basic concepts, characteristics and development of cloud computing are summarized, and the existing data model and index technologies for the cloud environment are surveyed and analyzed.

(2) The typical key-value data model in the cloud environment cannot effectively support varied user queries, such as range queries and non-primary-key queries. Therefore, a new data model, Key-MultiValue, is proposed in this thesis. Key-MultiValue supports non-primary-key queries by partitioning the value and dynamically changing the partitioned attributes according to the query hotspots. In addition, the P-Ring structure is adopted to partition the data, which effectively supports range queries. Moreover, a node performance state parameter is introduced into P-Ring to address its shortcoming of not taking into account the differences in performance among the storage nodes. Finally, the experiments and result analysis show that the data model effectively supports range queries and non-primary-key queries, and that it also improves the query success rate and query throughput.

(3) Cloud computing systems cannot effectively support similarity search due to a lack of efficient index structures, and as dimensionality increases, the existing tree-like index structures suffer from the "curse of dimensionality". In this thesis, a novel VF-CAN indexing scheme is proposed. VF-CAN integrates a CAN-based routing protocol and an improved VA-File index. There are two index levels in this scheme: a global index and a local index. The local index, VAK-File, is built for the data in each storage node. VAK-File is the result of k-means clustering of the VA-File approximation vectors according to their degree of proximity. In the global index, storage nodes are organized into a CAN overlay network, and in order to reduce the cost of calculation, only the clustering information of the local index is published to the entire overlay network through the CAN interface. The experimental results show that VF-CAN reduces the index storage space and effectively improves query performance.
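
The local-index idea in contribution (3) can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes coordinates lie in [0, 1], a uniform grid of 2**bits cells per dimension for the VA-File-style approximation, plain Lloyd's k-means, and a hypothetical cluster summary of (centroid, radius, count) as the "clustering information" that would be published. The function names `va_approximation`, `kmeans` and `cluster_summaries` are illustrative only, and the CAN routing layer is omitted.

```python
import random


def va_approximation(vector, bits=2, lo=0.0, hi=1.0):
    """Quantize each coordinate into one of 2**bits equal-width cells."""
    cells = 1 << bits
    width = (hi - lo) / cells
    # Clamp so that out-of-range values and x == hi map to a valid cell.
    return tuple(min(cells - 1, max(0, int((x - lo) / width))) for x in vector)


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means over approximation vectors."""
    rng = random.Random(seed)
    centroids = [tuple(float(x) for x in p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: keep the old centroid if a cluster is empty.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels


def cluster_summaries(points, centroids, labels):
    """Hypothetical per-cluster summary a node could publish to the global index."""
    summaries = []
    for c, centroid in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            continue
        radius = max(squared_distance(p, centroid) for p in members) ** 0.5
        summaries.append({"centroid": centroid, "radius": radius, "count": len(members)})
    return summaries


if __name__ == "__main__":
    data = [(0.05, 0.10), (0.10, 0.05), (0.90, 0.95), (0.95, 0.90)]
    approximations = [va_approximation(v, bits=3) for v in data]
    centroids, labels = kmeans(approximations, k=2)
    for summary in cluster_summaries(approximations, centroids, labels):
        print(summary)
```

Publishing only a compact summary per cluster, rather than every approximation vector, is one way to read the abstract's remark that only clustering information is sent to the overlay network to reduce cost. The exact summary format used in the thesis is not stated here.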
Keywords/Search Tags: Cloud Computing, Data Model, Key-value, Index Structure, K-means Clustering