
Research On Metadata Organization Approach For Image Storage Systems Towards Content-based Semantic Similarity Query

Posted on: 2020-05-25
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Liu
GTID: 2428330590958324
Subject: Computer system architecture
Abstract/Summary:
With the rapid growth of data volume and the increasing complexity of content, storage systems, which serve as the backend for massive heterogeneous data, have become more difficult to manage. Query operations, an essential way to manage and analyze data, face unprecedented challenges. Many query operations in storage systems rely on metadata to produce results. The directory tree structure and simple metadata widely used in current storage systems cannot meet the requirements of content-based semantic queries; that is, users cannot effectively query files by the similarity of their content. This limits the functionality of storage systems as well as data management and analysis. To address this, this paper proposes a new semantic metadata organization approach named SwiftGraph, which supports fast and relatively accurate semantic queries for storage systems and fits large-scale data scenarios and applications. SwiftGraph first uses a deep learning-based hash algorithm to extract binary, fixed-length semantic hash codes from the files or objects in the storage system as semantic metadata. It then organizes the semantic metadata in a graph-based metadata structure, which groups the metadata of semantically similar files into adjacent areas of the graph. SwiftGraph supports two semantic query operations: semantic range queries and semantic top-k queries. SwiftGraph is implemented on the open source cloud storage system OpenStack Swift; moreover, as middleware, SwiftGraph can be applied to any file or storage system as an independent branch without affecting the metadata organization and functionality of the local system.

According to test results on three image datasets, SwiftGraph not only extracts accurate metadata for representing content semantics but also significantly reduces the time cost of semantic queries compared to tree structures. Furthermore, although the semantic metadata in SwiftGraph adds time and space overhead to the storage system, experiments show that this overhead has little impact on system performance. On large-scale datasets, growth in data volume has little effect on the query time of SwiftGraph, which indicates that SwiftGraph adapts and scales well as data grows. Overall, SwiftGraph not only provides an approach to support content-based semantic queries in image storage systems through semantic metadata but also offers an effective and feasible solution for intelligent services and big data analysis in storage systems.
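The two query operations named in the abstract can be illustrated with a minimal Python sketch. It assumes that semantic hash codes are stored as integers and compared by Hamming distance, and it uses a plain linear scan as a baseline rather than the graph structure that SwiftGraph builds. The function names, the 8-bit codes, and the file names are all hypothetical and do not come from the thesis.

```python
import heapq


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hash codes differ."""
    return bin(a ^ b).count("1")


def range_query(index: dict, query: int, radius: int) -> list:
    """Names of all files whose code is within `radius` bits of `query`."""
    return sorted(
        name for name, code in index.items() if hamming(code, query) <= radius
    )


def top_k_query(index: dict, query: int, k: int) -> list:
    """Names of the k files closest to `query` (ties broken by name)."""
    nearest = heapq.nsmallest(
        k, index.items(), key=lambda item: (hamming(item[1], query), item[0])
    )
    return [name for name, _ in nearest]


# Hypothetical 8-bit semantic hash codes; real codes would be longer
# and produced by a learned hashing model.
index = {
    "cat1.jpg": 0b10110010,
    "cat2.jpg": 0b10110011,
    "dog1.jpg": 0b10010110,
    "car1.jpg": 0b01001101,
}

query = 0b10110010
within_one_bit = range_query(index, query, radius=1)
three_nearest = top_k_query(index, query, k=3)
```

A linear scan like this compares the query against every stored code, so its cost grows with the number of files. Grouping similar codes into adjacent areas of a graph, as the abstract describes, is meant to avoid visiting most of the index.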
Keywords/Search Tags:Metadata Management, Storage Systems, Deep Learning Hash, Semantic Hamming Graph, OpenStack Swift