The graph is an important data structure in computer science and technology. With the continuous development of computer science and technology, there are more and more pictures to show the logic expression data, such as social networks, protein interaction networks, and molecular chemical formula. Moreover, the amount of data of these plans continue increase, for example, up to tens of millions of registered users on social networking sites or even hundreds of millions, resulting in a large-scale Internet-based social network data. How to effectively manage these data has gradually become a popular research in the field of information technology. The operation of the graph data, including:(1) how to establish effective storage mechanism and indexing strategies,(2) how to query efficiently in the mass graph database.Sub-graph query in the graph database is one of the basic operations of the graph data management. Sub-graph query is defined as:given a query graph Q and a data graph G, all the sub-graphs will be found in the data graph, which are matched with query graph. The majority of the current sub-graph query algorithms are sub-graph isomorphism algorithm; sub-graph isomorphism is a typical NP-complete problem. The query efficiency is very low. In order to improve the efficiency of the sub-graph query algorithm, this article presented a way that based on the vertex encoding data to create a data graph index. Besides, this article presented a sub-graph query algorithm which is based on this data graph index. First, this article is to prune the index structure, and then based on the information of query vertex to search the matching vertex. Second, this article is search the sub-graph in the vertexes matching which are matched with query graph. Because the index structure based on vertex encoding is creates based on data graph which is a unit. The index structure is not easy to change and the index of table memory space is larger. In order to reduce the index space, this article improves a sub-graph query algorithm based on the code block. The basic idea of this algorithm is:the data graph is divided into number of small partition, then search for the sub-graph among this small partition. There is a possible that produces a match out of bounds, so we store the crossing edge. And during the query process, if cross-vertex of the edge at both ends of the query vertices match with the traverse cross-edge index, then we choose the ends of the cross side of the vertex as a starting point for depth-first traversal and the depth is the depth of the query graph. A new block is formatted and then put into the data map index. We call the matching sub-graph matching query algorithm based on a sub-graph of the vertex encoding and then output the query graph.Extensive experimental studies demonstrate that the method of creating index in the text of the algorithm is relatively simpler, with the higher utilization of space. It is suitable for mass storage of the data graph and sub-graph query huge amounts of data graph. |