
The Research And Application Of Big Data Retrieval Organization Based On Neo4j

Posted on: 2016-09-27 | Degree: Master | Type: Thesis
Country: China | Candidate: P Lu | Full Text: PDF
GTID: 2308330503977805 | Subject: Software engineering
Abstract/Summary:
As data volumes grow, the scalability bottleneck of traditional relational database management systems (RDBMS) leaves practitioners unsure how to proceed: how to store and retrieve big data, and how to analyze the deep relationships within it, have become problems that cannot be ignored. As a schema-free data storage model, a graph database is well suited to large volumes of complex, interconnected, and loosely structured data, and it can apply graph theory to a variety of complex computations and to mining the relationships between data. Through research on the graph database Neo4j, this thesis designs and implements a scheme for retrieving and mining data based on Neo4j. The main work is as follows.

Firstly, the graph database Neo4j is adopted to address the scalability bottleneck of traditional relational databases. This research studies the performance of Neo4j itself, organizes all the data in graph form, uses vertices and edges to represent entities and the relationships between them, respectively, and finally stores multi-source heterogeneous data through the analysis of vertices and edges.

Secondly, based on the Neo4j data organization, a full-text retrieval scheme for Neo4j is proposed to address the relatively low precision of top-K results. The scheme first extracts the relevant properties, builds a full-text index after Chinese word segmentation, introduces the vector space model together with Lucene, and finally computes similarity to produce the search results.

Finally, a Neo4j-based graph data system comprising a data organization module, a data retrieval module, and a data analysis module is implemented. It mines the relationships between data through smart recommendation, influence analysis, clustering analysis, and path analysis.
In addition, functional and performance tests are carried out on the Neo4j-based retrieval and mining system. The test results show that, compared with a traditional relational database, this scheme performs well and offers an alternative solution for organizing and analyzing large-scale data.
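
The retrieval step described in the abstract (segment the text, build an index, apply the vector space model, rank by similarity, return the top K) can be illustrated with a small sketch. This is not the thesis's implementation and it does not reproduce Lucene's actual scoring: it assumes plain TF-IDF weights with cosine similarity, uses whitespace splitting as a stand-in for Chinese word segmentation, and all function names and sample documents are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace splitting stands in for Chinese word segmentation.
    return text.lower().split()


def build_index(docs):
    """Build TF-IDF vectors from a dict mapping node id -> property text."""
    tokenized = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n = len(tokenized)
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))  # count each term once per document
    # Smoothed inverse document frequency (one common variant).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = {
        doc_id: {t: tf * idf[t] for t, tf in Counter(tokens).items()}
        for doc_id, tokens in tokenized.items()
    }
    return vectors, idf


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, vectors, idf, k=3):
    # Terms missing from the index are ignored in the query vector.
    q_vec = {t: tf * idf[t] for t, tf in Counter(tokenize(query)).items() if t in idf}
    scored = [(doc_id, cosine(q_vec, vec)) for doc_id, vec in vectors.items()]
    scored = [(d, s) for d, s in scored if s > 0.0]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))  # best score first
    return scored[:k]


# Hypothetical node properties keyed by node id.
docs = {
    "n1": "graph database stores nodes and relationships",
    "n2": "relational database stores rows in tables",
    "n3": "full text index over node properties",
}
vectors, idf = build_index(docs)
results = top_k("graph database", vectors, idf, k=2)
```

Here `results` is a list of `(node id, score)` pairs ordered by decreasing similarity. Documents that share no term with the query are dropped rather than returned with a score of zero.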
Keywords/Search Tags: Neo4j, Lucene, Cypher, Big Data Analysis, Association Relationship