
The Construction of an Inverted Index Based on MongoDB

Posted on: 2015-03-13  Degree: Master  Type: Thesis
Country: China  Candidate: X K Liu  Full Text: PDF
GTID: 2268330428485630  Subject: Software engineering
Abstract/Summary:
With the rise of Web 2.0 and cloud computing, more and more enterprises are choosing NoSQL as their application framework. As a typical representative of non-relational databases, MongoDB is increasingly used to handle massive data storage within that framework, so the number of applications built on MongoDB keeps growing. Full-text index retrieval is one of the most basic and typical of these applications, and constructing the inverted table is one of the core techniques of building a full-text index.

This paper explores a method for building an inverted table for Chinese full-text retrieval on top of MongoDB storage. The experiment uses a reduced version of the sougo30M classified text corpus as its data and simply removes punctuation and special symbols from the more than 1,900 documents in one of its categories. Chinese word segmentation is then performed on the documents, and the inverted table is constructed directly from the segmentation results, without applying stop-word elimination, term normalization, merging, keyword reduction and extraction, or other thesaurus-building steps. The full-text search is implemented on MongoDB storage, and MongoDB's MapReduce module is used to carry out a simple partition-and-merge strategy for constructing the inverted table. The anticipated target was achieved; the experimental process is recorded and the experimental results are analyzed.
Keywords/Search Tags: non-relational, MongoDB, inverted table, MapReduce
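
The partition-and-merge idea described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: MongoDB's MapReduce takes JavaScript map and reduce functions, whereas the example below uses plain Python to stand in for the same two phases. The function names (`map_document`, `reduce_postings`, `build_inverted_index`) and the sample documents are hypothetical, and the documents are assumed to be already segmented into words.

```python
from collections import defaultdict


def map_document(doc_id, tokens):
    """Map phase (hypothetical helper): emit one (term, doc_id) pair per token."""
    for token in tokens:
        yield token, doc_id


def reduce_postings(doc_ids):
    """Reduce phase (hypothetical helper): merge the doc ids emitted for one
    term into a sorted posting list of (doc_id, term_frequency) pairs."""
    counts = defaultdict(int)
    for doc_id in doc_ids:
        counts[doc_id] += 1
    return sorted(counts.items())


def build_inverted_index(segmented_docs):
    """Group emitted pairs by term (the partition step), then reduce each group
    (the merge step). No stop-word removal or normalization is applied."""
    grouped = defaultdict(list)
    for doc_id, tokens in segmented_docs.items():
        for term, emitted_id in map_document(doc_id, tokens):
            grouped[term].append(emitted_id)
    return {term: reduce_postings(ids) for term, ids in grouped.items()}


# Assumed sample input: doc id -> list of already-segmented words.
docs = {
    1: ["全文", "检索", "索引"],
    2: ["倒排", "索引", "索引"],
}
index = build_inverted_index(docs)
```

Each entry of `index` maps a term to the documents that contain it together with the term frequency in each document, which is the basic shape of an inverted table; in a MongoDB setting the reduced output would typically be written to a collection keyed by term.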