Research On Optimization Of Indexing Algorithm In Full-text Retrieval

Posted on: 2015-11-09    Degree: Master    Type: Thesis
Country: China    Candidate: J P He    Full Text: PDF
GTID: 2298330422471526    Subject: Computer software and theory
Abstract/Summary:
Since the beginning of the 21st century, with the development of information technology, a wide variety of enterprise data has come to be stored digitally. How to quickly retrieve the required information from this mass of information is a problem worthy of study. Currently, the vast majority of information retrieval systems are based on the full-text retrieval model. The inverted index is a core technology in full-text retrieval: its structural design, storage method, and dynamic update algorithm directly affect the performance of a full-text retrieval system. Optimizing full-text retrieval systems is therefore of great significance.

First, this thesis analyzes the architecture, major components, and key technologies of full-text search: document storage, content segmentation techniques, retrieval models, and index organization. It then studies the principles and methods of index creation, including index storage, index update, index deletion, and index query.

Based on that analysis, this thesis designs a content segmentation technique that improves accuracy and efficiency by introducing redundancy and weights. It analyzes a B-tree inverted index made up of four parts: a main index, a segment index, a delete index, and a dictionary library. It designs a new index update method that speeds up index updates. It designs a new index entry with two parts, word information and location information; the location information is kept in a stack, which reduces storage space. It designs a fill update method that resolves the B-tree overflow problem and also finds a balance point between time efficiency and space efficiency. It designs an incremental encoder to reduce the storage size of the index. Finally, we implement these methods in a program and, by comparing it with Lucene, verify their efficiency and feasibility.
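
As a rough illustration of two ideas named in the abstract, an index entry holding both word and location information and incremental encoding to shrink the stored index, here is a minimal Python sketch. It is not the thesis's implementation: the abstract does not specify the encoder, so gap (delta) encoding followed by variable-byte coding is assumed here as one common scheme, tokenization is plain whitespace splitting, and every function name is hypothetical.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map each term to {doc_id: [positions]} (whitespace tokenization assumed)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def to_gaps(sorted_ids):
    """Replace each ID with its difference from the previous one."""
    gaps, prev = [], 0
    for value in sorted_ids:
        gaps.append(value - prev)
        prev = value
    return gaps


def from_gaps(gaps):
    """Invert to_gaps by taking a running sum."""
    ids, total = [], 0
    for gap in gaps:
        total += gap
        ids.append(total)
    return ids


def vbyte_encode(numbers):
    """Variable-byte code: 7 payload bits per byte, high bit marks the last byte."""
    out = bytearray()
    for n in numbers:
        while n >= 128:
            out.append(n & 0x7F)
            n >>= 7
        out.append(n | 0x80)
    return bytes(out)


def vbyte_decode(data):
    """Invert vbyte_encode."""
    numbers, n, shift = [], 0, 0
    for byte in data:
        if byte & 0x80:  # terminating byte of the current number
            numbers.append(n | ((byte & 0x7F) << shift))
            n, shift = 0, 0
        else:
            n |= byte << shift
            shift += 7
    return numbers


# Hypothetical sample documents keyed by document ID.
docs = {
    1: "full text retrieval",
    5: "inverted index for full text",
    1000: "text index",
}
index = build_positional_index(docs)
doc_ids = sorted(index["text"])           # documents containing "text"
encoded = vbyte_encode(to_gaps(doc_ids))  # compact on-disk form
decoded = from_gaps(vbyte_decode(encoded))
```

Small gaps between sorted document IDs fit in a single byte each, which is where the space saving comes from; the same two steps can be applied to the position lists inside each entry.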
Keywords/Search Tags: Full-text retrieval, Inverted index, Optimize