The Research And Implementation Of Enterprise Search Engine Based On Lucene

Posted on: 2019-07-03
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Sha
GTID: 2428330566474301
Subject: Electronic and communication engineering
Abstract/Summary:
As computer technology matures and science and technology advance, production and daily life have become ever more closely tied to the Internet. Enterprises also benefit from this digitization. Digital, information-based office systems largely determine how efficiently an enterprise operates and are an important support for its survival and development. For large and medium-sized enterprises in particular, failing to make full use of digitized information and turn it into usable resources greatly increases production costs and reduces production efficiency. Conversely, when massive, fragmented digital information is put to proper use, it becomes a data carrier and a source of power for the new era. How, then, can the information resources scattered across every corner of an enterprise be mined effectively and accurately? How can employees be freed from the complexity of searching for information, so that office workers can quickly and accurately find the information relevant to their concerns?
The answer is search engine technology. More and more enterprises now include search engine technology in their requirements for information platforms, and enterprise search platforms have become a focus of competition among Internet technology companies. To better understand the difficulties facing current enterprise information retrieval technology, this thesis develops a full-text search engine system suited to enterprises. After extensive literature and field research, we chose the open-source Lucene library as the core of the design, drew on established algorithmic ideas from the retrieval field, and carried out secondary development and extension of Lucene. First, the current state of enterprise search engine technology in China and abroad is investigated and analyzed, and the background knowledge needed to develop a search engine is reviewed, including inverted indexing, search engine technology, text parsing and information retrieval. Second, the principles behind how Lucene works are examined. To address the shortcomings of Lucene's basic retrieval model, improvements are proposed in two areas, the document ranking algorithm and the index structure, resulting in an improved Lucene ranking formula and an optimized Lucene index structure. Third, we design the modules of the enterprise search engine, build the full-text search platform used in this research, and present the experimental data and results.
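To make the two ideas named above more concrete, the following is a minimal sketch of an inverted index with a simple TF-IDF ranking. It is an illustration only: it is written in Python rather than against Lucene's Java API, the function names (`tokenize`, `build_index`, `search`), the whitespace tokenizer, the sample documents and the particular weighting (square root of term frequency times a logarithmic inverse document frequency) are all assumptions chosen for brevity, and it is neither Lucene's actual scoring formula nor the improved formula proposed in the thesis.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for a real text analyzer."""
    return text.lower().split()


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index


def search(index, num_docs, query):
    """Return doc ids ranked by a simple TF-IDF sum over the query terms."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in any document
        # Rarer terms get a larger weight (illustrative IDF, an assumption).
        idf = 1.0 + math.log(num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += math.sqrt(tf) * idf
    # Highest score first; ties broken by doc id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical sample documents, not taken from the thesis.
docs = {
    1: "quarterly sales report",
    2: "sales meeting notes",
    3: "holiday schedule",
}
index = build_index(docs)
results = search(index, len(docs), "sales report")
```

In this sketch, document 1 matches both query terms and document 2 only one, so document 1 ranks first. A production engine would add a language-aware analyzer (for example Chinese word segmentation), compressed on-disk postings and length normalization, which is where changes to the ranking formula and index structure would apply.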
Keywords/Search Tags: Search Engine, Lucene, Chinese Word Segmentation, Text Parsing, Document Sorting Algorithm