
Design And Realization Of Multi-Service Vertical Searching Engine Framework In Enterprises

Posted on: 2019-09-01
Degree: Master
Type: Thesis
Country: China
Candidate: F D Zheng
GTID: 2428330590992420
Subject: Software engineering
Abstract/Summary:
Vertical search engines have become an indispensable technology in enterprises, helping companies provide search services within a specific domain. As they grow, more and more enterprises move from a single line of business to several. An online travel company, for example, may offer scenic spot tickets, hotels, tour routes, travel guides and so on. Product features differ from one line to the next, so the search requirements differ as well. It has therefore become urgent for enterprises to build vertical search engines for different business lines efficiently.

This thesis presents a vertical search engine framework based on Lucene. It can extract data, build indexes, search by keyword and by Number type fields, and compute statistics, which lowers the barrier to building a search engine. With this framework, developers who do not understand search engine theory in depth can still build one. The main work and contributions are as follows:

1. A configuration design method is proposed. To build a flexible and efficient vertical search engine for a business line, the index data source, search fields, segmentation dictionary and correction dictionary are all made configurable.
2. An optimized range search method for numeric fields is designed and implemented. A forward table is built for each Number field. When a query combines a keyword condition with a Number type range condition, the document numbers are first fetched by keyword, the corresponding values are then read from the forward table, and the final result is obtained by filtering on those values. This method outperforms the inverted table approach.
3. Field statistics are designed and implemented. Lucene itself does not provide statistics on fields. In this thesis they are implemented on top of FieldCache and an array data structure, so that statistics over search results at the scale of millions of documents complete within milliseconds.
4. An automatic query expression optimization scheme is proposed. When callers search through expressions, execution efficiency varies with how each developer writes the expression. By analyzing the expression the caller passes in, the framework identifies inefficient expressions and rewrites them.
5. A method for sharing search logic among multi-business engines is designed and implemented. The search method is redesigned around a router mechanism, so that the search logic of vertical search engines for different business lines stays independent yet can be shared.
6. The search algorithm, the field statistics and the query expression optimization are tested on two real cases: building a hotel search engine and building a hotel comment search engine. Numeric range search responds five to ten times faster than the original Lucene methods. The average response time of field statistics on an index of 1.5 million entries is about 25 milliseconds. After query expression optimization, the response time drops from 3 seconds to 13 milliseconds. These results are very encouraging.
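The forward-table range search and the array-based field statistics described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation and does not use Lucene; the toy documents, the field names (`price`, `city`) and the function names are all assumptions made for illustration. The keyword is resolved through an inverted table, each hit's numeric value is read from a per-document array (the forward table), and statistics are counted over the surviving document numbers.

```python
from array import array
from collections import Counter

# Hypothetical toy corpus: the list position is the document number.
DOCS = [
    ("sea view hotel", 120, "Sanya"),
    ("budget hotel near station", 45, "Shanghai"),
    ("luxury sea resort", 300, "Sanya"),
    ("city hotel", 80, "Shanghai"),
]

def build_inverted(docs):
    """Inverted table: term -> document numbers in ascending order."""
    inverted = {}
    for doc_id, (text, _price, _city) in enumerate(docs):
        for term in set(text.split()):
            inverted.setdefault(term, []).append(doc_id)
    return inverted

def build_forward(docs):
    """Forward table: document number -> numeric field value, as a flat array."""
    return array("l", (price for _text, price, _city in docs))

def search(inverted, forward, keyword, low, high):
    """Fetch hits by keyword, then filter by range using the forward table."""
    return [d for d in inverted.get(keyword, []) if low <= forward[d] <= high]

def field_counts(doc_ids, field_values):
    """Field statistics: count hits per field value via a per-document array."""
    return Counter(field_values[d] for d in doc_ids)

inverted = build_inverted(DOCS)
price_by_doc = build_forward(DOCS)
city_by_doc = [city for _text, _price, city in DOCS]

hits = search(inverted, price_by_doc, "hotel", 50, 150)
print(hits)
print(field_counts(hits, city_by_doc))
```

The range check costs one array lookup per keyword hit, so its cost follows the number of keyword matches rather than the number of distinct numeric values in the index. That is one plausible reading of why the abstract reports the forward table as faster than an inverted-table range query.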
Keywords/Search Tags:Search Engine, Numeric Field Search, Statistics, Inverted Table, Forward Table, Lucene