Research and Application of Result Re-sorting in an Enterprise Search Engine

Posted on: 2015-06-05
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Cheng
GTID: 2298330452450782
Subject: Computer application technology
Abstract/Summary:
Lucene can be used to build an enterprise search engine that returns documents sorted by their relevance to the query. This ordering does not fully meet users' needs, however, because access authority is a central feature of a company: users belong to different systems and hold different roles, so they may want to see different results for the same query terms. Documents that have been clicked many times may deserve a top position when a user searches again, and a user's query terms may be guided by the queries of other users in the same role. To meet these needs, we add several rating factors to the search process, so that the returned results reflect the adjacency of the query words, the roles the user belongs to, the click record, and the system to which the clicked documents belong.

When ranking the search results, it is necessary to consider the position of the query words in the document and to analyze the distribution of those words; other factors need to be considered as well. Once users have used the search engine and click data has accumulated in the database, we can analyze regularities in user behavior offline. From the click data and the query words entered by the user, we determine which documents are relevant to that user, under the idealized assumption that clicked documents are truly relevant. We use the ListNet algorithm to obtain the factor weights, calculating the weights for each query from the new ordering of the search results and considering only the top 100 documents.

When a user searches, the rating factor scores are combined with the weight values to calculate the score of each document, and heap sort is used to re-sort the search results before they are shown. Users' searches and clicks indicate that these scoring factors are useful: the results are more relevant and more in line with users' needs. A classification filter can make the search results more specific. When the user searches the system, the higher the term-position score, the more relevant the result.
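The re-sorting step described above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the factor names, the weight values, and the linear combination are assumptions made for the example, and the `k=100` default simply mirrors the top-100 cutoff mentioned in the abstract.

```python
import heapq

# Hypothetical rating factors and weights (assumed for illustration only;
# in the thesis the weights are learned offline from click data).
WEIGHTS = {
    "lucene": 1.0,         # base relevance score from the engine
    "term_position": 0.5,  # position/adjacency of the query words
    "role": 0.3,           # match with the user's role
    "user_click": 0.2,     # clicks by this user
    "system_click": 0.1,   # clicks on documents from the same system
}

def combined_score(factors, weights=WEIGHTS):
    """Weighted sum of the rating factors; missing factors count as 0."""
    return sum(w * factors.get(name, 0.0) for name, w in weights.items())

def rerank(candidates, weights=WEIGHTS, k=100):
    """Re-sort the first k candidates by combined score using a heap.

    candidates: list of (doc_id, factors) in the engine's original order.
    Returns doc ids, best first; ties keep the original order.
    """
    head = candidates[:k]
    # heapq is a min-heap, so scores are negated; the original rank
    # breaks ties so equal scores stay in engine order.
    heap = [(-combined_score(f, weights), rank, doc_id)
            for rank, (doc_id, f) in enumerate(head)]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]

if __name__ == "__main__":
    results = [
        ("a", {"lucene": 1.0}),
        ("b", {"lucene": 0.8, "user_click": 2.0}),
        ("c", {"lucene": 0.9}),
    ]
    print(rerank(results))
```

Restricting the heap to the first `k` results keeps the cost of re-sorting bounded regardless of how many documents the query matches.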

After the system has been in use for some time, the weights of the user-click and system-click factors can increase. Many other factors could also be added to the search engine, such as recognizing personal and place names and discovering new words, to make the results more accurate.
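As a rough illustration of how ListNet can learn factor weights from click data, the sketch below implements the standard top-one ListNet objective for a linear scoring model on a single query. The feature vectors, the click-derived labels, and the learning rate are assumptions made for the example; the abstract does not specify how the thesis sets them.

```python
import math

def softmax(values):
    """Top-one probabilities over a list of scores (numerically stable)."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def scores_for(w, features):
    """Linear score of each document: dot product of weights and factors."""
    return [sum(wj * xj for wj, xj in zip(w, x)) for x in features]

def listnet_loss(w, features, labels):
    """Cross entropy between target and model top-one distributions."""
    p_model = softmax(scores_for(w, features))
    p_target = softmax(labels)
    return -sum(pt * math.log(pm) for pt, pm in zip(p_target, p_model))

def listnet_step(w, features, labels, lr=0.1):
    """One gradient-descent step on a single query's result list.

    features: one factor vector per document (e.g. the top 100 results).
    labels:   relevance derived from clicks (assumed: 1.0 clicked, 0.0 not).
    """
    p_model = softmax(scores_for(w, features))
    p_target = softmax(labels)
    # d(loss)/d(score_i) = p_model_i - p_target_i, chained through x_i.
    grad = [sum((pm - pt) * x[j]
                for pm, pt, x in zip(p_model, p_target, features))
            for j in range(len(w))]
    return [wj - lr * gj for wj, gj in zip(w, grad)]

if __name__ == "__main__":
    # Two hypothetical factors per document; only the first was clicked.
    feats = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    clicks = [1.0, 0.0, 0.0]
    w = [0.0, 0.0]
    for _ in range(50):
        w = listnet_step(w, feats, clicks)
    print(w, listnet_loss(w, feats, clicks))
```

In this toy run the weight of the factor that distinguishes the clicked document grows, which is the same effect the abstract reports for the click factors as usage data accumulates.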
Keywords/Search Tags: reSort, lucene, termPosition, user behavior, ListNet