Font Size: a A A

Max-Sum Internet Search Result Diversity Problem And Its Greedy Strategy

Posted on:2017-09-08Degree:MasterType:Thesis
Country:ChinaCandidate:X R LiFull Text:PDF
GTID:2348330485488160Subject:Management Science and Engineering
Abstract/Summary:PDF Full Text Request
With the development of net-technology, the number of Internet users is increasing, internet information resources are growing rapidly. the search engine, which is an important retrieval tool to obtain effective information, is favored by more and more people. In actual retrieval process, user experienced convenient brought by search engines, but plagued by the flood of information. To locate the effective information, users spend a lot of time to screening Information, selecting or refining search contents in the huge collection of information. on the one hand, Users have lazy characters in terms of "keywords length" and "browse length" when they search information what they need, but always ask the system locate the exact information in the first few pages through given a few keywords. On the other hand, Keyword itself has a wide range of semantics, and the number of results the search system presented is limited. As a result, the system usually can not locate the exact user's search requirement. A large number of search results not only waste a lot of time, but also reduce the efficiency of searching.This paper considered the problem of rendering web search results from the point of improving the efficiency of search engines and retrieval satisfaction of users, which include the relevance of keywords and diversity between the individual results. At the same time, we analyzed and proved the approximation ratio of the greedy algorithm.The main research work and conclusions of this paper are as follows:First, our article considered the correlation between result subsets and keywords, the diversification between the results. On this basis, we established mathematical model about this problem and came to conclusions about the algorithm result set of the problem by solving and analysising the model. After compare with the optimal results set, we demonstrated the effectiveness of the algorithm. At the same time, we analyzed our greedy strategy has better approximation performance.Second, users find the desired information through filtering the search results set, the process of screening information is browsing information. Overall correlation between result sets and the key words is proportional to the number of browsing, but the marginal satisfaction of each message to the users is a decreasing trend. In response to this situation, we use non-negative monotone submodular function metrics the relevance of the problem, built mathematical model, designed greedy algorithm, discussed web search results on the condition of monotone submodular function, and showed that approximate greedy algorithm has excellent performance ratio.Third, information changes very fast in the Internet world, as for the same keywords, the needs of the same user may be different at different time points. In order to meet the changing information needs, our article will examine the network information that has dynamic characteristic through adjusting elements of the static result sets.Fourth, in order to verify the effectiveness of the algorithm, we made simulation experiments according to the algorithm, which showed that the greedy algorithm has a good approximation property.The research and conclusion of this paper is beneficial to enrich the existing research on the diversity of search results, lays a theoretical foundation for the further exploration of the relevant issues. At the same time, it has certain theoretical guidance value to the actual search results diversity problem.
Keywords/Search Tags:diversity, keywords search, greedy, strategy, approximation ratio
PDF Full Text Request
Related items