Performance Evaluation And Prediction Analysis Of Information Retrieval Systems

Posted on: 2020-06-26
Degree: Master
Type: Thesis
Country: China
Candidate: Z M Zhang
Full Text: PDF
GTID: 2428330596996920
Subject: Computer Science and Technology
Abstract/Summary:
Since the 1990s, Web search engine technology has evolved continuously, and it has become an important gateway to information for billions of people around the world. In information retrieval, evaluating retrieval results is an important task, because the usability of a search system is largely determined by how well the retrieved documents satisfy users' information needs. This thesis makes three contributions to this problem.

(1) Traditional search engines consider only relevance, but in recent years researchers have found that diversity is also an important factor in the quality of search results, especially for queries with broad or vague semantics. Many algorithms for search result diversification have been proposed in academia. To find out how far this technology has been applied in major international commercial search engines, we chose three representative engines, Google, Bing and Ask, and evaluated their effectiveness in supporting result diversification. Compared with the best information retrieval systems in academia, the three commercial Web search engines perform excellently and are as good as the top academic search systems.

(2) Accurate effectiveness evaluation of search results requires a lot of manpower, time and other resources, so fast and lightweight methods are very useful. If a search engine can automatically estimate the performance of its results before returning them to the user, it can use that estimate to improve its usefulness. Our analysis identifies some shortcomings of existing performance prediction algorithms, and we propose two query performance prediction methods that support diversified results: Intent-Aware Prediction (IAP) and Intent-Covered Prediction (ICP). The experimental results show that the proposed methods predict performance better and are more suitable for diversified results than traditional query performance prediction methods.

(3) Data fusion is a feasible method for building effective information retrieval systems. This thesis proposes a random forest based method for predicting fusion performance. Compared with regression analysis, a typical performance prediction approach, the proposed method is more effective by a clear margin.
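For readers unfamiliar with the data fusion mentioned in the abstract, the following is a minimal sketch of one well-known fusion rule, CombSUM with min-max score normalization. The abstract does not say which fusion methods the thesis uses, so this is an illustrative stand-in rather than the author's method; the function names, document ids and scores are all hypothetical.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Rescale one system's {doc_id: score} map to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as equally strong.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_sum(runs):
    """Fuse several result lists by summing their normalized scores."""
    fused = defaultdict(float)
    for run in runs:
        for doc, s in min_max_normalize(run).items():
            fused[doc] += s
    # Highest fused score first; ties are broken by document id.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical scores from two component systems for a single query.
run_a = {"d1": 10.0, "d2": 6.0, "d3": 2.0}
run_b = {"d2": 0.9, "d3": 0.5, "d4": 0.1}
ranking = comb_sum([run_a, run_b])
```

In this toy input, a document retrieved by both systems accumulates score from each, which is the intuition behind fusion. A fusion performance predictor of the kind the abstract describes would estimate how effective such a fused ranking is likely to be.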
Keywords/Search Tags: information retrieval system, performance evaluation, performance prediction, search results diversification, data fusion