Study On The Evaluation Of Performance Of Search Engines' Features

The evaluation of search engines is one of the popular topics in the field of information retrieval. The growth of Internet information and advances in retrieval technology have accelerated the development of search engines. Besides simple search, search engines now offer many advanced search features. These features aim to help users find the information they need, but how well they actually perform remains unclear. This study takes the relevance and the ranking quality of retrieval results as two key indexes for evaluating the main search features of popular search engines. The findings can help users formulate an appropriate search strategy to improve search effectiveness, and shed light on how search engines respond to different types of search features in terms of retrieval effectiveness.

In the first chapter, the author discusses the state of research on search engines and their evaluation, and summarizes its content, methods, characteristics, deficiencies, and development trends. At present, relevance is the core concern in the evaluation of search engines. Experimentation, observation, investigation, data analysis, and review are the main research methods. Research on search engine evaluation is characterized by dependence, dynamism, diversity, an emphasis on user participation, and so on. Few studies compare the search effectiveness of different search features or the ranking quality of their search results. With the growth of multimedia information, the evaluation of multimedia retrieval features will become one of the active research issues.

In the second chapter, the author points out that relevance is the basic index, from which the index of ranking quality of search results is derived. Relevance can be judged from the form and content of the retrieved web pages.
Ranking quality is determined by the sequence and stability of search results. Based on these two key indexes, the author sets up an evaluation system. Following these criteria, five search engines and five search features are selected. There are three English search engines (Google, Yahoo! and Bing) and two Chinese search engines (Baidu and Google China). The five search features are title search, phrase search (exact search), PDF file format restriction search, URL search and regular search; within each search engine, the results of a regular search serve as the baseline for comparison and analysis.

In the third chapter, the overarching research question is whether the use of advanced search features enhances retrieval effectiveness in a search engine. Based on this question, several null hypotheses are developed. The author selects a set of indicators (full text, abstract, title, validity of the web page, user's burden and length of the web page) to evaluate the relevance of search results, using the Analytic Hierarchy Process (AHP). A revised relevance measure is used to evaluate the effectiveness of a search feature. One-way ANOVA is applied to test whether there are significant differences in effectiveness among the search features. If there are, the Tukey method is used to identify the source of the differences. Regression analysis is applied to assess the sequence and stability of the ranking of search results.

In the fourth chapter, one-way ANOVA is used to evaluate the effectiveness of the five search features of the five search engines, based on 50,000 data points. The findings show that there are significant differences between search features, so the Tukey method is used to identify the source of these differences.
Among these search features, PDF file format restriction search achieves the best retrieval effectiveness. Yahoo! achieves the best retrieval effectiveness among the three English search engines for all search features. Google China achieves better retrieval effectiveness than Baidu for title search, regular search, PDF file format restriction search and URL search, but there is no significant difference between them for phrase search.

In the fifth chapter, regression analysis is applied to analyze the ranking quality of the five search features of the five search engines. Regular search achieves the best ranking quality among the search features of the English search engines, and URL search has the worst ranking quality in all search engines. PDF file format restriction search achieves the best ranking quality among Baidu's five search features, while title search achieves the best ranking quality in Google China. Overall, the ranking quality of the search features of the Chinese search engines is lower than that of the English search engines.

In the sixth chapter, the author notes that, during data collection and analysis, the Chinese search engines achieved worse results in both retrieval effectiveness and ranking quality. The author puts forward some optimization strategies to improve the retrieval effectiveness and ranking quality of Chinese search engines. Attention should be paid to the quality of web pages from the moment they are created, and open access should be promoted in China to improve the quality of Chinese web resources. Search engines could develop more powerful features for filtering out search results in which users have no interest, and should be cautious in applying bid ranking.
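
The one-way ANOVA used in the third and fourth chapters can be illustrated with a minimal sketch. It is not the author's implementation: the function name `one_way_anova` and the relevance scores are hypothetical, and only the F statistic and its degrees of freedom are computed, leaving the significance test and the Tukey follow-up to a statistics package.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples.

    Each sample holds the relevance scores observed for one search
    feature. The scores used below are illustrative, not study data.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical relevance scores for three search features.
title_search = [3, 4, 5]
phrase_search = [2, 3, 4]
regular_search = [1, 2, 3]

f_stat, df_b, df_w = one_way_anova([title_search, phrase_search, regular_search])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F value relative to the F distribution with the given degrees of freedom would lead to rejecting the null hypothesis that all search features are equally effective; only then would a pairwise method such as Tukey's be applied.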
Keywords/Search Tags: search engine, information retrieval, evaluation, relevance, ranking, optimization
