A Study Of Search Engine Click Model

Posted on: 2017-10-10
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C Wang
Full Text: PDF
GTID: 1318330536458715
Subject: Computer Science and Technology
Abstract/Summary:
Modern search engines record user interactions and use them to improve search quality. In particular, users' click-through data have been successfully used to improve click-through rate (CTR), Web search ranking, query recommendation and suggestion, and so on. Although click-through logs provide implicit feedback on users' click preferences, it is difficult to derive accurate absolute relevance judgments from them because of click noise and behavior biases. Previous studies showed that users' clicking behavior is biased by many factors, such as “position” (a user's attention decreases from top to bottom) and “trust” (Web site reputation affects a user's judgment). To address these problems, researchers have proposed a number of click models to describe users' practical browsing behavior and to obtain an unbiased estimate of result relevance.

While existing click models have achieved much success in click and relevance prediction for Web search and sponsored search, their limitations have become increasingly serious. As search engine result pages (SERPs) grow more complex, with an increasing number of them federated from multiple specialized search engines (called verticals, such as Image or Video), search user behavior also becomes harder to describe with simple behavior assumptions (e.g., the cascade assumption). According to our experiments, these models are not able to describe users' actual behavior in the modern search environment. We should therefore analyze users' new search behaviors further and improve click models to describe them. In this dissertation, we first analyze the user's cognitive process in the modern search environment. We then propose three new click models that improve on existing ones in the following aspects: modeling heterogeneous SERPs, modeling users' non-sequential search behavior, and combining user behavior information with search result content information.

For the study of the search engine user's cognitive process, we design an experimental search engine to collect both users' feedback on their examinations and eye-tracking and click-through data. To our surprise, a large proportion (45.8%) of the results fixated on by users are not recognized as being “read”. Looking into the tracking data, we found that before a user actually “reads” a result, there is often a “skimming” step in which the user quickly looks at the result without reading it. We thus propose a two-stage examination model consisting of a first “from skimming to reading” stage (Stage 1) and a second “from reading to clicking” stage (Stage 2). We found that the biases considered in many studies (e.g., position bias, domain bias, attractiveness bias) act in different ways in Stage 1 and Stage 2, which suggests that users make judgments according to different signals in different stages. We also show that two-stage examination behavior can be predicted from mouse movement behavior, which can be collected at large scale. Relevance estimation with the two-stage examination model also outperforms that with a single-stage examination model. This study shows that users' examination of search results is a complex cognitive process that needs to be investigated in greater depth, and that this may have a significant impact on Web search.

For the study of heterogeneous SERPs, we collect a large-scale log data set that contains behavior information on both vertical and ordinary results. We also perform eye-tracking analysis to study users' real-world examination behavior. According to these analyses, we find that different result appearances may cause different behavior biases both for vertical results (local effect) and for the whole result list (global effect). These biases include an examination bias for vertical results (especially those with multimedia components), a trust bias for result lists containing vertical results, and a higher probability of result revisitation for vertical results. Based on these findings, we construct a novel click model that considers these biases in addition to position bias to describe interaction with SERPs containing verticals. Experimental results show that the new Vertical-aware Click Model (VCM) is better than existing models at interpreting user click behavior on federated searches in terms of both log-likelihood and perplexity.

For the study of users' non-sequential search behavior, we investigate how to properly incorporate non-sequential behavior into click models. We first carry out a laboratory eye-tracking study to analyze users' non-sequential examination behavior and then propose a novel click model, the Partially Sequential Click Model (PSCM), that captures the practical behavior of users. We compare PSCM with a number of existing click models using two real-world search engine logs. Experimental results show that PSCM outperforms other click models in terms of both predicting click behavior (perplexity) and estimating result relevance (NDCG and a user preference test). We also publicly release the implementation of PSCM and related datasets for possible future comparison studies.

For the study of combining user behavior information with search result content information, we propose a novel click model framework based on a convolutional neural network architecture to make it more suitable for the ever-changing, complex search environment. Compared with traditional probabilistic graphical models, our proposed framework not only uses user behavior information as input signals, but also adopts result content information and takes the relationships among different search results (the context information of results) into consideration. It processes content, context, and user behavior information in different neural network layers so that high-level user behavior features are not “buried” by the large number of content and context features. The proposed model also adopts parameters from existing click models as constraints for variables in the hidden layer, which guarantees the effective estimation of the examination probability and the user-perceived relevance parameters. The framework can be adopted to reconstruct most existing click models, and experimental results based on large-scale practical user behavior data are promising. State-of-the-art click models such as UBM and PSCM gain significant improvement in terms of both click perplexity and NDCG after reconstruction with the framework.

Parts of our research (VCM and PSCM) have been implemented in a Chinese commercial search engine and show promising improvement in this practical system.
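
The two evaluation metrics named in the abstract, click perplexity and NDCG, can be illustrated with a short sketch. This is not the dissertation's implementation. It assumes the definition of per-rank click perplexity commonly used in the click-model literature (2 raised to the negative mean log2-likelihood of the observed clicks at each rank, averaged over ranks) and the NDCG variant with gain 2^rel - 1 and a log2 discount. The function names and the toy sessions are hypothetical.

```python
import math


def click_perplexity(clicks, probs, eps=1e-9):
    """Return (mean perplexity over ranks, list of per-rank perplexities).

    clicks: list of sessions, each a list of 0/1 click indicators per rank.
    probs:  same shape, the model's predicted click probability per rank.
    A perfect model scores 1.0; a coin-flip model scores 2.0.
    """
    n_sessions = len(clicks)
    n_ranks = len(clicks[0])
    per_rank = []
    for i in range(n_ranks):
        log_sum = 0.0
        for c_row, p_row in zip(clicks, probs):
            # Clamp to avoid log2(0) on over-confident predictions.
            q = min(max(p_row[i], eps), 1.0 - eps)
            log_sum += math.log2(q) if c_row[i] else math.log2(1.0 - q)
        per_rank.append(2.0 ** (-log_sum / n_sessions))
    return sum(per_rank) / n_ranks, per_rank


def dcg(rels, k):
    """Discounted cumulative gain of the first k relevance labels."""
    return sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(rels[:k]))


def ndcg(rels, k):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(rels, reverse=True), k)
    return dcg(rels, k) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical toy data: two sessions, three ranked results each.
    sessions = [[1, 0, 0], [0, 1, 0]]
    predicted = [[0.6, 0.3, 0.1], [0.6, 0.3, 0.1]]
    overall, by_rank = click_perplexity(sessions, predicted)
    print("perplexity:", overall, by_rank)
    print("NDCG@3:", ndcg([1, 3, 2], 3))
```

Lower perplexity means better click prediction, while higher NDCG means the ranking induced by the estimated relevance agrees better with the relevance labels, which is why the abstract reports both.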
Keywords/Search Tags: Click Model, User Behavior Analysis, Eye-tracking, Vertical Search, Convolutional Neural Network