
Research On Ambiguity Of User Queries

Posted on: 2014-10-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z C Zheng
Full Text: PDF
GTID: 1268330422460354
Subject: Computer Science and Technology
Abstract/Summary:
Query analysis is an integral part of an information retrieval system, because the system cannot satisfy a user unless it understands what the user needs. Queries may be ambiguous, which makes query analysis a difficult problem. Different queries encounter different ambiguities, such as the ambiguity of entity names in named entity queries, the ambiguity of user intent in keyword queries, and the ambiguity of implicit factors (time or location) in queries. This dissertation focuses on analyzing these different kinds of ambiguity. Its main contributions are:

(1) For named entity disambiguation, firstly, we propose a semi-supervised method to address the lack of labeled data. We use the structured information in entity databases to design a discriminative model for disambiguation. The experimental results show that the structured information improves the performance of disambiguation models, and that the discriminative model performs better than the generative model in entity disambiguation. Secondly, assuming that plenty of labeled data is available, we treat disambiguation as a ranking problem and adopt a learning-to-rank algorithm to rank the candidate entities for an ambiguous entity name. The experimental results demonstrate that the proposed method achieves the best accuracy in named entity disambiguation. Thirdly, after identifying the entity a query refers to, we summarize the important information about the entity from multiple Wikipedia articles. With the summary, the user can get an overview of the entity quickly. Compared with previous methods, which extract a summary from a single Wikipedia article, the proposed method uses multiple Wikipedia articles to better measure the importance of different concepts for the entity. The experimental results also show that using multiple articles improves the quality of the entity summary.

(2) To avoid the ambiguity of user intent caused by a keyword query, we propose a user inquiry intent (UII) model to convert keywords to questions (K2Q). With the recommended questions, the user can express their intent more clearly. We automatically generate question templates from the questions on community question answering (cQA) sites, then use these templates to generate candidate questions for rare keywords. In the UII model, we represent a question template as a slot sequence that generates the word sequence of a candidate question. We then rank the candidate questions according to their generative probability. The experimental results show that the UII model performs well on the K2Q task.

(3) Taking the time factor as an example, we study the ambiguity of implicit factors in queries. Firstly, we model the time sensitivity of a query by considering both the content words and the context in the query. We compute the contextual time sensitivities of different words by solving an optimization problem, then use these contextual time sensitivities to detect time-sensitive queries. The experimental results show that the proposed method performs well in time-sensitive query detection. Secondly, we classify time-sensitive queries into time scales according to their different temporal requirements for the results. Based on the time scales, we design several temporal features to improve question ranking for time-sensitive queries. The experimental results demonstrate the effectiveness of the proposed temporal features.
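The keywords-to-questions (K2Q) idea in contribution (2) can be illustrated with a toy sketch. This is not the UII model from the dissertation, whose details the abstract does not give: here each template has a single slot marked `<X>`, and the "generative probability" is simply the template's relative frequency. The template strings, the counts, and the names `TEMPLATE_COUNTS`, `template_probabilities`, and `keywords_to_questions` are all hypothetical.

```python
from collections import Counter

# Hypothetical toy data: question templates as they might be mined from
# cQA questions, with "<X>" marking the slot that a keyword fills.
# The counts are invented for illustration only.
TEMPLATE_COUNTS = Counter({
    "how do i install <X>": 6,
    "what is <X>": 3,
    "where can i buy <X>": 1,
})


def template_probabilities(counts):
    """Maximum-likelihood estimate of P(template) from raw counts."""
    total = sum(counts.values())
    return {template: n / total for template, n in counts.items()}


def keywords_to_questions(keywords, counts=TEMPLATE_COUNTS, top_k=3):
    """Fill each template's slot with the keywords and rank the
    resulting candidate questions by template probability (a stand-in
    for the generative probability described in the abstract)."""
    probs = template_probabilities(counts)
    candidates = [
        (template.replace("<X>", keywords), p)
        for template, p in probs.items()
    ]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates[:top_k]


if __name__ == "__main__":
    for question, score in keywords_to_questions("python"):
        print(f"{score:.2f}  {question}")
```

A fuller model would presumably also score how well the keywords fit each slot, so that the ranking depends on the query rather than on template frequency alone.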
Keywords/Search Tags: Ambiguity of Query, Entity Disambiguation, Question Recommendation, Time Sensitivity