
Research On Diversification Method For Search Result Based On Sub-intent Recognition

Posted on: 2013-06-18  Degree: Master  Type: Thesis
Country: China  Candidate: J B Gao
GTID: 2268330392467979  Subject: Computer Science and Technology
Abstract/Summary:
The rapid development of the Internet has caused the amount of information to grow exponentially. Information retrieval technology emerged to help users find the information they need quickly and accurately among these vast resources. Today the search engine is an indispensable application of information retrieval in people's daily work and life. It treats the query a user submits as a representation of the user's intent, then retrieves the related documents and ranks them by similarity score. However, the same query may represent different intents for different users, for two main reasons: the query may be ambiguous, or it may cover multiple sub-intents. Ranking by similarity alone therefore cannot meet the needs of some users; the search results should reflect the diversity of user needs. To that end, this thesis studies the search result diversification problem and proposes a diversification method based on sub-intent recognition. The method takes into account both the relationship between users' intents and documents and the diversity among the returned documents.

The proposed method builds on the traditional explicit and implicit diversification methods. Like explicit methods, it explicitly covers the different sub-intents hidden in the original query; like implicit methods, it reduces redundancy in the returned document set. The study addresses three main questions: how to identify the different sub-intents, how to predict the relative weights of the sub-intents, and how to use the weighted sub-intents to diversify the search results.

In detail, this dissertation carries out the following research:

1. Mine the sub-intents of the original query explicitly. We take the related queries and suggested queries provided by commercial search engines as candidate sub-queries of the original query, then divide them into different kinds of sub-intents by manual labeling. Comparison with three other methods shows that this approach performs better.
2. Predict the weights of the different kinds of sub-intents. We extract 32 features related to sub-intents by mining six months of browser user logs, then predict the weights of the sub-intents with an SVM ranking model.
3. Analyze the diversification problem, propose a diversification method based on sub-intent recognition, and describe the general process of the algorithm. The method is shown to be effective by comparison with the upper-bound performance of traditional explicit and implicit diversification methods and with a variant explicit diversification method. We also analyze how the performance of the algorithm depends on the number of sub-intents.

The sub-intent mining method performs well in experiments on the data collection of the sub-intent mining task in NTCIR-9, which lays the foundation for the other work related to sub-intents. Comparative experiments against other diversification methods on the data collection of the document ranking task in NTCIR-9 show that the diversification method based on sub-intent recognition produces diversified results that better meet the needs of different users.
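The abstract does not give the scoring formula, so the following is only a minimal sketch of how weighted sub-intent coverage (the explicit side) and redundancy reduction (the implicit side) might be combined in a single greedy re-ranking. It is an xQuAD-style coverage term plus an MMR-style similarity penalty, swapped in for illustration, and should not be read as the thesis's algorithm. The function names `diversify` and `jaccard`, the trade-off parameters `lam` and `mu`, and the use of term-set Jaccard similarity are all assumptions.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Term-set overlap, used here as a stand-in redundancy measure (assumption)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def diversify(relevance: Dict[str, float],
              coverage: Dict[str, Dict[str, float]],
              intent_weights: Dict[str, float],
              terms: Dict[str, Set[str]],
              k: int,
              lam: float = 0.5,
              mu: float = 0.3) -> List[str]:
    """Greedy re-ranking sketch (hypothetical, not the thesis's exact method).

    relevance[d]      : similarity score of document d to the original query
    coverage[d][i]    : how well document d satisfies sub-intent i, in [0, 1]
    intent_weights[i] : predicted weight of sub-intent i
    terms[d]          : term set of document d, for the redundancy penalty
    """
    selected: List[str] = []
    remaining = set(relevance)
    # Probability-like mass of each sub-intent not yet satisfied by the selection.
    uncovered = {i: 1.0 for i in intent_weights}

    while remaining and len(selected) < k:
        def score(d: str) -> float:
            # Explicit part: reward coverage of sub-intents still unsatisfied.
            gain = sum(w * coverage.get(d, {}).get(i, 0.0) * uncovered[i]
                       for i, w in intent_weights.items())
            # Implicit part: penalize similarity to documents already chosen.
            penalty = max((jaccard(terms[d], terms[s]) for s in selected),
                          default=0.0)
            return (1 - lam) * relevance[d] + lam * gain - mu * penalty

        # Sort candidates so ties are broken deterministically by document id.
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
        for i in uncovered:
            uncovered[i] *= 1.0 - coverage.get(best, {}).get(i, 0.0)

    return selected
```

With an ambiguous query such as "jaguar", two highly relevant but near-duplicate documents about the car would not both be ranked ahead of a less relevant document about the animal: once the first car document is selected, the car sub-intent counts as covered and the second car document is additionally penalized for overlapping with it.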
Keywords/Search Tags: Information Retrieval, Sub-intent, Weight Prediction, Diversification