
Semantic Query Processing And Query Expansion In Web Search

Posted on: 2015-01-14  Degree: Master  Type: Thesis
Country: China  Candidate: G He  Full Text: PDF
GTID: 2268330431461233  Subject: Systems analysis and integration
Abstract/Summary:
As the demands people place on information retrieval grow, search engines built on inverted indexes and word matching can no longer satisfy them, because their results contain both errors and omissions. Part of the reason is that most user queries are short and cannot express the query intent precisely. Query expansion and query recommendation have therefore become objects of study in information retrieval. Whether it relies on structured knowledge, an external corpus, or some kind of context, query expansion based on any single one of these data sources has a corresponding defect. This paper proposes an expansion method based on a random walk model and analyzes automatic query expansion for meta search.

In addition, most studies in information retrieval may have overlooked a problem: users sometimes do not know which query to submit to obtain the information they want, and in many cases the search engine does not understand the user's query intent. This paper presents a semantic-logic-oriented query processing method based on HDWiki knowledge, which takes full advantage of the characteristics of triple-structured data to process queries that carry logical semantics.

The innovations of this paper include:

1) A semantic-logic-oriented query processing method based on HDWiki knowledge. The structured knowledge of HDWiki is divided into three categories: instances, relationships, and terms. The triple relationships among these three types of knowledge are combined with semantic logical symbols to help users construct clearer queries and to handle logic-like queries that general search engines cannot handle. With an implemented prototype search system, the user study and experiments show that the DOM-based extraction of HDWiki knowledge has an accuracy of 90%, and that the Top-10 precision of oriented search is 6% higher than that of general search.

2) An automatic query expansion method based on the random walk model. It combines a variety of semantic and lexical associations: word co-occurrence in a general corpus, word co-occurrence in the retrieved Top-N documents, synonyms, and hyponymy in the classification tree of HDWiki. Experiments compare the random walk method under different combinations of associations with a pseudo-relevance context method based on TF-IDF. They show that automatic query expansion based on the random walk model combined with all four associations between terms is more efficient and robust: its comprehensive evaluation value F is about eight percent higher than that of unexpanded retrieval, and it improves both precision and recall compared with the pseudo-relevance context method.

3) For query diversification, the retrieved documents are classified according to the different semantics of the user query based on HDWiki knowledge. The classified documents are displayed together with a general summary produced by automatic summarization, to help users find information quickly.
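The random walk expansion described in item 2) can be illustrated with a minimal sketch. The abstract does not specify the exact walk, so this uses a random walk with restart as a stand-in, not the thesis's actual algorithm. The function names (`merge_associations`, `random_walk_scores`, `expand_query`), the restart probability, the layer weights, and the toy term graph are all hypothetical.

```python
from collections import defaultdict


def merge_associations(layers, weights):
    """Combine several term-association graphs into one weighted graph.

    layers:  {layer_name: {term: {neighbor: strength}}}
    weights: {layer_name: float}, an assumed per-layer importance
    """
    merged = defaultdict(lambda: defaultdict(float))
    for name, graph in layers.items():
        for term, neighbors in graph.items():
            for neighbor, strength in neighbors.items():
                merged[term][neighbor] += weights.get(name, 1.0) * strength
    return {term: dict(neighbors) for term, neighbors in merged.items()}


def random_walk_scores(graph, query_terms, restart=0.15, iterations=50):
    """Random walk with restart, seeded uniformly on the query terms."""
    seed = {term: 1.0 / len(query_terms) for term in query_terms}
    scores = dict(seed)
    for _ in range(iterations):
        nxt = defaultdict(float)
        for term, mass in scores.items():
            neighbors = graph.get(term, {})
            total = sum(neighbors.values())
            if total == 0:
                # Dangling term: send its walking mass back to the seeds.
                for s, w in seed.items():
                    nxt[s] += (1 - restart) * mass * w
                continue
            for neighbor, strength in neighbors.items():
                nxt[neighbor] += (1 - restart) * mass * strength / total
        # Restart step: a fixed share of the mass jumps back to the query.
        for s, w in seed.items():
            nxt[s] += restart * w
        scores = dict(nxt)
    return scores


def expand_query(graph, query_terms, top_k=2, **walk_args):
    """Return the top_k highest-scoring terms not already in the query."""
    scores = random_walk_scores(graph, query_terms, **walk_args)
    candidates = [(t, s) for t, s in scores.items() if t not in query_terms]
    candidates.sort(key=lambda pair: (-pair[1], pair[0]))
    return [term for term, _ in candidates[:top_k]]


# Toy data (hypothetical): two of the four association types.
layers = {
    "cooccurrence": {"jaguar": {"car": 1.0}, "car": {"jaguar": 1.0, "engine": 1.0}},
    "synonym": {"jaguar": {"car": 1.0, "cat": 1.0}, "cat": {"jaguar": 1.0},
                "engine": {"car": 1.0}},
}
graph = merge_associations(layers, {"cooccurrence": 1.0, "synonym": 1.0})
expansion_terms = expand_query(graph, ["jaguar"], top_k=2)
```

Each association type is kept as its own layer so that different combinations can be switched on or off by changing the weights, which mirrors the comparison of combinations described in the experiments.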
Keywords/Search Tags: Query Processing, Query Expansion, Semantic Logic, Random Walk, HDWiki Knowledge, Search Diversity