Research on Query Expansion Based on Formal Concept Analysis and Term-Reweighting

Posted on: 2012-06-19
Degree: Master
Type: Thesis
Country: China
Candidate: C Wang
Full Text: PDF
GTID: 2178330335453152
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of information technology, the number of web pages on the Internet is growing at a surprising speed. This gives users a large retrieval space, but it also leaves them with the problem of finding the information they actually need. Search engines, as online information services, help users look for that information conveniently. When users search for information they care about, they submit a series of search terms to the search engine. However, because natural language is ambiguous, the initial query terms are often not interpreted correctly, and many of the results returned by the search engine are irrelevant to the user's intention or even deviate from the user's topic. Resolving this query mismatch has therefore become an important research topic in information retrieval, and query expansion is one of the effective methods for doing so. To address the problems that remain in query expansion, this thesis studies query expansion that combines FCA (Formal Concept Analysis) with term weighting. The main work is summarized as follows:

1. A method for optimizing the query expansion source is proposed. First, we analyze both the documents the user chose during feedback and the documents that the search engine returned but the user did not choose. FCA is then applied to each set to build two lattices, named the "User concept lattice" and the "Mining concept lattice" respectively. Next, a novel method is presented for calculating the similarity between concepts. Finally, we select the concepts in the mining concept lattice that have the highest similarities, extract their extents, and add those documents to the initial documents. In this way the query expansion source is optimized.

2. A term-reweighting method for query expansion is proposed. The first step represents the user's initial query and the documents in the query expansion source as vectors, calculates the similarity between the query vector and each document vector, and orders the documents by that similarity. The second step analyzes the weight of each term within a single document and within the whole document set, and then combines the two in a reasonable way to obtain the final weight of the term. The third step selects the terms with the highest weights as query expansion words. With this method, high-quality keywords are extracted from the whole document set as query expansion words.

Finally, 20 groups of user queries on different topics are submitted to a search engine, and the top 50 pages returned by the search engine are used to validate the approach. The experiments show that the approach has practical value and clearly improves both precision and recall.
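The two contributions above can be illustrated with a minimal Python sketch. The abstract does not give the thesis's formulas, so everything specific here is an assumption made for illustration: concept similarity is taken to be the Jaccard overlap of concept intents, the threshold `0.6` and the mixing factor `alpha` are arbitrary, a term's per-document weight is its normalized frequency scaled by the document's cosine similarity to the query (rather than using the similarity only for ordering), and its set-level weight is the fraction of documents that contain it. The function names (`concepts`, `optimize_source`, `expansion_terms`) are hypothetical and do not come from the thesis.

```python
import math
from collections import Counter


def concepts(context):
    """Enumerate (extent, intent) pairs of a small {doc: set_of_terms} context."""
    intents = {frozenset().union(*context.values())}
    for terms in context.values():
        # Every intent is an intersection of document term sets.
        intents |= {i & frozenset(terms) for i in intents}
    return [(frozenset(d for d, t in context.items() if i <= t), i)
            for i in intents]


def concept_similarity(c1, c2):
    """Assumed measure: Jaccard overlap of the two intents."""
    union = c1[1] | c2[1]
    return len(c1[1] & c2[1]) / len(union) if union else 0.0


def optimize_source(user_ctx, mining_ctx, threshold=0.6):
    """Add the extents of mining concepts that resemble some user concept."""
    user_concepts = concepts(user_ctx)
    source = set(user_ctx)  # start from the documents the user chose
    for mc in concepts(mining_ctx):
        if any(concept_similarity(mc, uc) >= threshold for uc in user_concepts):
            source |= mc[0]
    return source


def cosine(a, b):
    """Cosine similarity of two Counter term vectors."""
    dot = sum(w * b[t] for t, w in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0


def expansion_terms(query, docs, k=2, alpha=0.5):
    """Rank candidate expansion terms from tokenized docs (lists of terms)."""
    q = Counter(query)
    vecs = [Counter(d) for d in docs]
    scores = Counter()
    for v in vecs:
        sim = cosine(q, v)        # step 1: query-document similarity
        length = sum(v.values())
        for t, tf in v.items():   # step 2a: weight within a single document
            scores[t] += alpha * sim * tf / length
    df = Counter(t for v in vecs for t in v)
    for t, n in df.items():       # step 2b: weight within the whole set
        scores[t] += (1 - alpha) * n / len(vecs)
    # Step 3: keep the k highest-weighted terms not already in the query.
    ranked = sorted((t for t in scores if t not in q),
                    key=lambda t: (-scores[t], t))
    return ranked[:k]
```

In use, `optimize_source` would take the user-chosen and unchosen result documents as term sets and return the enlarged document set, whose tokenized texts are then passed to `expansion_terms` together with the query terms. A real system would also need stop-word removal and stemming, which this sketch leaves out.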
Keywords/Search Tags: Search Engine, Query Expansion, Formal Concept Analysis, Term-Reweighting