
Application And Research Of Information Retrieval Algorithm In Web

Posted on: 2007-06-04
Degree: Master
Type: Thesis
Country: China
Candidate: W Yue
Full Text: PDF
GTID: 2178360185965642
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, the Web has become an important way to find information. Because the volume of online information keeps growing rapidly, improving the effectiveness and efficiency of information retrieval has become one of the most challenging problems for search engines. Information retrieval is the process of finding, in a collection of documents, the information most relevant to a user's need. This thesis studies information retrieval algorithms for search engines.

First, we reviewed current research progress and the relevant technology in information retrieval, and systematically compared the performance of three classical information retrieval models. We also surveyed the development and some of the primary models of personalized information retrieval, which is becoming increasingly popular as demand for personalized and intelligent services grows.

Second, observing that very short queries often lead to low precision, we proposed a novel information retrieval algorithm based on query expansion and classification. The approach attempts to capture more relevant documents by combining query expansion with text classification. Experimental results show that the proposed algorithm is more precise and more efficient than traditional query expansion methods.

Furthermore, we proposed a personalized search algorithm that combines content-based filtering and text classification. It addresses the problem that traditional information retrieval technologies cannot satisfy queries issued by users with different backgrounds, with different intentions, and at different times. In our approach, the user's interest model is updated by machine learning, based on observations of the user's behavior while browsing web pages. With continuous updates, the interest model reflects the user's interests more and more accurately. The information in which the user is really interested can then be retrieved effectively by content-based filtering and text classification according to the interest model. Experimental results show that the new algorithm is more effective than traditional information retrieval methods.

Finally, a prototype information retrieval system was designed and implemented, based on the above algorithms and techniques.
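The abstract does not give the details of the proposed algorithms, so the following is only a minimal illustrative sketch of the general idea behind query expansion in a vector space model. It is not the thesis's method. It assumes TF-IDF weighting, cosine similarity, and a Rocchio-style pseudo-relevance feedback step. All function names, the parameters `top_k`, `n_terms` and `beta`, and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (dict) per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF; an assumed weighting, one of several common variants.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vecs = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vecs, idf


def query_vector(terms, idf):
    """Weight query terms by IDF; terms unseen in the collection are dropped."""
    return {t: tf * idf[t] for t, tf in Counter(terms).items() if t in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rank(qvec, doc_vecs):
    """Return document indices ordered by decreasing similarity to the query."""
    scored = [(cosine(qvec, d), i) for i, d in enumerate(doc_vecs)]
    return [i for _, i in sorted(scored, key=lambda p: (-p[0], p[1]))]


def expand_query(qvec, doc_vecs, top_k=1, n_terms=2, beta=0.5):
    """Add the strongest terms of the top_k ranked documents to the query.

    The top-ranked documents are assumed to be relevant (pseudo-relevance
    feedback); new terms are down-weighted by beta.
    """
    top = rank(qvec, doc_vecs)[:top_k]
    centroid = Counter()
    for i in top:
        for t, w in doc_vecs[i].items():
            centroid[t] += w / len(top)
    candidates = sorted(centroid.items(), key=lambda p: (-p[1], p[0]))
    new_terms = [t for t, _ in candidates if t not in qvec][:n_terms]
    expanded = dict(qvec)
    for t in new_terms:
        expanded[t] = beta * centroid[t]
    return expanded


# Toy collection: the short query "jaguar engine" is ambiguous on its own.
docs = [
    "jaguar car engine speed".split(),
    "jaguar cat jungle animal".split(),
    "car engine repair manual".split(),
]
doc_vecs, idf = tfidf_vectors(docs)
query = query_vector(["jaguar", "engine"], idf)
expanded = expand_query(query, doc_vecs)
ranking = rank(expanded, doc_vecs)
```

In this sketch the short query gains two terms from its top-ranked document, which pulls the expanded query toward that document's topic. A text classification step, as mentioned in the abstract, could in principle filter which feedback documents are trusted, but that combination is not shown here.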
Keywords/Search Tags: Information Retrieval, Vector Space Model, Query Expansion, Text Classification, User Profile, Content Filtering