Intelligent And Personalized Research For Web Search Engine

Posted on: 2009-06-30
Degree: Master
Type: Thesis
Country: China
Candidate: J Q Xu
Full Text: PDF
GTID: 2178360272475131
Subject: Computer software and theory
Abstract/Summary:
As the number of Web documents on the Internet grows rapidly, Web search research faces many new challenges. The vast majority of queries submitted to search engines are short and under-specified, and users may have completely different intentions for the same query. Currently, most major Web search engines are built to serve all users, independent of the particular needs of any individual user. To improve Web search quality, personalized Web search has become a research focus in the field of Web information retrieval. This thesis studies the problem further and proposes an intelligent, personalized information retrieval approach for Web search engines. The approach not only exploits the advantages of popular search engines, such as fast responses to user queries and a huge amount of information and resources, but also provides relevant search results for people with different interests and backgrounds. The main research covers the following aspects:

① Traditional approaches to calculating term associations in the vector space model are analyzed in detail. To analyze the relations between feature terms within a user's interest category effectively, this thesis proposes a novel algorithm that measures term associations based on user profiles. The algorithm combines cosine similarity measures with co-occurrence data analysis. It yields a quantitative correlation analysis of the feature terms relevant to a user, which serves query expansion.

② A new user interest model, which combines the user's browsing content and behavior, allows a user query to be mapped accurately to the relevant interest categories. A personalized query expansion algorithm is proposed that computes term-term associations according to the current user profile.
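The abstract does not give the formulas, so the following is only a minimal sketch of how a cosine measure and co-occurrence data could be combined into a term association score and then used for query expansion. The linear weighting with alpha, the Jaccard-style co-occurrence ratio, the function names, and the toy profile are all assumptions made for illustration, not the thesis's actual algorithm.

```python
import math
from collections import Counter, defaultdict

# Hypothetical user profile: tokenized pages browsed in one interest category.
PROFILE_DOCS = [
    ["jaguar", "car", "engine"],
    ["jaguar", "car", "dealer"],
    ["python", "code"],
]


def term_vectors(docs):
    """Map each term to a sparse vector {doc_index: term_frequency}."""
    vectors = defaultdict(dict)
    for i, tokens in enumerate(docs):
        for term, tf in Counter(tokens).items():
            vectors[term][i] = tf
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse term vectors."""
    dot = sum(w * v[i] for i, w in u.items() if i in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cooccurrence(u, v):
    """Documents containing both terms divided by documents containing either."""
    either = len(u.keys() | v.keys())
    return len(u.keys() & v.keys()) / either if either else 0.0


def association(vectors, a, b, alpha=0.5):
    """Assumed combination: weighted sum of cosine and co-occurrence scores."""
    u, v = vectors.get(a, {}), vectors.get(b, {})
    return alpha * cosine(u, v) + (1 - alpha) * cooccurrence(u, v)


def expand_query(query_terms, vectors, k=2, alpha=0.5):
    """Append up to k profile terms most strongly associated with the query."""
    scores = {
        cand: sum(association(vectors, q, cand, alpha) for q in query_terms)
        for cand in vectors
        if cand not in query_terms
    }
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return list(query_terms) + [t for t in ranked[:k] if scores[t] > 0]


vectors = term_vectors(PROFILE_DOCS)
expanded = expand_query(["jaguar"], vectors, k=1)
print(expanded)
```

With this toy profile the query "jaguar" is expanded with "car", because the two terms occur in exactly the same profile documents. A different user profile would produce different expansion words for the same query.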
When the user enters query keywords, the system automatically generates a few personalized expansion words, and these words are submitted together with the query keywords to a popular search engine such as Yahoo or Google. The expansion words help to express the user's search intention accurately. The new query expansion personalizes a general-purpose search engine: the search engine can return different search results to different users who enter the same keywords.

③ Replicas and near-replicas of documents are very common on the Web. The near-replicas that a search engine returns increase the burden on Web users and decrease the quality of the search service. This thesis proposes a content-analysis method for detecting similar pages, in particular replicas and near-replicas. To further improve Web search quality, the method is applied to detect and remove replicas and near-replicas among the top N documents returned by a search engine.

In Section 5, experimental results show the effectiveness and feasibility of the present work. The research above has good academic reference value and practical value in the field of personalized Web search.
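The abstract says only that replica detection is based on content analysis. As one possible illustration, the sketch below uses word shingles and Jaccard resemblance, a common technique that is swapped in here and is not confirmed to be the thesis's method. The shingle width, the 0.8 threshold, and all function names are assumptions.

```python
import re


def shingles(text, w=3):
    """Return the set of w-word shingles of a page's text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < w:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + w]) for i in range(len(words) - w + 1)}


def resemblance(a, b):
    """Jaccard resemblance of two shingle sets (two empty pages count as equal)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def remove_near_replicas(results, top_n=10, threshold=0.8):
    """Drop pages in the top N that resemble an earlier kept page too closely."""
    kept, kept_shingles = [], []
    for text in results[:top_n]:
        s = shingles(text)
        if all(resemblance(s, other) < threshold for other in kept_shingles):
            kept.append(text)
            kept_shingles.append(s)
    # Results beyond the top N are passed through unchanged.
    return kept + results[top_n:]


results = [
    "The quick brown fox jumps over the lazy dog",
    "the quick brown fox jumps over the lazy dog!",
    "Personalized query expansion for Web search engines",
]
filtered = remove_near_replicas(results)
print(filtered)
```

Here the second page differs from the first only in case and punctuation, so it is treated as a replica and removed, while the first-ranked copy is kept. Comparing every page against every kept page is quadratic in N, which is acceptable only because N is small.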
Keywords/Search Tags: Personalization, Information Retrieval, Search Engine, Query Expansion, Duplicate Web Page Deletion