
Adaptive Web Search Method Using Wikipedia for Query Expansion (original title: Methode de recherche adaptative sur le Web avec utilisation de Wikipedia pour l'expansion de requetes)

Posted on: 2007-05-03
Degree: M.Sc.A
Type: Thesis
University: Ecole Polytechnique, Montreal (Canada)
Candidate: Oveissian, Amir Masoud
Full Text: PDF
GTID: 2448390005972742
Subject: Engineering
Abstract/Summary:
Despite the remarkable advances in information retrieval technology achieved over the past decade, users still face obstacles in finding the information they want. Among these difficulties are the limited number of terms that can be given in a search query, the lack of ways to better interpret what the user actually wants, and the impossibility of using users' common interests as a robust basis for interpreting each particular user's query. These difficulties point to the need for accurate measures that yield better precision and a better interpretation of each particular user's interest in each of his search subjects.

Several user-side search agents and global search engines have been proposed in recent years. These projects mainly apply Information Filtering (IF), Information Retrieval (IR), query expansion, or semantic web techniques to improve search results using a user's specialized long-term or short-term interests, or the general public interests of a group of users. Most of them have shortcomings. Some do not really support the different types of general-purpose user searches. Some do not adapt to each of the user's different subjects of interest or to his shifts in interest. Others need a large set of user examples and time-consuming batch preprocessing of the documents that interest the user. Finally, some are not clearly applicable to languages other than English.

This project addresses these issues by introducing the architecture of ARIIA (Adaptive Reinforcement Iteration-based Internet Agent), a language-independent user-side search agent that expands the user's search queries according to his long-term personal or professional interests and adapts itself to the user's shifts in interest.
To realize ARIIA's functionality, we used a Wikipedia [3] ontology extraction approach based on Wikipedia hyperlinks, which guarantees a progressive evolution of the user's personal vocabulary during his continuous, real-time searches. ARIIA's functionality rests on several aspects: (1) ARIIA uses relevance feedback in its document selection. (2) It uses the hyperlink structure to measure the relevance of terms and to expand search queries. (3) It relies mainly on existing infrastructure and services for its searches; more specifically, it uses the free online encyclopedia Wikipedia and a general-purpose search engine such as Google (or AltaVista). (4) It creates a Boolean query with a disjunctive set of expanded terms, based on encyclopedia hyperlinks and on user feedback about the importance of the links. (5) It relies mainly on iterations over the expanded queries and the relevance feedback. (6) It creates a user profile based on the query history and the contents of documents that interest the user. (7) It can be used as a multilingual search agent: ARIIA can be applied to searches in any language used in Wikipedia.

To evaluate ARIIA's functionality, we define several indicators that take into account the contents and the order of the result pages. ARIIA's results are built on Google's results and are, in general, better than Google's: in the worst case, ARIIA shows an improvement of at least 4 to 6 percent, depending on the search type. In most cases the improvement stabilizes at 10 percent compared to Google's results, and in the best case ARIIA improves on Google's results by 20 percent or more.

[3] A free encyclopedia available at: www.wikipedia.org...
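
Points (1), (4) and (5) of the abstract describe expanding a query with a disjunctive set of terms taken from encyclopedia hyperlinks and re-weighting those terms from user feedback. The abstract does not give ARIIA's actual scoring or query syntax, so the following is only a minimal Python sketch of the general idea. The function names (`expand_query`, `apply_feedback`), the `link_weights` mapping, the additive reward, and the `term (a OR b)` query shape are all assumptions made for illustration, not the thesis's implementation.

```python
from typing import Dict, Iterable, List


def quote(term: str) -> str:
    """Wrap multi-word terms in double quotes so they are searched as phrases."""
    return f'"{term}"' if " " in term else term


def expand_query(query: str, link_weights: Dict[str, float],
                 max_terms: int = 3, min_weight: float = 0.0) -> str:
    """Return the query followed by an OR-group of the highest-weighted link terms.

    link_weights is a hypothetical mapping from the anchor text of an
    encyclopedia hyperlink to a relevance score (assumed, not from the thesis).
    """
    # Keep terms above the threshold, excluding the original query itself.
    candidates = [
        (term, weight) for term, weight in link_weights.items()
        if weight > min_weight and term.lower() != query.lower()
    ]
    # Highest weight first; ties broken alphabetically for deterministic output.
    candidates.sort(key=lambda tw: (-tw[1], tw[0]))
    chosen: List[str] = [quote(term) for term, _ in candidates[:max_terms]]
    if not chosen:
        return quote(query)
    return f"{quote(query)} ({' OR '.join(chosen)})"


def apply_feedback(link_weights: Dict[str, float],
                   relevant_terms: Iterable[str],
                   reward: float = 1.0) -> Dict[str, float]:
    """Return a copy of the weights with `reward` added to each term marked relevant."""
    updated = dict(link_weights)
    for term in relevant_terms:
        updated[term] = updated.get(term, 0.0) + reward
    return updated


if __name__ == "__main__":
    # Made-up weights for links found on a hypothetical "jaguar" article.
    weights = {"Panthera onca": 0.9, "big cat": 0.6, "Jaguar Cars": 0.2}
    print(expand_query("jaguar", weights, max_terms=2))
    # One feedback iteration: the user signals interest in the car maker.
    weights = apply_feedback(weights, ["Jaguar Cars"])
    print(expand_query("jaguar", weights, max_terms=2))
```

In this sketch, one round of feedback is enough to move "Jaguar Cars" into the expanded term set, which mirrors in a very simplified way how repeated iterations could follow a user's shift in interest.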
Keywords/Search Tags: Wikipedia, ARIIA, Search, User, Information, Google