
Research On Graph Learning Based Information Retrieval Techniques

Posted on: 2011-03-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z Y Guan
Full Text: PDF
GTID: 1118360302474595
Subject: Computer Science and Technology
Abstract/Summary:
With the proliferation and evolution of the Internet and the World Wide Web (WWW), the Web has gradually become an important information source in people's daily lives. It has brought new challenges as well as opportunities to information retrieval technology. In the last decade, Web information retrieval technology has undergone significant development. Today, information retrieval has changed from an academic discipline into the technical foundation of information acquisition for most people in the world. The widespread idea of Web 2.0 has made the Web not only a huge database but also a platform in which users can participate and communicate with others. The rapid proliferation of Web 2.0 applications will lead to a new round of evolution in Web information retrieval technology.

This thesis argues that, in the age of Web 2.0, Web information retrieval technology has three main evolutionary trends:

1) More flexible personalized information services. With the rapid increase in the number of users, Web 2.0 websites urgently need to satisfy users' personalized information needs. However, traditional Web information retrieval techniques are not well suited to the complex data structures in Web 2.0 applications. Web 2.0 applications need more flexible personalized information services, such as recommender systems.
2) More effective multimedia information retrieval techniques. Many Web 2.0 websites allow users to upload and share multimedia files, such as pictures and videos. This leads to rapid growth of multimedia information on the Web, and multimedia information retrieval has therefore become a popular research area.
3) Domain or topic specific retrieval. User generated data in Web 2.0 applications has become a significant part of the data on the Web. Huge and topically diverse Web data is forcing Web information retrieval to focus on domain or topic specific retrieval.

Web data usually have intrinsically complex relational structures. The thesis points out that, to better address the related retrieval problems or to improve retrieval effectiveness on such data, we need to exploit these structures. Graph-based learning techniques can properly model these complex relational structures and capture the knowledge contained in them. Thus, considering the evolutionary trends mentioned above, this thesis focuses on graph learning based Web information retrieval. Specific research topics include:

1) Personalized tag recommendation in social tagging services: in social tagging services, users can add tags to resources, and tagging data can be modeled naturally as graphs. This thesis proposes a novel graph-based ranking algorithm for multi-type interrelated objects to solve the personalized tag recommendation problem in social tagging services.
2) Personalized document recommendation in social tagging services: traditional recommender systems focus on rating data, while social tagging data is different from rating data. This thesis proposes a novel graph-based semantic space learning algorithm which projects users, tags and documents into the same semantic space. Documents are recommended to users according to Euclidean distance.
3) Face image retrieval and recognition: dimension reduction (subspace learning) techniques have been used to learn a high level representation for face image retrieval and recognition. Recently, a graph-based tensor subspace learning algorithm showed good performance, but its time complexity is high. This thesis proposes a novel, efficient graph-based second order tensor subspace learning algorithm.
4) Focused crawling for high quality topical Web resources: focused crawlers are designed for harvesting topical Web pages. For vertical search engines, a key problem is how to find high quality related Web resources. This thesis proposes a novel Web graph based on-line algorithm for estimating Web pages' topical quality and, based on it, designs a focused crawler for harvesting high quality topical resources.

Finally, the thesis summarizes this work and discusses future work on graph learning based Web information retrieval.
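
The abstract states that, once users, tags and documents share one semantic space, documents are recommended to users according to Euclidean distance. The following is a minimal sketch of that final ranking step only, assuming the embeddings have already been learned by some means. It does not reproduce the thesis's semantic space learning algorithm, and the function name `recommend_documents`, the document ids and the 2-D coordinates are all hypothetical illustration values.

```python
import numpy as np


def recommend_documents(user_vec, doc_vecs, doc_ids, k=2):
    """Return the ids of the k documents nearest to user_vec (Euclidean distance).

    Assumes user_vec and every row of doc_vecs live in the same learned space.
    """
    user_vec = np.asarray(user_vec, dtype=float)
    doc_vecs = np.asarray(doc_vecs, dtype=float)
    # Distance from the user to each document, one value per row of doc_vecs.
    dists = np.linalg.norm(doc_vecs - user_vec, axis=1)
    # Smallest distance first; a stable sort keeps input order on ties.
    order = np.argsort(dists, kind="stable")[:k]
    return [doc_ids[i] for i in order]


# Hypothetical embeddings, chosen only to make the example easy to follow.
doc_ids = ["doc_a", "doc_b", "doc_c"]
doc_vecs = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
user_vec = [0.9, 1.2]

top = recommend_documents(user_vec, doc_vecs, doc_ids, k=2)
print(top)
```

In this toy setup the user lies closest to `doc_b`, then `doc_a`, so those two are returned in that order. With many documents, a nearest-neighbor index would likely replace the full distance scan, but that choice is outside what the abstract describes.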
Keywords/Search Tags: Web information retrieval, Web 2.0, graph-based ranking, graph-based dimensionality reduction, recommender systems, face retrieval, focused crawlers