Web Search Based On Social Tagging

Posted on: 2012-07-11
Degree: Master
Type: Thesis
Country: China
Candidate: Z B Li
Full Text: PDF
GTID: 2218330368488068
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, more and more new web resources are coming online, bringing people ever more information. At the same time, search engines have made it far easier to find specific information on the Internet. However, the rapid growth of Internet resources has also put great pressure on search engines: retrieval methods that use only page content as a page's metadata have hit a bottleneck. Meanwhile, with the rise of Web 2.0 technologies, many social tagging systems have appeared on the Internet. These systems allow people to share their favorite Internet resources by adding tags. The tags are users' descriptions of resources and can serve as a new type of metadata for pages. This thesis addresses the question of how to enhance web search with the data in social tagging systems.

At present, research on web retrieval focuses on two perspectives. One is to process the initial query using techniques such as query expansion, query reconstruction, and pseudo-relevance feedback. The other is to re-rank pages, which is the approach taken in this thesis. The main factors that influence page ranking are the quality of pages and the relevance of pages to queries. After an in-depth analysis of the data in social tagging systems, we propose two algorithms to enhance web search.

(1) Weighted Social SimRank algorithm. First, we use the data in a social tagging system to construct a bipartite graph whose two types of nodes are tags and pages, and we assign a reasonable weight to each edge of the graph. Based on the characteristics of the bipartite graph, we propose a modified SimRank algorithm, which we call the Weighted Social SimRank algorithm. This algorithm mines similarity information between tags and between pages. The similarity between tags is used to enhance web search from the perspective of the relevance between pages and queries. The similarity between pages is used to enhance web search from the perspective of page quality.

(2) Social Quality algorithm. This algorithm calculates the quality of users and pages in a social tagging system. The quality of a user depends on the semantic correlation between the tags the user created and the corresponding resources. The quality of a resource depends on the number of people who have tagged it: for a given resource, the more people who have tagged it, the more popular it is and the higher its quality. Meanwhile, users and resources reinforce each other: the resources that high-quality users tag are high-quality resources, and the users who tag high-quality resources are high-quality users. The resource quality computed by the Social Quality algorithm can be incorporated into page ranking to enhance web search.

Experiments are carried out on a real-world annotation data set sampled from del.icio.us, and the results are evaluated with several methods. The experimental results demonstrate significant improvements over traditional methods and confirm the effectiveness of the two proposed algorithms.
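
The abstract does not give the edge-weighting scheme or the update formula of the Weighted Social SimRank algorithm, so the following is only a minimal sketch of one plausible weighted SimRank on a tag-page bipartite graph, not the thesis's method. It assumes that an edge weight is the number of distinct users who attached a tag to a page, and that each similarity is a weight-normalized average of the similarities on the opposite side, scaled by a decay factor. The function names, the decay value 0.8, and the iteration count are all illustrative assumptions.

```python
from collections import defaultdict
from itertools import product


def build_weights(annotations):
    """Assumed weighting: number of distinct users who put `tag` on `page`."""
    weights = defaultdict(lambda: defaultdict(float))
    for _user, tag, page in set(annotations):  # drop duplicate triples
        weights[tag][page] += 1.0
    return weights


def transpose(weights):
    """Flip a tag -> page weight map into a page -> tag weight map."""
    flipped = defaultdict(lambda: defaultdict(float))
    for a, neighbours in weights.items():
        for b, value in neighbours.items():
            flipped[b][a] = value
    return flipped


def _update(adj, other_sim, decay):
    """Recompute similarity on one side from similarity on the other side."""
    sim = {}
    for a, b in product(list(adj), repeat=2):
        if a == b:
            sim[a, b] = 1.0
            continue
        # Weighted sum over all pairs of neighbours of a and b.
        total = sum(
            wa * wb * other_sim.get((p, q), 0.0)
            for p, wa in adj[a].items()
            for q, wb in adj[b].items()
        )
        norm = sum(adj[a].values()) * sum(adj[b].values())
        sim[a, b] = decay * total / norm
    return sim


def weighted_bipartite_simrank(annotations, decay=0.8, iterations=10):
    """Return (tag_sim, page_sim), each keyed by a pair of node names."""
    tag_adj = build_weights(annotations)
    page_adj = transpose(tag_adj)
    tag_sim = {(t, t): 1.0 for t in tag_adj}
    page_sim = {(p, p): 1.0 for p in page_adj}
    for _ in range(iterations):
        # Both sides are updated from the previous iteration's values.
        new_tag_sim = _update(tag_adj, page_sim, decay)
        new_page_sim = _update(page_adj, tag_sim, decay)
        tag_sim, page_sim = new_tag_sim, new_page_sim
    return tag_sim, page_sim
```

Called on a list of (user, tag, page) triples, the function returns two dictionaries: tag-to-tag similarities, which could support query-side relevance matching, and page-to-page similarities. Tags or pages that lie in disconnected parts of the graph keep a similarity of zero.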
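
The mutual reinforcement between users and resources described for the Social Quality algorithm resembles a HITS-style iteration, and the sketch below illustrates that idea under stated assumptions; it is not the thesis's formulation. The abstract does not say how the semantic correlation between a user's tags and the tagged resources is measured, so it is represented here by a hypothetical `user_prior` dictionary that defaults to 1.0 for every user. Normalizing by the maximum score and using 20 iterations are likewise illustrative choices.

```python
from collections import defaultdict


def _normalize(scores):
    """Scale scores so that the largest one equals 1.0."""
    top = max(scores.values())
    if top <= 0:
        return scores
    return {key: value / top for key, value in scores.items()}


def social_quality(annotations, user_prior=None, iterations=20):
    """HITS-style mutual reinforcement between users and pages.

    `user_prior` is a hypothetical stand-in for the semantic correlation
    between a user's tags and the resources they describe.
    """
    user_pages = defaultdict(set)
    page_users = defaultdict(set)
    for user, _tag, page in annotations:
        user_pages[user].add(page)
        page_users[page].add(user)

    prior = user_prior or {}
    user_q = {u: prior.get(u, 1.0) for u in user_pages}
    # Starting point: a page's quality is its number of distinct taggers.
    page_q = _normalize({p: float(len(us)) for p, us in page_users.items()})

    for _ in range(iterations):
        # A page is as good as the users who tagged it.
        page_q = _normalize(
            {p: sum(user_q[u] for u in us) for p, us in page_users.items()}
        )
        # A user is as good as the pages they tagged, scaled by the prior.
        user_q = _normalize(
            {
                u: prior.get(u, 1.0) * sum(page_q[p] for p in ps)
                for u, ps in user_pages.items()
            }
        )
    return user_q, page_q
```

With uniform priors, the first pass ranks pages purely by their number of taggers, and later passes shift weight toward pages tagged by users who also tagged other popular pages. The resulting page scores could then be combined with a relevance score during re-ranking; how the thesis combines them is not stated in the abstract.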
Keywords/Search Tags: Web Search, Social Tagging, SimRank, Similarity, Social Quality