Optimization Technology Of WEB Search Based On Social Tags

Posted on: 2011-01-28, Degree: Master, Type: Thesis
Country: China, Candidate: D Li, Full Text: PDF
GTID: 2178330338479984, Subject: Computer Science and Technology
Abstract/Summary:
With the development of Web 2.0 technology, many websites that allow users to add resources and tags have risen quickly; typical examples are Del.icio.us and Flickr. On these sites, users must register a user name, after which they can publish and tag their own resources and tag other people's resources. Many kinds of resources are tagged, such as pictures (Flickr), bookmarks (Del.icio.us), and videos (YouTube). Researchers usually call these annotations of resources social tags. As these websites grow rapidly, the number of social tags keeps increasing. Social tags correlate strongly with users and are authentic, so how to make better use of them has become a focus of attention.

Social tags reflect users' social focus or points of interest, and they can truly reflect the public's perception. First, users add social tags to resources freely in their own language, so social tags are richer than the titles of Web pages. Second, social tags are more accurate than metadata extracted automatically by machine. Third, a social network has three types of entity (user, tag, and URL), and these three types of entity are structurally related. This thesis makes use of these features of social tags to improve Web search, and the work covers the following three areas.

Firstly, we study how to extract social tags from a social network. The extraction process is divided into two steps: Web crawling and Web analysis. In the Web crawling step, we crawled a large number of web pages containing social tags by controlling the URL. In the Web analysis step, we identified the characteristics of each entity (user, tag, and URL) by analyzing the web pages, used these characteristics to extract the three kinds of information, and inserted this information into a SQL Server database.

Secondly, we model the site and treat the structure of the social network as an undirected tripartite graph. The PageRank algorithm is then briefly introduced, and following its idea we propose a PageRank-like algorithm for the social network, which we use to calculate the popularity of web pages.

Thirdly, we use the classic RankNet algorithm to incorporate the popularity of web pages into the traditional ranking function. We re-rank the results of the traditional ranking and compare the two; we find that the re-ranked results are better than the traditional ranking results. The training set and test set used with RankNet are obtained from ODP by an automatic method.
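
The abstract does not give the formula of the PageRank-like algorithm, so the following Python code is only a minimal sketch of the general idea: score propagation over an undirected tripartite graph of users, tags, and URLs. The function name `tripartite_popularity`, the damping factor of 0.85, the fixed iteration count, and the choice to link all three pairs of entities in each annotation are assumptions made for illustration, not details taken from the thesis.

```python
from collections import defaultdict


def tripartite_popularity(annotations, damping=0.85, iterations=50):
    """Return a popularity score for each URL.

    annotations: iterable of (user, tag, url) triples, one per tagging action.
    Assumption: each triple adds undirected edges user-tag, tag-url, user-url.
    """
    # Build the undirected adjacency sets; nodes are keyed by (kind, name)
    # so that a user and a tag with the same name stay distinct.
    neighbors = defaultdict(set)
    for user, tag, url in annotations:
        u, t, r = ("user", user), ("tag", tag), ("url", url)
        for a, b in ((u, t), (t, r), (u, r)):
            neighbors[a].add(b)
            neighbors[b].add(a)

    nodes = list(neighbors)
    n = len(nodes)
    if n == 0:
        return {}

    # Start from a uniform distribution over all nodes.
    score = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        updated = {}
        for node in nodes:
            # Each neighbor shares its score equally among its own neighbors.
            incoming = sum(score[nb] / len(neighbors[nb]) for nb in neighbors[node])
            updated[node] = (1.0 - damping) / n + damping * incoming
        score = updated

    # Only URL nodes are reported as web page popularity.
    return {name: s for (kind, name), s in score.items() if kind == "url"}
```

Under these assumptions, a URL tagged by many users with many tags collects score from more neighbors than a URL tagged once, so it ends up with a higher popularity value. Such a value could then serve as one extra feature for a learning-to-rank model like RankNet.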
Keywords/Search Tags: Web 2.0, social tags, popularity of web pages, learning to rank