The Study Of Query Expansion In Social Tagging

Posted on: 2014-02-23
Degree: Master
Type: Thesis
Country: China
Candidate: H Wang
Full Text: PDF
GTID: 2248330398450366
Subject: Computer application technology
Abstract/Summary:
Social tags are user-generated keywords associated with a resource on the Web. In contrast to the traditional top-down approach, social tags come from Web users themselves. As an emerging and important source of information, they can be shared with the larger community of Internet users, which exemplifies the 'spirit of Web 2.0'. With the rise of the Web 2.0 movement, many websites allow users to create and manage their own social tags, and the term folksonomy has gained wide attention from researchers. Much research has shown that social tags help improve search quality. However, real-world social tags are often sparse, and some are generated by machines and are unsuitable for use.

To address these issues, this paper explores two approaches to tag expansion and evaluation, with the aim of improving the quality of query expansion.

The first is the Jaccard SimRank algorithm. Traditional metrics such as Cosine similarity and the Jaccard Index are almost ineffective when social tags are sparse. The graph-based SimRank algorithm has been used to address this problem, but it is not as effective as expected because it makes inefficient use of social tags. In this paper we propose an enhanced Jaccard SimRank algorithm, which gives a more direct description of tag similarity. Because tag similarity is measured more accurately, the expanded tags are more accurate and search quality improves accordingly.

The second is the Folksonomy Quality algorithm. It is based on the assumption that the quality of tags is measurable: since tags can be evaluated by other users through voting, the tag that receives the most votes is regarded as the most suitable one for the Web resource associated with it. Evaluation can be applied not only to a tag but also to its owner, by assigning a weight, and a Web resource also gains weight when a user votes for it.
This algorithm uses weights to address the problem caused by machine-generated tags and to improve search quality.

The experimental datasets in this paper come from the BibSonomy website. We applied the two proposed approaches to the test datasets and calculated metrics including Cosine similarity and the Jaccard Index. We also gathered results from traditional methods, such as the SimRank algorithm and the JSR algorithm, and finally evaluated all experimental results. The evaluation shows that the two approaches can effectively support tag expansion and improve the quality of query expansion.
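
The contrast between the Jaccard Index and SimRank on sparse tag data can be illustrated with a small sketch. It implements only the two standard baselines named in the abstract: the set-based Jaccard Index and plain bipartite SimRank over a tag-resource graph. It is not the enhanced Jaccard SimRank algorithm, whose formula the abstract does not give. The function names, the decay factor of 0.8, the iteration count and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard Index |A ∩ B| / |A ∪ B| of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def _update(nodes, neighbours, other_sim, decay):
    """One SimRank step for one side of the bipartite graph.

    The new score of (a, b) is the decayed average similarity of their
    neighbours on the other side, taken from the previous iteration.
    """
    new = {}
    for a in nodes:
        for b in nodes:
            if a == b:
                new[(a, b)] = 1.0
                continue
            na, nb = neighbours[a], neighbours[b]
            if not na or not nb:
                new[(a, b)] = 0.0
                continue
            total = sum(other_sim[(x, y)] for x in na for y in nb)
            new[(a, b)] = decay * total / (len(na) * len(nb))
    return new


def simrank_tags(tag_resources, decay=0.8, iterations=5):
    """Plain bipartite SimRank: tags are similar if they annotate similar resources.

    tag_resources maps each tag to the set of resources it annotates.
    decay and iterations are illustrative defaults, not values from the thesis.
    """
    tags = sorted(tag_resources)
    resource_tags = defaultdict(set)
    for tag, tagged in tag_resources.items():
        for resource in tagged:
            resource_tags[resource].add(tag)
    resources = sorted(resource_tags)

    # Start from the identity: every node is similar only to itself.
    tag_sim = {(a, b): float(a == b) for a in tags for b in tags}
    res_sim = {(x, y): float(x == y) for x in resources for y in resources}

    for _ in range(iterations):
        new_tag_sim = _update(tags, tag_resources, res_sim, decay)
        new_res_sim = _update(resources, resource_tags, tag_sim, decay)
        tag_sim, res_sim = new_tag_sim, new_res_sim
    return tag_sim


if __name__ == "__main__":
    # Hypothetical toy data: "python" and "programming" share no resource,
    # but both co-occur with "code".
    example = {
        "python": {"r1"},
        "code": {"r1", "r2"},
        "programming": {"r2"},
    }
    print(jaccard(example["python"], example["programming"]))
    print(simrank_tags(example)[("python", "programming")])
```

In this toy data the two tags have no resource in common, so their Jaccard Index is zero, while SimRank still assigns them a positive score through the shared neighbour tag. This is the sparse-data behaviour the abstract refers to when it motivates a graph-based measure.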
Keywords/Search Tags: Query Expansion, Social Tagging, SimRank, Similarity