Tag Recommendation Method Based On Topic Model

Posted on: 2020-12-07
Degree: Master
Type: Thesis
Country: China
Candidate: H Z Bao
Full Text: PDF
GTID: 2428330620454833
Subject: Computer Science and Technology
Abstract/Summary:
With the advent of the big data era, the volume of information on the Internet has grown exponentially. Although users have access to almost all Internet resources, problems such as unorganized resources and data overload usually make searching for a desired resource difficult and time-consuming. In addition, effectively organizing and managing such a huge resource pool on online platforms has become a serious challenge for administrators. A tag is a simple description that carries the user's semantic information. Making full use of tag information can not only improve the efficiency of information retrieval, but also help a website classify and index its resources. As tag systems have developed, the number of tags has grown rapidly, but tag quality is uneven. Because of subjective factors such as users' differing habits, education levels, and spelling mistakes, a large number of irregular, meaningless, and ambiguous tags have appeared. These low-quality tags seriously degrade the quality of information retrieval and tag recommendation. Therefore, it is necessary to develop a stable and effective tag recommendation method.

At present, most tag recommendation methods rely mainly on the content information of resources. However, most data in the real world does not exist independently. For example, scientific articles form a complex network structure by citing each other. Research shows that the topology information and the text content of resources describe similar semantic features from two different perspectives, and that the two kinds of information can complement and explain each other. Building on previous research, the main contributions of this thesis are as follows:

(1) We propose a probabilistic topic model and a tag recommendation method that jointly model resource content information and resource network topology information. The method extracts latent semantic information about resources by combining multi-source heterogeneous information, such as the relationships between tags and resource content and the link relationships between resources, and recommends several tags with similar functional semantics for new resources. We also design a tag filtering algorithm that calculates a score for each tag in the candidate tag set and sorts the candidates by score. The higher a tag's score, the more likely it is to be recommended to the resource. Finally, the several highest-scoring tags are recommended to the new resource. The experimental results show that the method improves tag recommendation performance to a certain extent.

(2) We propose a tag recommendation method based on word-pair relationships. The method first combines the individual words in a resource's text content into word pairs, which alleviates the data sparsity problem to some extent. The resource text content, the resource-tag matrix, and the link relationships between resources are then modeled jointly. The topic model is used to compute the similarity between resources, and the tags closest in topic to the new resource are recommended to it. The experimental results show that the proposed method achieves higher recall than the baseline methods.
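The two mechanisms named in the abstract, word-pair construction and score-based ranking of candidate tags, can be illustrated with a minimal sketch. The abstract gives neither the scoring formula nor the model details, so everything below is an assumption made for illustration: the function names `word_pairs`, `cosine`, and `recommend_tags` are hypothetical, topic distributions are taken as already inferred by some topic model, and the score is a simple similarity-weighted vote rather than the thesis's own filtering algorithm.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def word_pairs(tokens):
    """Return every unordered pair of distinct words in one text.

    Assumption: a "word pair" is any two distinct words that co-occur
    in the same resource text.
    """
    unique = sorted(set(tokens))
    return list(combinations(unique, 2))


def cosine(p, q):
    """Cosine similarity between two equal-length topic distributions."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = sqrt(sum(a * a for a in p)) * sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def recommend_tags(new_topics, resources, k=3):
    """Rank candidate tags for a new resource and return the top k.

    `resources` is a list of (topic_distribution, tags) pairs for
    already-tagged resources. Each tag's score is the summed similarity
    between the new resource and every resource carrying that tag
    (an assumed scoring rule, not the one from the thesis).
    """
    scores = defaultdict(float)
    for topics, tags in resources:
        similarity = cosine(new_topics, topics)
        for tag in tags:
            scores[tag] += similarity
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:k]]


if __name__ == "__main__":
    print(word_pairs(["topic", "model", "tag"]))
    tagged = [
        ([0.9, 0.1], ["lda", "topic-model"]),
        ([0.1, 0.9], ["network"]),
        ([0.8, 0.2], ["lda"]),
    ]
    print(recommend_tags([1.0, 0.0], tagged, k=2))
```

In this toy run the tag `lda` ranks first because two resources with topic distributions close to the new resource both carry it. A text of n distinct words yields n(n-1)/2 pairs, which is how pairing can densify the co-occurrence data available from short texts.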
Keywords/Search Tags: Tag recommendation, Tag, Topic model, Heterogeneous network, Word-pair relationship