Research On Topic Model Based Multi-label Text Classification And Recommender Systems

Posted on: 2020-03-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y P Zou
GTID: 1368330575478766
Subject: Computer software and theory
Abstract/Summary:
Today, with the rapid development of the Internet, massive amounts of information (political, economic, entertainment, educational, cultural, academic, and more) are disseminated through the Internet in many forms, such as text, images, sound, video, and structured documents. The growing volume of information makes retrieval harder and poses a greater challenge to the orderly management of information. More effective methods and tools are therefore needed to organize, understand, and retrieve massive information efficiently and automatically, to improve the efficiency and accuracy of information use, and to reduce its difficulty. Multi-label classification is an effective way to organize and use information resources such as text, pictures, video, and structured documents. Automatic multi-label classification can improve the efficiency of information processing, reduce the cost of manual processing, and improve the user experience. In recent years it has attracted wide attention and has become an active research direction in information retrieval and data mining. Topic models, represented by LDA, are an effective method for automatically analyzing text. They can reveal the latent semantics of documents and identify the topics contained in massive information. Topic modeling is an important automatic text processing technology and has been widely used in fields such as multi-label text classification and recommender systems. This dissertation mainly studies supervised multi-label text classification based on topic models and personalized recommendation combined with topic models.

1. The L-LDA and Dependency-LDA models are classic multi-label text categorization methods, but both ignore the class frequency of terms, that is, the number of labels assigned to a term in the training data. In other words, these models focus mainly on the weights of terms within labels while ignoring the weights of terms across labels. To address this problem, we add a term-weighting step to the supervised topic model and propose a method that weights terms using class frequency knowledge, called the CF-weight method: terms with lower class frequency receive larger weights, and terms with higher class frequency receive smaller weights. We use the CF-weight method to improve the L-LDA and Dependency-LDA models and propose the WL-LDA and WD-LDA models, in which each term is weighted by its corresponding CF-weight. Experimental results show that the CF-weight based models classify better than existing supervised topic models.

2. The basic purpose of personalized tag recommendation is to provide a set of candidate tags when a user labels a resource. The candidate tags should relate not only to the content of the resource but also to the user's interests and preferences. This dissertation proposes SIM-LDA-TAG, a personalized tag recommendation model based on the topic model. It exploits the relationships among tags, users, and resources to mine latent user interest topics and resource content topics, and it personalizes tag matching for users and resources. Experiments show that SIM-LDA-TAG recommends better than existing mainstream methods when applied to personalized tag recommendation for social sharing websites.

3. The basic purpose of personalized resource recommendation is to recommend to the user a set of candidate resources that match the user's interests and preferences. This dissertation proposes SIM-LDA, a resource recommendation model that combines the topic model with collaborative filtering. First, collaborative filtering recommends resources by leveraging the tags that users have assigned to resources. Second, topic models are built for users and resources, and users are matched to resources by topic similarity to produce a second set of recommendations. Finally, the two sets of recommendation results are combined through adjustable weights. Experiments show that SIM-LDA recommends better than existing mainstream methods when applied to personalized resource recommendation for social sharing websites.
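
The abstract does not give formulas for two of the ideas it describes: the CF-weight step and the final weighted mix in SIM-LDA. The Python sketch below is only an illustration of how they might look under stated assumptions, not the dissertation's implementation. The function names, the IDF-style formula log(1 + K / cf), and the mixing parameter `alpha` are all hypothetical. Class frequency is taken here to be the number of distinct labels attached to the training documents that contain a term.

```python
import math
from collections import defaultdict


def class_frequency(docs):
    """Count, for each term, the distinct labels it co-occurs with.

    `docs` is a list of (tokens, labels) pairs from the training data.
    This counting rule is an assumption; the abstract only says
    "the number of labels assigned to a term".
    """
    labels_per_term = defaultdict(set)
    for tokens, labels in docs:
        for term in set(tokens):
            labels_per_term[term].update(labels)
    return {term: len(labels) for term, labels in labels_per_term.items()}


def cf_weights(docs):
    """Assumed IDF-style weight: log(1 + K / cf), where K is the label count.

    The formula is illustrative. It only preserves the stated property
    that a lower class frequency gives a larger weight.
    """
    all_labels = set()
    for _, labels in docs:
        all_labels.update(labels)
    k = len(all_labels)
    cf = class_frequency(docs)
    return {term: math.log(1.0 + k / count) for term, count in cf.items()}


def blend_scores(collab_scores, topic_scores, alpha=0.5):
    """Mix two score dicts keyed by resource id with weight `alpha`.

    `alpha` weights the collaborative filtering scores and (1 - alpha)
    weights the topic similarity scores. A resource missing from one
    dict contributes 0.0 from that side (an assumption).
    """
    resources = set(collab_scores) | set(topic_scores)
    return {
        r: alpha * collab_scores.get(r, 0.0)
        + (1.0 - alpha) * topic_scores.get(r, 0.0)
        for r in resources
    }
```

With a toy corpus in which "the" appears under every label and "goal" under only one, `cf_weights` gives "goal" the larger weight, which is the behaviour the CF-weight description calls for. How such weights would enter the L-LDA or Dependency-LDA sampling step is not stated in the abstract and is not modelled here.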
Keywords/Search Tags: Topic model, LDA, Multi-label classification, Tag recommendation, Recommender systems