
Research on Tag Recommendation for Community Questions Based on GBDT

Posted on: 2016-08-09
Degree: Master
Type: Thesis
Country: China
Candidate: W L Sun
GTID: 2308330479490082
Subject: Computer Science and Technology
Abstract/Summary:
With the development of Internet technology, information is posted, updated, and spread at a rapidly growing pace. To meet people's demand for information, encyclopedia-style platforms have emerged quickly and satisfy the need for knowledge in different professional fields. However, people pay more attention to open questions, and Q&A communities have been built for this reason. As questions accumulate in a Q&A community, handling them efficiently and concisely becomes a problem that community administrators have to face. Early communities marked questions with tags using an approach known as folksonomy. Unfortunately, this approach is a double-edged sword. Since then, researchers have investigated whether automatic tag recommendation can replace the manual approach. As research has continued, they have had to face common concerns such as system cold start, data sparseness, and very large matrices. Different models offer different solutions and have different defects, so combining several models to obtain a higher-performing model has become a trend. In this thesis, we explore and analyze how to recommend tags for questions with a machine learning method known as GBDT (Gradient Boosting Decision Tree). The main research content and results are as follows:

(1) First, we discuss methods for obtaining candidate question tags. By analyzing the keyword extraction process, we introduce several natural language processing techniques, such as Chinese word segmentation, POS tagging, TextRank, and TF-IDF. After comparing and analyzing the different methods, we present the method used for our model.

(2) For question feature extraction, we discuss how to calculate word similarity and how to extend tags with word embeddings and conditional probability. The experiments indicate that class information can improve the precision of our model.

(3) For model selection, we discuss how to combine different models with a machine learning method and how to rank tags with GBDT. The experiments show that our method achieves an 8 percent improvement over a certain online platform.
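
As an illustration of the candidate-tag step in (1), the following is a minimal sketch of TF-IDF keyword scoring. It is not the thesis's implementation: the function name `tfidf_candidates`, the toy questions, and the unsmoothed IDF formula are assumptions made for this example, and the input is assumed to be already segmented into tokens (for Chinese text, a word segmenter would run first).

```python
import math
from collections import Counter


def tfidf_candidates(docs, doc_index, top_k=3):
    """Rank the terms of docs[doc_index] by TF-IDF and return the top_k.

    docs is a list of token lists (hypothetical, pre-segmented questions).
    """
    n_docs = len(docs)

    # Document frequency: in how many questions each term appears.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    tokens = docs[doc_index]
    tf = Counter(tokens)

    scores = {}
    for term, count in tf.items():
        # Unsmoothed IDF (an assumption): terms found in every
        # question get a score of exactly 0.
        idf = math.log(n_docs / df[term])
        scores[term] = (count / len(tokens)) * idf

    # Sort by score (descending), then alphabetically so ties are stable.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Toy corpus of three tokenized questions (illustrative only).
questions = [
    ["how", "to", "train", "gbdt", "model"],
    ["how", "to", "install", "python"],
    ["how", "to", "tune", "gbdt", "learning", "rate"],
]

top = tfidf_candidates(questions, 0, top_k=2)
```

In a pipeline like the one the abstract describes, scores of this kind would be one of several features, alongside word similarity and class information, passed to the GBDT ranker.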
Keywords/Search Tags: Q&A community, question tag, recommendation, GBDT