| The Internet is an important channel for knowledge dissemination in today's society, and every Internet user is a node in that dissemination. Beyond forwarding and spreading existing content, users are increasingly becoming producers of knowledge themselves. Through such user-generated content (UGC), individual users share knowledge, enrich its content and variety, and change the traditional model of knowledge organization and information dissemination, which relied solely on professionally generated content (PGC). The rise of Q&A communities in recent years is one embodiment of this user knowledge sharing. A knowledge Q&A community is a knowledge-sharing community built around user search and user interaction: users search for answers or ask questions according to their own needs, and other users share their knowledge and experience by answering. Quora is the best-known such community abroad; in China, Zhihu is a highly active, interactive knowledge-sharing community that has been awarded the title of "Best Chinese Q&A community". High-quality content, a rational and professional atmosphere for discussion, and highly active user participation are Zhihu's core competitive strengths. To further improve content quality and interaction in the community, Zhihu has in recent years introduced a "question invitation mechanism", which aims to match each question with users who are suited to answer it and so shorten the time to an answer. On the one hand, users' knowledge sharing expands and develops the community's knowledge base; on the other hand, it strengthens users' stickiness to the community. To raise the success rate of question invitations, we can analyze users' past behavior data and the text of the questions, extract effective features, identify the key factors that influence answering behavior, and build a prediction model. To improve the efficiency of 
community knowledge sharing, specific questions are matched to the users who are most likely to answer them. This research is based on Zhihu data and takes the user as its focus. Statistical models fitted to a range of user indicators show that the number of historical answers, likes, activity level, interest preferences, and salt value (Zhihu's rating system for community users) are all positively correlated with whether a user accepts an invitation. In the prediction model built with a machine learning algorithm, these five features remain significant for prediction accuracy. The work consists mainly of the following. Drawing on existing techniques for user behavior prediction in other fields, such as retweet prediction on microblogging platforms and click-through rate (CTR) prediction for advertising, we analyze the characteristics of users and questions in a social Q&A community. Using officially published large-scale user behavior data, which include indicators on both users and questions, we apply feature engineering to process the large volume of raw data into a form that a neural network can accept and process. The core approach is to center the research on the user and mine useful information from the user's historical behavior, including information that reflects the user's characteristics and information that reflects the user's interaction with questions, in order to predict the user's future behavior. This paper designs and builds a model for predicting the answering behavior of users in an Internet Q&A community. Technically, the model is based on a machine learning framework and uses the boosting tree algorithm LightGBM, with a corresponding development environment configured for the code implementation. The final AUC is 0.887937, which indicates good predictive performance. This 
research starts from users' historical behavior data, analyzes their answering habits and preferences, combines these with the existing acceptance/rejection records, trains the prediction model, and validates it on the dataset. The abundant data samples allow this study to fully uncover the key features hidden in the user and question data, and the study offers research ideas and techniques for other areas of user research. |
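
For readers unfamiliar with the AUC metric reported above, it can be read as the probability that a randomly chosen accepted invitation receives a higher predicted score than a randomly chosen rejected one. The following is a minimal sketch of that calculation using the rank-based (Mann-Whitney) formulation. The function name `roc_auc` and the toy labels and scores are hypothetical and chosen only for illustration; they are not the study's data or code.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based (Mann-Whitney) AUC; tied scores share their average rank."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")

    # Sort sample indices by ascending score, then assign 1-based ranks.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of samples tied with the score at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    # Mann-Whitney U for the positive class, normalised by the number of
    # (positive, negative) pairs.
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Hypothetical toy data: 1 = invitation accepted, 0 = invitation not accepted.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.3, 0.6, 0.7, 0.2, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
print(round(toy_auc, 4))
```

In this toy example, 8 of the 9 (accepted, rejected) pairs are ordered correctly, so the function returns 8/9. An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking.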