Research On The Knowledge Representation And Model Ensemble In User Portrait Construction

Posted on:2018-08-14

Degree:Master

Type:Thesis

Country:China

Candidate:H C Li

GTID:2348330536460921

Subject:Computer software theory

Abstract/Summary:

With the rapid development of Internet technology, and of search engines such as Google and Baidu in particular, the volume of Internet data has grown explosively. Data is the oil of our era, so developing and using these massive data efficiently and systematically is particularly important. Every day our use of search engines leaves behind a large number of records, such as historical query words. These records provide a rich data basis for analyzing users' demographic attributes and interests and for building detailed, complete user portraits. Making full use of user behavior records to abstract a panoramic view of user attributes can be seen as the basis for enterprise applications of big data.

In 2016, the big data contest "Sogou User Portrait Mining", held by the China Computer Federation, provided one month of query words together with the users' demographic attribute labels (gender, age and education) as training data. On this historical query data we systematically compared and analyzed a variety of knowledge representation methods: the Bag of Ngrams method reflects differences in users' language habits, Topic Word Embedding extracts the topic information of the query words, and Doc2Vec summarizes the semantic associations between a user's query words. In addition, we improved the Doc2Vec model specifically for user query words, proposing two algorithms, Query Document Vector: Distributed Bag Of Words (qdv-dbow) and Query Document Vector: A Distributed Memory Model (qdv-dm), which further improve the quality of the knowledge representation of query words.

For the user portrait construction task, we present a two-level ensemble framework for predicting multidimensional demographic attribute labels (gender, age and education).
(1) In the first-level single-task models, we combine Trigram features with traditional machine learning models to capture differences in users' wording habits, and combine the Doc2Vec knowledge representation with a neural network model to extract the semantic associations among a user's queries.
(2) In the first-level multi-task models, we use the Very Deep Convolutional Neural Network model to extract context-related information at character granularity, and the FastText neural network model to characterize the user's query information at word granularity.
(3) In the second-level ensemble model, we use the XGBTree model and the Stacking multi-model fusion method to exploit the associations between the attribute labels of the user portrait, further improving the generalization ability and prediction accuracy of the model.
The proposed two-level ensemble framework won the championship in the big data contest "Sogou User Portrait Mining".
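The feature flow of a two-level stacking ensemble like the one described above can be illustrated with a minimal sketch. It is not the thesis implementation: the tiny naive Bayes base learner stands in for the first-level models, the helper names (`trigram_bag`, `out_of_fold`, `second_level_features`) and the toy queries and labels are hypothetical, and queries are assumed to be already segmented into whitespace-separated words. The second-level learner (XGBTree in the abstract) is left out; the sketch stops at the feature rows it would be trained on.

```python
from collections import Counter
import math


def trigram_bag(queries):
    """Count word 1-, 2- and 3-grams over one user's queries."""
    bag = Counter()
    for query in queries:
        tokens = query.split()  # assumption: queries are pre-segmented
        for n in (1, 2, 3):
            for i in range(len(tokens) - n + 1):
                bag[" ".join(tokens[i:i + n])] += 1
    return bag


class NaiveBayes:
    """Small multinomial naive Bayes; a stand-in first-level model."""

    def fit(self, bags, labels, classes):
        self.classes = list(classes)
        self.prior = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for bag, y in zip(bags, labels):
            self.counts[y].update(bag)
        self.vocab = set().union(*self.counts.values())
        self.totals = {c: sum(self.counts[c].values()) for c in self.classes}
        return self

    def predict_proba(self, bag):
        n = sum(self.prior.values())
        v = len(self.vocab) + 1  # +1 leaves room for unseen n-grams
        logp = []
        for c in self.classes:
            # Laplace smoothing on both the prior and the likelihood.
            lp = math.log((self.prior[c] + 1) / (n + len(self.classes)))
            for feat, k in bag.items():
                lp += k * math.log((self.counts[c][feat] + 1) / (self.totals[c] + v))
            logp.append(lp)
        m = max(logp)
        z = sum(math.exp(lp - m) for lp in logp)
        return [math.exp(lp - m) / z for lp in logp]


def out_of_fold(bags, labels, classes, k=3):
    """Class probabilities for each user from a model that never saw that user."""
    oof = [None] * len(bags)
    for fold in range(k):
        train = [i for i in range(len(bags)) if i % k != fold]
        model = NaiveBayes().fit([bags[i] for i in train],
                                 [labels[i] for i in train], classes)
        for i in range(fold, len(bags), k):
            oof[i] = model.predict_proba(bags[i])
    return oof


def second_level_features(bags, task_labels, k=3):
    """Concatenate the out-of-fold probabilities of every task, one row per user."""
    rows = [[] for _ in bags]
    for labels in task_labels.values():
        classes = sorted(set(labels))
        for row, probs in zip(rows, out_of_fold(bags, labels, classes, k)):
            row.extend(probs)
    return rows


# Toy data, invented purely for illustration.
users = [
    ["python tutorial", "python list sort"],
    ["weather tomorrow", "train ticket price"],
    ["python dict tutorial"],
    ["train ticket refund", "weather this week"],
    ["python list tutorial", "sort a list"],
    ["ticket price today"],
]
task_labels = {"gender": [0, 1, 0, 1, 0, 1], "age": [0, 0, 1, 1, 2, 2]}
rows = second_level_features([trigram_bag(u) for u in users], task_labels)
```

Each row holds the predicted probabilities for all attributes side by side, so a second-level model trained on these rows for one label (say, age) can also draw on the first-level evidence for the others. Building the rows from out-of-fold predictions keeps the second level from training on outputs that have already seen the target label.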

Keywords/Search Tags:

User portrait, Knowledge Representation, Model Ensemble, Tag Prediction, Deep Learning

Related items

1	Research On User Behavior Prediction Model And Its Application Based On Deep Walk And Ensemble Learning
2	Research On User Portrait Algorithm Based User On Dynamic Network Interest Model Behavior
3	Research On User Portrait Generation And Behavior Prediction For Securities Investment
4	Analysis And Research Of User Portrait Construction Algorithm Based On Behavior Data
5	Research On User Portrait Modeling And Ensemble Algorithm In Personal Credit Field
6	Research On Knowledge Representation Learning Method Based On Deep Embedding
7	Research And Implementation Of A-share Portrait And Rotation Based On Deep Learning
8	Research On Deep Learning Based Knowledge Graph Representation Learning Algorithm
9	Research And Application Of Family User Portrait Based On Deep Learning
10	Knowledge Representation Learning Based On Fine-grained Entity Description Information And AcrE Model