The Research Of User Profile Model Based On Search Data

Posted on: 2019-02-23
Degree: Master
Type: Thesis
Country: China
Candidate: T T Quan
Full Text: PDF
GTID: 2439330596466309
Subject: Management Science and Engineering
Abstract/Summary:
While the Internet brings convenience, it also brings information security problems that cannot be underestimated. In recent years, user information leaks have occurred frequently. They affect not only users' personal privacy but also the strategic security of enterprises, countries, and government agencies. As a result, users are increasingly reluctant to disclose their real information to third-party platforms, and it is very difficult for a search company to obtain data such as user attributes and preferences. Yet users' basic attributes and preference data are crucial for advertising in the search field. The most direct data a search company can obtain is the user's search data. This rich search data reflects what users actually care about and can be used to describe their basic attributes and preferences, which in turn can be characterized by a user profile. The user profile is the basis on which a company delivers personalized recommendations to a single user or a class of users, and it carries great commercial value.

Against this background, the research uses the search data of 200,000 Sogou users from the 2016 CCF Big Data and Computing Intelligence Contest. First, the development of advertisement placement in the search field is introduced. Relevant literature and key technologies are reviewed from the perspectives of short-text analysis and user profiling, and a user profile research framework based on search data is proposed. Second, a hybrid feature extraction model for search terms is proposed, built on three dimensions: Doc2Vec, TF-IDF, and hand-crafted features. For the TF-IDF features, following existing research, a Word2Vec word-vector weighting method is used to improve the TF-IDF algorithm, and experiments verify that the improvement is effective. Third, the main factors that affect search companies' advertising are analyzed, and the two with the greater impact, basic user attributes and user preferences, are selected as research points; user profile labels and a user profile construction process based on search data are proposed. Fourth, the extracted hybrid features are used as input to the basic attribute model of the user profile, several machine learning algorithms are trained, and the final basic attribute model is obtained through model fusion. Fifth, the hybrid features are reduced in dimension and merged with the user's basic attributes as input to the preference model of the user profile, which is built with the K-Means clustering algorithm with reference to Sina Weibo topic tags. Finally, an application of the model demonstrates its effectiveness.

Based on the results of each stage, the research also offers suggestions for advertisement introduction and placement in the search field. At present, applications in games, e-commerce, food, search, and other areas all generate large amounts of text, a very large proportion of which is short text. The hybrid feature extraction model and the user profile model presented in this study not only address short-text feature extraction and user profile construction in the search domain but can also be extended to user attribute prediction and other fields.
Keywords/Search Tags:Search Data, Machine Learning, Hybrid Feature Extraction Model, User Profile Model
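
The abstract's pipeline (text features plus hand-crafted features, then K-Means for preference groups) can be illustrated with a small sketch. This is not the thesis's implementation. It assumes plain smoothed TF-IDF in place of the Word2Vec-weighted variant, and it leaves out Doc2Vec, dimension reduction, and model fusion. The user ids, queries, the two hand-crafted features, and all function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, one TF-IDF vector per tokenized document)."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF so terms present in every document keep a non-zero weight.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


def handcrafted_features(queries):
    """Two toy per-user features: query count and mean query length in tokens."""
    lengths = [len(q.split()) for q in queries]
    return [float(len(queries)), sum(lengths) / len(lengths)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, initial_centroids, n_iter=20):
    """Plain Lloyd's algorithm with caller-supplied initial centroids (deterministic)."""
    centroids = [list(c) for c in initial_centroids]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid for each point.
        labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical search logs: user id -> queries (assumed already segmented by spaces).
user_queries = {
    "u1": ["cheap flights beijing", "hotel booking beijing"],
    "u2": ["cheap flights shanghai", "hotel deals shanghai"],
    "u3": ["football scores", "football news"],
    "u4": ["basketball scores", "football highlights"],
}

users = list(user_queries)
docs = [" ".join(user_queries[u]).split() for u in users]
vocab, text_vectors = tfidf_vectors(docs)
# Hybrid feature = TF-IDF text vector concatenated with hand-crafted features.
hybrid = [tv + handcrafted_features(user_queries[u]) for u, tv in zip(users, text_vectors)]
labels = kmeans(hybrid, initial_centroids=[hybrid[0], hybrid[2]])
print(dict(zip(users, labels)))  # {'u1': 0, 'u2': 0, 'u3': 1, 'u4': 1}
```

On this toy data the two travel-oriented users and the two sports-oriented users fall into separate clusters. In practice the hand-crafted features would need scaling so they do not dominate the text features, and cluster labels would still have to be mapped to readable preference tags, for which the thesis refers to Sina Weibo topic tags.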