
Research on Click-Through Rate Prediction of Sponsored Search Using Multi-Class Features

Posted on: 2014-01-19 | Degree: Master | Type: Thesis
Country: China | Candidate: T Liu | Full Text: PDF
GTID: 2248330398972146 | Subject: Signal and Information Processing
Abstract/Summary:
The click-through rate (CTR) of sponsored search is a major quantitative indicator for both search engine providers and advertisers, so CTR prediction is one of the key areas of computational advertising. Researchers in industry and academia have studied CTR prediction continuously, and every search engine provider has built its own CTR prediction system. The subject therefore has considerable research significance and practical value.
Centering on CTR prediction, this thesis builds a complete modeling pipeline. First, it studies the attributes of sponsored search in search engines and summarizes five major attributes. On this basis, it defines explicit and implicit features of advertisements and extracts the related features. It then introduces Probabilistic Relational Models into feature selection and classifies the features as directly related, indirectly related, or completely unrelated to the real CTR values. Next, it introduces Factorization Machines as the prediction model, whose input is the real-valued feature vector obtained after feature selection. The estimated CTR results are evaluated with AUC (Area Under Curve).
It is worth noting that, in current research, feature selection mainly focuses on position and advertisement attributes, without considering the scenes in which advertisements are triggered or the relationship between the advertisement and the user's query. The existing CTR prediction method based on advertisement category features predicts with the mean CTR value of advertisements in the same category, without combining it with other features to strengthen the prediction. Clustering advertisements directly also yields categories, but each advertisement then receives a single category, which is defined here as the uni-class feature of the advertisement.
An online advertisement itself has many themes, so a uni-class label loses its significance across different user query scenarios. This thesis therefore proposes a CTR prediction method based on multi-class features. After defining how a user's information retrieval triggers an advertisement, it extracts the multi-class features of advertisements by indirect clustering. Finally, the multi-class features are fed into Factorization Machines to predict CTR values. The findings show that adding advertisement multi-class features to the basic factual features clearly improves CTR prediction accuracy. Compared with direct clustering, indirect clustering in feature extraction yields multi-class labels for advertisements, greatly reduces the dimensionality of the sparse feature vectors, and effectively decreases the time cost of clustering.
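As a rough illustration of the prediction and evaluation steps named above, the following sketch computes a second-order Factorization Machine score for one real-valued feature vector and a pairwise AUC over a set of predictions. It is not the thesis's implementation: the function names, the parameter layout (`w0`, `w`, `V`), and the use of a sigmoid to map the score to a CTR estimate are assumptions made for this example, and no training procedure is shown.

```python
import math


def fm_score(x, w0, w, V):
    """Raw second-order FM score for one feature vector x.

    x  : list of n real-valued features
    w0 : global bias
    w  : list of n linear weights
    V  : n rows of k latent factors (one row per feature)
    All parameter names are hypothetical, chosen for this sketch.
    """
    n, k = len(x), len(V[0])
    linear = w0 + sum(w[i] * x[i] for i in range(n))
    # Pairwise term via the usual identity:
    # sum_{i<j} <v_i, v_j> x_i x_j
    #   = 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i (v_if x_i)^2 ]
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)
    return linear + pairwise


def predict_ctr(x, w0, w, V):
    """Map the FM score to (0, 1) with a sigmoid (an assumption of this sketch)."""
    return 1.0 / (1.0 + math.exp(-fm_score(x, w0, w, V)))


def auc(labels, scores):
    """Fraction of (clicked, not-clicked) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

In this sketch, multi-class features would simply occupy additional positions of `x` alongside the other selected features; the pairwise term is what lets the model weigh interactions between them.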
Keywords/Search Tags: sponsored search, click-through rate, multi-class features, probabilistic relational models, factorization machines