| With the rapid development of Internet technology and the promise of increased liquidity of information, online advertising has become the most popular and effective marketing approach, and it accounts for the majority of the income of the major internet companies. Click-through rate (CTR) prediction plays an important role in online advertising: it strongly affects the interests of advertisers, publishers, and internet users, and it has therefore attracted a lot of attention from internet companies. In this paper, we focus on display advertising and give a systematic introduction to, and analysis of, the online advertising system and the roles involved in it. We address three problems in online advertising systems. The first problem is constructing a feature engineering platform. We provide feature engineering methods for extracting and integrating useful information in real-world applications. Second, we need an effective model for accurate CTR prediction. Currently, machine learning approaches to this task mostly rely on linear models, which have difficulty capturing the relationship between user features and ad features. We propose a novel model, called Coupled Group Lasso (CGL), to obtain more accurate CTR predictions. In addition, our model performs feature selection, which is useful for fast online prediction. Third, we need to make our method scalable to fulfil the requirements of real-world applications. We use the hashing trick and an MPI-based distributed implementation of our algorithm to handle web-scale datasets. Experimental results on real-world datasets show that our model can achieve state-of-the-art performance on web-scale CTR prediction tasks. |