
Essays on recommender systems: Impact of sparse data and an information theoretic segmentation approach

Posted on: 2009-06-13
Degree: Ph.D.
Type: Dissertation
University: The University of Texas at Dallas
Candidate: Mescioglu, Ibrahim
Full Text: PDF
GTID: 1448390002999134
Subject: Business Administration
Abstract/Summary:
Firms are adopting a variety of recommendation systems to personalize the offerings on their websites. A common problem for existing personalization methods is the limited availability of data about customer preferences for products. Companies try to predict customers' future preferences using data from past sales. When all possible customer and product pairings are considered, the data available as a basis for predictions is quite sparse. I study two problems to address this inherent sparsity: (i) explaining the impact of sparse data on different recommender systems and (ii) proposing an information theoretic segmentation approach that makes better use of sparse data when predicting a customer's preference for a product.

For the first problem, I examine how several prominent methods make recommendations based on sparse datasets (sparsity being a typical characteristic of transactional or preference data). I identify the theoretical shortcomings of the various techniques when dealing with sparse datasets and find that the non-parametric Bayesian network models are least affected by missing data. Using two real datasets, I validate this finding empirically for two personalization and recommendation scenarios. Consistent with prior research, I find that model-based approaches outperform memory-based approaches in most instances. I further show that (i) the probabilistic approaches significantly outperform the non-probabilistic ones in all of the scenarios considered, and (ii) the non-parametric Bayesian network models generally perform better than the parametric probability models. Further, I show that as the data becomes more sparse, the performance of the non-parametric models improves relative to the other models.

For the second problem, I propose a unique information theoretic segmentation methodology that explicitly uses the distribution of customer preferences for the target products to better identify customers with similar preferences. Two types of data are generally available for segmenting customers: demographics and customers' preferences for other products. Various methodologies are compared to the proposed algorithm, and I find that the proposed segmentation approach either outperforms or performs as well as the traditional segmentation approach in all scenarios considered.
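As a rough illustration of what "information theoretic" can mean when choosing a segmentation, the sketch below scores candidate customer groupings by how much they reduce the entropy of preferences for a target product. It is a generic information-gain calculation, not the dissertation's algorithm, which the abstract does not specify. The function names (`entropy`, `information_gain`) and the toy data are hypothetical and chosen only for this example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(segments, preferences):
    """Entropy of `preferences` minus its expected entropy within each segment."""
    groups = {}
    for segment, preference in zip(segments, preferences):
        groups.setdefault(segment, []).append(preference)
    total = len(preferences)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(preferences) - conditional


# Hypothetical toy data: six customers, their preference for one target
# product, and two candidate ways of segmenting them.
preferences = ["buy", "buy", "skip", "skip", "buy", "skip"]
by_age = ["young", "young", "old", "old", "young", "old"]
by_region = ["north", "south", "north", "south", "north", "south"]

gain_age = information_gain(by_age, preferences)
gain_region = information_gain(by_region, preferences)
print(f"gain from age segments:    {gain_age:.3f} bits")
print(f"gain from region segments: {gain_region:.3f} bits")
```

In this toy data the age grouping separates buyers from non-buyers completely, so it removes all uncertainty about the target preference, while the region grouping removes very little. A segmentation criterion of this kind prefers groupings under which customers in the same segment share similar preference distributions for the target product.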
Keywords/Search Tags:Segmentation approach, Data, Information theoretic segmentation, Systems