Topics: Personalization, Recommendation Systems, Digital Marketing, Marketing Analytics
Methodologies: Machine Learning, Multi-Armed Bandits, Bayesian Econometrics
with Bas Donkers and Dennis Fok
- Best Paper Award in the PhD track at Marketing Dynamics Conference (2022)
- Amazon Research Award (Amazon Advertising, 2022)
Abstract: Real-time personalization engines help find the optimal offer for a specific customer, enabling effective customization in e-commerce. Yet developing such engines is not trivial. It remains challenging to optimize an offer strategy in real time, especially in a dynamic environment where the set of available offers varies over time. The complexity increases further when situational information is used alongside customer characteristics. We provide an easy-to-implement personalization engine that quickly learns, and serves, optimal context-dependent offers when the offer set may change over time. We formalize this personalization problem in the multi-armed bandit framework and propose a new contextual bandit algorithm boosted by particle filtering estimation. Our method allows firms to flexibly introduce new personalized offers, calibrate their impact using prior knowledge from historical data, and rapidly update these prior beliefs as new information arrives. In an application to news-article recommendation, we show that, relative to state-of-the-art competing methods, the proposed method improves the lift in click-through rate and is computationally efficient.
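To illustrate the kind of mechanism the abstract describes, here is a minimal, hypothetical sketch of a Thompson-sampling contextual bandit whose per-offer posterior is tracked with a particle filter. All names and numerical settings (particle counts, jitter scale, logistic click model) are illustrative assumptions, not the paper's actual algorithm; note how `add_arm` lets a new offer enter with a prior-informed particle cloud.

```python
import math
import random

random.seed(0)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class ParticleThompsonBandit:
    """Each arm (offer) keeps a weighted particle cloud over its coefficient
    vector; action choice is Thompson sampling from that cloud."""

    def __init__(self, n_arms, dim, n_particles=200, prior_scale=1.0):
        self.n_particles = n_particles
        self.particles = [
            [[random.gauss(0.0, prior_scale) for _ in range(dim)]
             for _ in range(n_particles)]
            for _ in range(n_arms)
        ]
        self.weights = [[1.0 / n_particles] * n_particles for _ in range(n_arms)]

    def select(self, context):
        # Thompson step: draw one particle per arm, pick the best predicted CTR.
        scores = []
        for cloud, w in zip(self.particles, self.weights):
            theta = random.choices(cloud, weights=w, k=1)[0]
            scores.append(sigmoid(sum(t * x for t, x in zip(theta, context))))
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, context, click):
        # Particle-filter step: reweight by the Bernoulli click likelihood.
        cloud, w = self.particles[arm], self.weights[arm]
        for i, theta in enumerate(cloud):
            p = sigmoid(sum(t * x for t, x in zip(theta, context)))
            w[i] *= p if click else (1.0 - p)
        total = sum(w) or 1.0
        self.weights[arm] = [wi / total for wi in w]
        # Resample when the effective sample size collapses.
        ess = 1.0 / sum(wi * wi for wi in self.weights[arm])
        if ess < self.n_particles / 2:
            cloud = random.choices(cloud, weights=self.weights[arm],
                                   k=self.n_particles)
            # Jitter restores particle diversity after resampling.
            self.particles[arm] = [[t + random.gauss(0.0, 0.05) for t in p]
                                   for p in cloud]
            self.weights[arm] = [1.0 / self.n_particles] * self.n_particles

    def add_arm(self, dim, prior_particles=None):
        # A new offer enters: seed its cloud from historical priors if given.
        cloud = prior_particles or [
            [random.gauss(0.0, 1.0) for _ in range(dim)]
            for _ in range(self.n_particles)
        ]
        self.particles.append(cloud)
        self.weights.append([1.0 / self.n_particles] * self.n_particles)
```

Because the posterior update touches only the chosen arm's particle cloud, each round costs O(particles × dimension), which is one way to read the abstract's claim of computational efficiency.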
Abstract: Personalization strategies often build on a large set of customer-specific and/or contextual variables to select optimally among many available marketing actions. Contextual multi-armed bandit algorithms help marketers adaptively select optimal customized actions. However, conventional contextual bandit algorithms usually consider only a small set of variables, whereas real-world problems involve many potentially relevant ones. Exploration helps identify the relevant variables, yet, when faced with a surplus of candidates, examining the impact of every variable leads to over-exploration and thus inefficiency. Addressing this challenge requires an adaptive modeling approach that supports the exploration process and effectively resolves the uncertainty in variable selection. We propose a new approach that uses variable selection techniques to learn both the optimal model specification and the action-selection strategy. We enhance model interpretability via feature decomposition, effectively identifying both irrelevant and relevant factors. Among relevant factors, we distinguish two types: common factors, which influence consumer behavior identically across all actions and hence do not affect the personalized policy, and action-specific factors, whose impact differs across actions and hence do affect the policy. Our method allows firms to run cost-efficient and interpretable bandit algorithms with high-dimensional contextual data.
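The three-way split the abstract describes (irrelevant / common / action-specific factors) can be made concrete with a small sketch. Given a fitted coefficient matrix, decompose each feature's column into a common part (its mean across actions) plus action-specific deviations, then label the feature. The function name and the thresholding rule are illustrative assumptions, not the paper's estimator.

```python
def classify_features(coef, tol=0.05):
    """coef[a][j]: estimated effect of feature j under action a.
    Decompose each column into a common component (mean over actions)
    and action-specific deviations, then label the feature.
    The tolerance `tol` is a hypothetical cutoff, not the paper's rule."""
    n_arms = len(coef)
    labels = []
    for j in range(len(coef[0])):
        col = [coef[a][j] for a in range(n_arms)]
        common = sum(col) / n_arms                    # shared component
        spread = max(abs(c - common) for c in col)    # action-specific part
        if abs(common) < tol and spread < tol:
            labels.append("irrelevant")       # no effect: safe to drop
        elif spread < tol:
            labels.append("common")           # shifts all actions equally,
                                              # so it cannot change the policy
        else:
            labels.append("action-specific")  # differs across actions,
                                              # so it drives the policy
    return labels
```

For example, a feature with coefficients (1.0, 1.0) across two actions is common, one with (0.5, -0.5) is action-specific, and one with near-zero effects everywhere is irrelevant; only the middle kind needs to be explored for personalization, which is where the exploration savings come from.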
Abstract: In a dynamic environment, it is crucial for marketers to continuously monitor the effectiveness of their marketing campaigns. An initially successful campaign may later have adverse effects due to factors such as changes in competitors’ strategies or seasonality. This is particularly important in personalized strategies that exploit relations between customer characteristics and the potential outcomes. Shifts in these relations may affect the optimal personalized actions and their profitability. We develop a contextual bandit algorithm with breakpoint detection to accommodate such non-stationary reward patterns. Both simulations and off-policy evaluations on real-world data show the potential of the proposed algorithm compared to existing benchmarks.
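As a toy illustration of pairing a bandit with breakpoint detection, the sketch below runs Bernoulli Thompson sampling and resets an arm's Beta posterior when a sliding-window mean drifts far from the long-run posterior mean. The class name, window test, and thresholds are hypothetical simplifications; the paper's detection procedure is not specified here.

```python
import random
from collections import deque

class BreakpointThompson:
    """Bernoulli Thompson sampling whose Beta posteriors are restarted
    when a simple sliding-window test flags a shift in an arm's reward rate."""

    def __init__(self, n_arms, window=50, threshold=0.2):
        self.alpha = [1.0] * n_arms   # Beta posterior successes + 1
        self.beta = [1.0] * n_arms    # Beta posterior failures + 1
        self.recent = [deque(maxlen=window) for _ in range(n_arms)]
        self.window = window
        self.threshold = threshold

    def select(self):
        samples = [random.betavariate(a, b)
                   for a, b in zip(self.alpha, self.beta)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        self.recent[arm].append(reward)
        self.alpha[arm] += reward
        self.beta[arm] += 1 - reward
        # Breakpoint check: compare the recent window mean against the
        # long-run posterior mean; a large gap suggests the reward
        # pattern has shifted (e.g. competitors, seasonality).
        n_obs = self.alpha[arm] + self.beta[arm] - 2
        if len(self.recent[arm]) == self.window and n_obs > 2 * self.window:
            post_mean = self.alpha[arm] / (self.alpha[arm] + self.beta[arm])
            win_mean = sum(self.recent[arm]) / self.window
            if abs(win_mean - post_mean) > self.threshold:
                # Restart learning for this arm from the recent window only,
                # discarding stale pre-break observations.
                wins = sum(self.recent[arm])
                self.alpha[arm] = 1.0 + wins
                self.beta[arm] = 1.0 + self.window - wins
```

The design choice worth noting: after a detected break the arm keeps only its post-break evidence, so its posterior widens and the algorithm automatically re-explores an action whose profitability may have changed.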
with Anja Lambrecht and Nicolas Padilla