LSTM Response Models for Direct Marketing Analytics: Replacing Feature Engineering with Deep Learning
Introduction
In direct marketing, a firm targets a customer with a marketing solicitation, such as a catalog, a direct mail piece, or a coupon, and the customer decides whether or not to respond. Since soliciting a customer unlikely to respond is unprofitable, and not soliciting a potentially profitable customer leaves money on the table, the ability to predict customers' responses has long been a crucial endeavor for both practitioners and academics (e.g., Malthouse, 1999; Roberts & Berger, 1999).
Response models in direct marketing predict customer responses from past customer behavior and marketing activity. These models often summarize past events using features such as recency or frequency (e.g., Blattberg et al., 2008; Malthouse, 1999; Van Diepen et al., 2009), and the process of feature engineering has received significant attention (Kuhn & Johnson, 2019; Zheng & Casari, 2018).
In machine learning, a feature refers to a variable that describes some aspect of individual data objects (Dong & Liu, 2018). Feature engineering has been used broadly to refer to multiple aspects of feature creation, extraction, and transformation. Essentially, it refers to the process of using domain knowledge to create useful features that can be fed as predictors into a model.
However, feature engineering presents its own set of challenges.
First, the same features might identically summarize widely different behavior sequences (Blattberg et al., 2008, Fader et al., 2005). Consider the customer behavior pattern depicted in Fig. 1. All four customers in the figure have the same seniority (date of first purchase), recency (date of last purchase), and frequency (number of purchases). However, each of them has a visibly different transaction pattern. A response model relying exclusively on seniority, recency, and frequency would not be able to distinguish between customers who have similar features but different behavioral sequences.
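To make the point concrete, a minimal sketch (with hypothetical purchase data) shows how two visibly different transaction patterns collapse into identical seniority, recency, and frequency features:

```python
# Minimal sketch (hypothetical data): two customers whose purchase
# histories differ, yet whose seniority, recency, and frequency
# features are identical -- the scenario depicted in Fig. 1.
def rfm_features(purchase_weeks, current_week=52):
    """Summarize a sequence of purchase times into classic features."""
    return {
        "seniority": current_week - min(purchase_weeks),  # weeks since first purchase
        "recency":   current_week - max(purchase_weeks),  # weeks since last purchase
        "frequency": len(purchase_weeks),                 # number of purchases
    }

early_burst = [1, 2, 3, 50]    # three early purchases, a long gap, one late purchase
steady      = [1, 17, 34, 50]  # evenly spaced purchases

print(rfm_features(early_burst))  # {'seniority': 51, 'recency': 2, 'frequency': 4}
print(rfm_features(steady))       # identical features, different behavior
```

A model fed only these three features is blind to the difference between the two sequences.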
Second, feature engineering becomes arduous in complex, data-rich environments with multiple streams of data: historical marketing activity of various sorts (e.g., multiple types of solicitations sent through various marketing channels), diverse customer behaviors (e.g., purchase histories across various product categories and sales channels), and different contexts (e.g., multiple business units or websites; see Park & Fader, 2004). The number of inter-sequence and inter-temporal interactions (e.g., sequences of marketing actions, such as email–phone–catalog vs. catalog–email–phone) grows exponentially with the number of data streams.
Let us reflect for a moment on one of the simplest and most commonly used features in direct marketing: recency, or the time elapsed since the last customer's purchase. How should the analyst hand-craft relevant recency features in an environment spanning multiple product categories? Should she take into account the last absolute recency, regardless of the product category purchased (hence losing richness and granularity, and potentially hurting the model's predictive power)? Should she include in the model as many recency indicators as there are product categories in the data set (hence creating excruciating multicollinearity issues if customers buy from multiple product categories at each purchase occasion)? Should she combine individual and aggregate recency indicators? When crafting relevant recency indicators, should the analyst consider purchases in brick-and-mortar stores and purchases on the firm's website jointly, or should she treat these indicators separately?
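The trade-offs above can be made tangible with a small sketch (hypothetical data and category names): the same purchase log yields either a single overall recency or one recency indicator per product category, and the analyst must choose between them.

```python
# Illustrative sketch (hypothetical data): one purchase log, two
# competing recency definitions the analyst could hand-craft.
purchases = [  # (week, product category)
    (10, "apparel"), (30, "home"), (45, "apparel"),
]
current_week = 52

# Option 1: a single, absolute recency, regardless of category.
overall_recency = current_week - max(week for week, _ in purchases)

# Option 2: one recency indicator per product category.
per_category_recency = {}
for week, cat in purchases:
    # keep the most recent purchase observed in each category
    per_category_recency[cat] = min(
        per_category_recency.get(cat, float("inf")), current_week - week
    )

print(overall_recency)       # 7
print(per_category_recency)  # {'apparel': 7, 'home': 22}
```

Neither option is obviously right, and the number of such choices multiplies with every additional channel or category.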
When an analyst uses feature engineering to predict behavior, the performance of the model will depend greatly on the analyst's domain knowledge, and in particular, her ability to translate that domain knowledge into relevant features for the model. In complex environments, such as in the presence of multiple channels or multiple product categories, it can be quite challenging indeed for an analyst to capture all useful inter-sequence and inter-temporal interactions.
In this paper, we explore whether Long Short-Term Memory (LSTM) neural networks, a special kind of recurrent neural network (RNN), which rely on raw sequential data and do away with feature engineering, can offer a solution to this general class of modeling problems in marketing.
In customer response models, the data are often in the form of panel data, where the firm's actions (e.g., solicitations) and customers' behavior (e.g., purchases) are observed repeatedly over time and along multiple dimensions (e.g., multiple channels or product categories).
Surprisingly, while RNN models are common in natural language processing, their applications to panel data—let alone marketing panel data—have been scarce, even close to nonexistent. In their seminal book, Goodfellow, Bengio, and Courville (2016) cite applications of RNN in the domains of machine translation, prediction of text sequences, handwriting recognition, and speech recognition. Pointer (2019, p. 70) mentions in passing that RNNs are particularly suited for “data that has a temporal domain (e.g., text, speech, video, and time-series data),” but dedicates the chapter to text analysis. Saleh (2018) dedicates an entire section to the numerous applications of RNN (pp. 153–157), but exclusively cites natural language processing, speech recognition, machine translation, unidimensional time-series forecasting, and image recognition. However, as we will demonstrate, RNN models in general, and LSTM models in particular, are particularly well suited for panel data analysis.
We organize the paper as follows. In the first section, we introduce the LSTM model as a special class of recurrent neural networks. Given the newness of the method to social scientists in general, and to marketing analysts in particular, we dedicate significant space to explaining its inner workings. While LSTM models take raw behavioral data as input and therefore do not rely on feature engineering or domain knowledge, our experience taught us that some fine-tuning is required to achieve optimal LSTM performance; in the second section, we pay special attention to the proper calibration of an LSTM model, including parameter and hyperparameter tuning, a process that can be fully automated and does not require domain knowledge either. In the third section, we demonstrate the superior performance of the LSTM model in a relatively simple direct marketing setting with only donations (yes/no) and solicitations (yes/no). We show that the LSTM model, relying on raw data, achieves a better average fit and performance than the feature-based benchmark models. In the fourth section, we benchmark a vanilla LSTM model in a much more complex environment (e.g., multiple channels and donation types) against 271 hand-crafted models developed by about as many human analysts; the LSTM outperforms 269 of them. In the fifth section, we discuss the marketing applications in which we expect LSTM neural networks to prove valuable. In the sixth section, we cover important technical considerations in the fast-moving field of deep learning. We conclude in the seventh section.
Section snippets
Recurrent Neural Network (RNN)
In a traditional feedforward neural network, a vector x is processed through propagation in the network and produces an output vector y, as depicted in Fig. 2(A). A recurrent neural network (RNN) is a kind of artificial neural network (ANN) adapted to model sequential tasks. Rather than relying exclusively on the vector x to make its predictions, an RNN also uses part of the output of the previous iteration (the hidden state) as input for the next prediction (see Fig. 2(B)). By
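The recurrence just described can be sketched in a few lines; the dimensions and random weights below are purely illustrative, not taken from the paper:

```python
import numpy as np

# Minimal sketch of the recurrence in Fig. 2(B): at each step the
# hidden state h combines the current input x<t> with the previous
# hidden state h<t-1>, so information propagates across time steps.
rng = np.random.default_rng(0)
n_x, n_h = 3, 4                      # input and hidden-state sizes (illustrative)
Wx = rng.normal(size=(n_h, n_x))     # input-to-hidden weights
Wh = rng.normal(size=(n_h, n_h))     # hidden-to-hidden (recurrent) weights
b  = np.zeros(n_h)

def rnn_step(x_t, h_prev):
    # one recurrence step: new hidden state from current input + previous state
    return np.tanh(Wx @ x_t + Wh @ h_prev + b)

h = np.zeros(n_h)                        # initial hidden state
for x_t in rng.normal(size=(5, n_x)):    # a sequence of 5 input vectors
    h = rnn_step(x_t, h)                 # h carries memory of earlier inputs
```

A feedforward network would map each x_t to an output independently; here, the loop over time steps is what makes the network recurrent.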
Bias, Variance, and Model Capacity
As discussed in the LSTM model section, the parameters of the LSTM module/cell are Wu, Wf, Wo, Wc, bu, bf, bo, and bc. We use the parameters Wy and by to generate the predictions ŷ<t> from the hidden state of the LSTM. The dimensions of the LSTM weight matrices depend on the dimension of the hidden state (referred to as hidden units) and the number of input features in x.
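As an illustration, the following sketch lays out these parameter shapes under one common LSTM formulation, in which each gate matrix acts on the concatenation of the previous hidden state and the current input; the sizes chosen here are arbitrary, not from the paper:

```python
import numpy as np

# Sketch of LSTM parameter dimensions (notation follows the paper:
# Wu, Wf, Wo, Wc with biases bu, bf, bo, bc; Wy, by map the hidden
# state to the prediction). Sizes are illustrative.
n_x, n_h = 6, 16   # input features in x<t>, hidden units

# In a common formulation, each gate weight matrix acts on the
# concatenation [h<t-1>, x<t>], so it has shape (n_h, n_h + n_x);
# each bias has shape (n_h,).
params = {name: np.zeros((n_h, n_h + n_x)) for name in ("Wu", "Wf", "Wo", "Wc")}
biases = {name: np.zeros(n_h) for name in ("bu", "bf", "bo", "bc")}

# Output layer: one logit per time step, computed from the hidden state.
Wy, by = np.zeros((1, n_h)), np.zeros(1)

n_params = sum(p.size for p in params.values()) \
         + sum(b.size for b in biases.values()) + Wy.size + by.size
print(n_params)  # 4*(16*22) + 4*16 + 16 + 1 = 1489
```

Growing the hidden state from 16 to 32 units roughly quadruples the recurrent part of the parameter count, which is why the number of hidden units is a key driver of model capacity.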
Objective
While an LSTM model does not depend on the analyst's ability to craft meaningful model features, traditional benchmarks do heavily rely on human expertise. Consequently, when an LSTM model shows superior results over a traditional response model—as we have shown in the previous illustration—we cannot ascertain whether it is due to the superiority of the LSTM model, or to the poor performance of the analyst who designed the benchmark model.
To alleviate that concern, we asked 297 graduate
Applications of LSTM Neural Networks in Marketing
Though we set our studies in a direct marketing context, LSTM neural networks can provide a solution to the general class of prediction tasks that involve panel data. We foresee that, since panel data is ubiquitous in marketing, LSTM neural networks can find widespread applications in marketing academia and practice. We discuss some possible applications below.
Technical Considerations
It would be presumptuous to claim that LSTM models offer an ideal, one-size-fits-all solution to panel data analytics. In particular, the analyst is invited to be mindful of the following challenges.
First, hyperparameter tuning is not a trivial task. While a simple grid search may be sufficient to achieve optimal performance, Bayesian optimization may be required on occasion.
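A plain grid search can be sketched as follows; `validation_loss` is a hypothetical stand-in for training the model under one configuration and scoring it on held-out data:

```python
from itertools import product

# Hedged sketch of a grid search over typical LSTM hyperparameters.
# The grid values and the objective are illustrative placeholders.
grid = {
    "hidden_units":  [16, 32, 64],
    "dropout":       [0.0, 0.2, 0.5],
    "learning_rate": [1e-2, 1e-3],
}

def validation_loss(config):
    # placeholder objective: in practice, train the LSTM with this
    # configuration and return its loss on a validation set
    return abs(config["hidden_units"] - 32) + config["dropout"]

best = min(
    (dict(zip(grid, values)) for values in product(*grid.values())),
    key=validation_loss,
)
print(best)  # {'hidden_units': 32, 'dropout': 0.0, 'learning_rate': 0.01}
```

When the grid becomes too large to enumerate, Bayesian optimization replaces the exhaustive loop with a sequential search that proposes promising configurations based on the losses observed so far.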
Second, as in all deep learning models, overfitting is a constant concern. Many solutions have been proposed, and can even be
Conclusions
Ben Weber (2019) stated that “One of the biggest challenges in machine learning workflows is identifying which inputs in your data will provide the best signals [i.e., features] for training predictive models. For image data and other unstructured formats, deep learning models are showing large improvements over prior approaches, but for data already in structured formats, the benefits are less obvious” [italics added].
In this paper, we have shown that recent neural network architectures,
References (70)
- et al. A stochastic RFM model. Journal of Interactive Marketing (1999)
- et al. Customer churn prediction in the online gambling industry: The beneficial effect of ensemble learning. Journal of Business Research (2013)
- et al. Maximizing profits for a multi-category catalog retailer. Journal of Retailing (2013)
- et al. Mailing smarter to catalog customers. Journal of Interactive Marketing (2000)
- Ridge regression and direct marketing scoring models. Journal of Interactive Marketing (1999)
- et al. A machine learning framework for customer purchase prediction in the non-contractual setting. European Journal of Operational Research (2020)
- et al. Capturing evolving visit behavior in clickstream data. Journal of Interactive Marketing (2004)
- et al. Market share forecasting: An empirical comparison of artificial neural networks and multinomial logit model. Journal of Retailing (1996)
- et al. The perils of proactive churn prevention using plan recommendations: Evidence from a field experiment. Journal of Marketing Research (2016)
- et al. Neural machine translation by jointly learning to align and translate. arXiv preprint (2014)
- Modeling the response pattern to direct marketing campaigns. Journal of Marketing Research
- Practical recommendations for gradient-based training of deep architectures
- Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks
- Mailing decisions in the catalog sales industry. Management Science
- Understanding batch normalization
- Database marketing: Analyzing and managing customers. International Series in Quantitative Marketing
- Random forests. Machine Learning
- Products of hidden Markov models
- Optimal selection for direct mail. Marketing Science
- Learning phrase representations using RNN encoder–decoder for statistical machine translation. arXiv preprint
- Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint
- Artificial intelligence and marketing: Pitfalls and opportunities. Journal of Interactive Marketing
- Feature Engineering for Machine Learning and Data Analytics
- Deriving target selection rules from endogenously selected samples. Journal of Applied Econometrics
- Optimizing Rhenania's direct marketing business through dynamic multilevel modeling (DMLM) in a multicatalog-brand environment. Marketing Science
- RFM and CLV: Using iso-value curves for customer base analysis. Journal of Marketing Research
- glmnet: Lasso and elastic-net regularized generalized linear models
- Dropout as a Bayesian approximation: Representing model uncertainty in deep learning
- Learning to forget: Continual prediction with LSTM
- Optimal mailing of catalogs: A new methodology using estimable structural dynamic programming models. Management Science
- How to compute optimal catalog mailing decisions. Marketing Science
- Deep Learning
- Offline handwriting recognition with multidimensional recurrent neural networks. Advances in Neural Information Processing Systems
- Neural Turing machines. arXiv preprint