An Exponential Factorization Machine with Percentage Error Minimization to Retail Sales Forecasting
ACM Transactions on Knowledge Discovery from Data (IF 3.6), Pub Date: 2021-01-04, DOI: 10.1145/3426238
Chongshou Li, Brenda Cheang, Zhixing Luo, Andrew Lim

This article proposes a new approach to sales forecasting for new products (stock-keeping units [SKUs]) with long lead times but short product life cycles. These SKUs are usually sold for one season only, without any replenishment. An exponential factorization machine (EFM) sales forecast model is developed to solve this problem; it takes into account not only SKU attributes but also their pairwise interactions. The EFM model differs significantly from the original factorization machine (FM) in two respects: (1) an attribute-level formulation for the explanatory/input variables; and (2) an exponential formulation for the positive response/output/target variable. The attribute-level formulation excludes infeasible intra-attribute interactions and results in more efficient feature engineering than conventional one-hot encoding, while the exponential formulation is shown to be more effective than the log-transformation for responses that are positive but not skewed. To estimate the parameters, percentage error squares (PES) and error squares (ES) are minimized over the training set by a proposed adaptive batch gradient descent method. To overcome over-fitting, a greedy forward stepwise feature selection method is proposed to select the most useful attributes and interactions. Real-world data provided by a footwear retailer in Singapore are used to test the proposed approach. The forecasting performance in terms of both mean absolute percentage error (MAPE) and mean absolute error (MAE) compares favorably not only with off-the-shelf models but also with results reported by extant sales and demand forecasting studies. The effectiveness of the proposed approach is also demonstrated on two external public datasets. Moreover, we prove the theoretical relationships between PES and ES minimization, and present an important property of PES minimization for regression models: it trains models to underestimate the data.
This property fits the situation of sales forecasting where unit-holding cost is much greater than the unit-shortage cost (e.g., perishable products).
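
The underestimation property can be illustrated in the simplest possible setting, a model that predicts one constant c for every SKU. The sketch below is not the paper's EFM or its adaptive batch gradient descent method. It assumes that PES means the sum of squared relative errors, sum(((y_i - c) / y_i)^2), and that ES means sum((y_i - c)^2). The function name and the sales figures are made up for illustration.

```python
def constant_minimizers(y):
    """Return (c_es, c_pes): the constants minimizing ES and PES.

    Assumed definitions (not taken from the paper's text):
      ES(c)  = sum_i (y_i - c)^2
      PES(c) = sum_i ((y_i - c) / y_i)^2, for strictly positive y_i
    """
    y = [float(v) for v in y]
    if not y or any(v <= 0 for v in y):
        raise ValueError("targets must be non-empty and strictly positive")

    # ES is minimized by the arithmetic mean.
    c_es = sum(y) / len(y)

    # Setting dPES/dc = 0 gives sum(1/y_i) = c * sum(1/y_i^2).
    inv = [1.0 / v for v in y]
    c_pes = sum(inv) / sum(w * w for w in inv)
    return c_es, c_pes


# Hypothetical unit sales for three SKUs.
sales = [10.0, 20.0, 40.0]
c_es, c_pes = constant_minimizers(sales)
print(f"ES-optimal constant:  {c_es:.2f}")
print(f"PES-optimal constant: {c_pes:.2f}")
```

On this toy data the PES-optimal constant (about 13.33) is well below the ES-optimal constant (about 23.33). Relative errors weight small targets more heavily, which pulls the fit downward.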

Updated: 2021-01-04