Latent association graph inference for binary transaction data
Computational Statistics & Data Analysis (IF 1.5), Pub Date: 2021-03-27, DOI: 10.1016/j.csda.2021.107229
David Reynolds, Luis Carvalho

A novel approach to statistical inference for multivariate binary transaction data is proposed. A fundamental question that arises from these data, often referred to as market basket data, is how the items relate to one another. These relationships are naturally expressed by a graph, and transactions can be modelled as samples of cliques from this association graph. A hierarchical model is developed that follows from this generative idea, along with an MCMC sampling procedure that handles large datasets and allows inference on a broad set of parameters. The model provides a sparser representation of associations between items than frequent itemset mining (FIM) output, without sacrificing predictive accuracy. Because it allows inference on a broad set of parameters, the model also provides deeper insight into transaction data. Empirical results are provided for applications of the model to simulated data and to real transaction data from Instacart.
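The generative idea in the abstract, that each transaction is a clique drawn from an association graph, can be illustrated with a small sketch. This is not the authors' hierarchical model or their MCMC procedure: the toy graph, the `sample_clique` helper, and the `grow_prob` parameter are all assumptions made for illustration, and the greedy random growth rule is only one simple way to draw a clique.

```python
import random
from itertools import combinations


def sample_clique(adj, rng, grow_prob=0.7):
    """Draw one transaction as a random clique of the association graph.

    Illustrative rule (an assumption, not the paper's sampler): start from a
    random item, then keep adding an item that is adjacent to every item
    already chosen, stopping with probability 1 - grow_prob at each step.
    """
    items = sorted(adj)
    clique = {rng.choice(items)}
    while True:
        # Items that would keep the current set fully connected.
        candidates = sorted(
            v for v in items
            if v not in clique and all(v in adj[u] for u in clique)
        )
        if not candidates or rng.random() > grow_prob:
            return clique
        clique.add(rng.choice(candidates))


def is_clique(adj, nodes):
    """Check that every pair of items in `nodes` is linked in the graph."""
    return all(v in adj[u] for u, v in combinations(nodes, 2))


# Hypothetical toy association graph (undirected, stored as adjacency sets).
adj = {
    "bread": {"butter", "jam"},
    "butter": {"bread", "jam"},
    "jam": {"bread", "butter"},
    "chips": {"salsa"},
    "salsa": {"chips"},
    "milk": set(),
}

rng = random.Random(0)
transactions = [sample_clique(adj, rng) for _ in range(5)]
for t in transactions:
    print(sorted(t))
```

Under this sketch, items that never share an edge (for example `bread` and `chips`) can never appear in the same simulated basket, which is the sense in which the graph encodes item associations.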




Updated: 2021-03-27