Product formalisms for measures on spaces with binary tree structures: representation, visualization, and multiscale noise
Forum of Mathematics, Sigma (IF 1.2), Pub Date: 2020-11-13, DOI: 10.1017/fms.2020.40
Devasis Bassu, Peter W. Jones, Linda Ness, David Shallcross

In this paper, we present a theoretical foundation for a representation of a data set as a measure in a very large hierarchically parametrized family of positive measures, whose parameters can be computed explicitly (rather than estimated by optimization), and illustrate its applicability to a wide range of data types. The preprocessing step then consists of representing data sets as simple measures. The theoretical foundation consists of a dyadic product formula representation lemma, and a visualization theorem. We also define an additive multiscale noise model that can be used to sample from dyadic measures and a more general multiplicative multiscale noise model that can be used to perturb continuous functions, Borel measures, and dyadic measures. The first two results are based on theorems in [15, 3, 1]. The representation uses the very simple concept of a dyadic tree and hence is widely applicable, easily understood, and easily computed. Since the data sample is represented as a measure, subsequent analysis can exploit statistical and measure theoretic concepts and theories. Because the representation uses the very simple concept of a dyadic tree defined on the universe of a data set, and the parameters are simply and explicitly computable and easily interpretable and visualizable, we hope that this approach will be broadly useful to mathematicians, statisticians, and computer scientists who are intrigued by or involved in data science, including its mathematical foundations.
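
To make the phrase "parameters can be computed explicitly" concrete, here is a minimal sketch, not the authors' code. It assumes the data are scalar samples in [0, 1), that the dyadic tree is the standard halving of that interval, and that each parameter is the normalized difference between the mass of the left and right child of a dyadic interval, set to 0 where the interval carries no mass. The function names (`dyadic_counts`, `product_coefficients`, `reconstruct_mass`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def dyadic_counts(samples, depth):
    """Count samples in each dyadic interval [k/2^j, (k+1)/2^j), j = 0..depth.

    Assumption: every sample lies in [0, 1).
    """
    counts = defaultdict(int)
    for x in samples:
        if not 0.0 <= x < 1.0:
            raise ValueError("samples are assumed to lie in [0, 1)")
        for j in range(depth + 1):
            k = int(x * 2 ** j)  # index of the level-j interval containing x
            counts[(j, k)] += 1
    return counts


def product_coefficients(samples, depth):
    """Return {(j, k): a} with a = (mass(left) - mass(right)) / mass(parent).

    Each coefficient lies in [-1, 1]; it is set to 0 when the parent
    interval contains no samples (an assumed convention).
    """
    counts = dyadic_counts(samples, depth)
    coeffs = {}
    for j in range(depth):
        for k in range(2 ** j):
            total = counts.get((j, k), 0)
            if total == 0:
                coeffs[(j, k)] = 0.0
                continue
            left = counts.get((j + 1, 2 * k), 0)
            right = counts.get((j + 1, 2 * k + 1), 0)
            coeffs[(j, k)] = (left - right) / total
    return coeffs


def reconstruct_mass(coeffs, total_mass, j, k):
    """Recover the mass of interval (j, k) as a product over its ancestors.

    Going to a left child multiplies the mass by (1 + a) / 2,
    going to a right child multiplies it by (1 - a) / 2.
    """
    mass = total_mass
    for level in range(j):
        ancestor = k >> (j - level)      # ancestor index at this level
        child = k >> (j - level - 1)     # its child on the path to (j, k)
        sign = 1.0 if child % 2 == 0 else -1.0
        mass *= 0.5 * (1.0 + sign * coeffs[(level, ancestor)])
    return mass


if __name__ == "__main__":
    samples = [0.1, 0.2, 0.3, 0.8]
    coeffs = product_coefficients(samples, depth=2)
    for key in sorted(coeffs):
        print(key, coeffs[key])
    print(reconstruct_mass(coeffs, len(samples), 2, 0))
```

In this sketch the coefficients are read directly from counts, with no optimization step, and the product over ancestors reproduces the count in each dyadic interval down to the chosen depth.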

Updated: 2020-11-13