Coal elemental (compositional) data analysis with hierarchical clustering algorithms
International Journal of Coal Geology (IF 5.6), Pub Date: 2021-11-09, DOI: 10.1016/j.coal.2021.103892
Na Xu 1, Chuanpeng Xu 1, Robert B. Finkelman 1,2, Mark A. Engle 3, Qing Li 4, Mengmeng Peng 1, Lizhi He 1, Bin Huang 1, Yuchen Yang 1

The modes of occurrence of elements in coal are extremely important for deciphering the geological processes of coal formation and for anticipating the technological behavior and the environmental and health impacts of coal utilization. Hierarchical clustering algorithms have been widely adopted to investigate the modes of occurrence of elements in coal. Applying traditional statistics (e.g., Pearson correlation, Euclidean distance) directly to coal elemental data may lead to misinterpretation, because these data are compositional in nature and follow the rules of Aitchison geometry. This work applies log-ratio transformations to overcome this problem. Different hierarchical clustering algorithms, combined with various data transformations, can infer modes of occurrence of elements in coal, but which algorithm is optimal deserves investigation. In this paper, we discuss four commonly used hierarchical clustering algorithms (single, complete, centroid, and average linkage) applied to data transformed with pivot coordinates and weighted symmetric pivot coordinates (WSPC), two types of log-ratio transformation, to infer modes of occurrence of elements in coal from published coal elemental data. The results show that, in clustering rare earth elements and Y, the Pearson correlation produces more meaningful results than the Euclidean distance. For these coal elemental data, WSPC produces more interpretable results than pivot coordinates. Compared with the single-, complete-, and centroid-linkage algorithms, the average-linkage algorithm is the optimum.
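The workflow the abstract describes (log-ratio transformation, then a correlation-based distance, then average-linkage clustering of elements) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses only the first pivot coordinate of each element (which is proportional to the centred log-ratio) and does not implement WSPC; the distance 1 - r, the function names `first_pivot_coordinates` and `cluster_elements`, and the synthetic six-part data set are all hypothetical choices made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform


def first_pivot_coordinates(X):
    """Return an (n_samples, D) array whose column l is the first pivot
    coordinate obtained when part l is placed in the pivot position:
    sqrt((D-1)/D) * ln(x_l / geometric mean of the other D-1 parts).
    """
    X = np.asarray(X, dtype=float)
    if np.any(X <= 0):
        raise ValueError("log-ratios need strictly positive values")
    D = X.shape[1]
    log_x = np.log(X)
    # Mean log of the other D-1 parts, computed for every part at once.
    mean_log_others = (log_x.sum(axis=1, keepdims=True) - log_x) / (D - 1)
    return np.sqrt((D - 1) / D) * (log_x - mean_log_others)


def cluster_elements(X, method="average", n_clusters=2):
    """Cluster the columns (elements) of X after the log-ratio transform.

    Assumption for this sketch: dissimilarity between two elements is
    1 - Pearson r of their first pivot coordinates.
    """
    Z = first_pivot_coordinates(X)
    r = np.corrcoef(Z, rowvar=False)      # element-by-element correlations
    dist = 1.0 - r
    dist = (dist + dist.T) / 2.0          # remove tiny asymmetries
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method=method)
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Synthetic (hypothetical) data: parts 0-2 follow one latent factor,
# parts 3-5 follow another; rows are then closed to sum to 1.
rng = np.random.default_rng(0)
n = 200
a = rng.normal(size=(n, 1))
b = rng.normal(size=(n, 1))
raw = np.exp(np.hstack([a + 0.1 * rng.normal(size=(n, 3)),
                        b + 0.1 * rng.normal(size=(n, 3))]))
X = raw / raw.sum(axis=1, keepdims=True)

labels = cluster_elements(X, method="average", n_clusters=2)
print(labels)
```

Because the coordinates are built from log-ratios, rescaling every sample (for example from fractions to percent or ppm) leaves them unchanged, which is the property that raw Pearson correlation or Euclidean distance on closed data lacks. Passing `method="single"` or `method="complete"` to `cluster_elements` gives a quick way to compare linkage rules on the same distance matrix.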




Updated: 2021-11-26