Cluster analysis of crude oils with k-means based on their physicochemical properties
Computers & Chemical Engineering (IF 3.9), Pub Date: 2021-12-09, DOI: 10.1016/j.compchemeng.2021.107633
A. Sancho, J.C. Ribeiro, M.S. Reis, F.G. Martins

The physicochemical properties of crude oils vary significantly with geographical origin. The standard categorization of crude oils is based roughly on density and sulfur content and ignores other properties that can meaningfully affect blending and some refining processes. Cluster analysis is an unsupervised machine learning technique that groups observations by their similarity. In this work, the k-means clustering algorithm was applied to a wide range of physicochemical properties to identify groups of crude oils with high affinity that may behave similarly in downstream operations.
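As a rough illustration of the grouping step described above, the following dependency-free sketch standardizes each property and then runs plain Lloyd-style k-means. It is not the authors' implementation: the function names `standardize` and `kmeans`, the random initialization, and the six toy (density, sulfur) pairs are assumptions made for illustration, and none of the values come from the Galp data set.

```python
import math
import random


def standardize(rows):
    """Z-score each column so no single property dominates the distances."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for a constant column to avoid dividing by zero.
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm: assign each point to its nearest centroid, then
    move each centroid to the mean of its members, until labels stop changing."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k randomly chosen observations.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged: no observation changed cluster
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy (density, sulfur content) pairs, invented for this example only.
crudes = [
    (0.80, 0.1), (0.81, 0.2), (0.82, 0.3),   # lighter, low-sulfur group
    (0.95, 2.0), (0.96, 2.1), (0.97, 2.2),   # heavier, high-sulfur group
]
labels, centroids = kmeans(standardize(crudes), k=2)
print(labels)
```

The cluster numbers themselves are arbitrary and depend on the initialization. Only the grouping of observations is meaningful.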

A data set from Galp SA refineries (located in Portugal) was used in the present analysis. It contains 454 observations of 9 properties from 45 different crude oil sources. After suitable preprocessing, k-means was applied with different numbers of clusters, and performance was evaluated with two internal validation metrics: the silhouette index and the Local Cores-based Cluster Validity (LCCV) index. The recommended number of clusters was 3, which gave the best performance, with an LCCV index of 0.39. Crude oils from the same source should fall in the same cluster, and external validation corroborated this: only 1.8% of the observations were placed in a different cluster from the majority of crude oils from the same source. When more clusters were considered, the proposed method was also able to identify observations with unusually high iron content relative to other crude oils from the same source.
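The two kinds of validation mentioned above can be sketched as follows. This is a hedged illustration, not the authors' code: `mean_silhouette` computes the standard average silhouette width, and `minority_fraction` reflects one reading of the external check (the share of observations whose cluster differs from the most common cluster of their source). Both function names, the source names "A" and "B", and the toy points are assumptions. The LCCV index is not covered here.

```python
import math
from collections import Counter, defaultdict


def mean_silhouette(points, labels):
    """Average silhouette width; needs at least two clusters.
    Values near 1 indicate tight, well-separated clusters."""
    clusters = defaultdict(list)
    for p, lab in zip(points, labels):
        clusters[lab].append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


def minority_fraction(sources, labels):
    """Share of observations placed outside their source's majority cluster."""
    by_source = defaultdict(Counter)
    for src, lab in zip(sources, labels):
        by_source[src][lab] += 1
    outside = sum(
        sum(counts.values()) - counts.most_common(1)[0][1]
        for counts in by_source.values()
    )
    return outside / len(labels)


# Toy, already-scaled observations and hypothetical source names.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
sources = ["A", "A", "A", "B"]

good_labels = [0, 0, 1, 1]  # splits the two visually obvious groups
bad_labels = [0, 1, 0, 1]   # mixes them

good = mean_silhouette(points, good_labels)
bad = mean_silhouette(points, bad_labels)
print(round(good, 3), round(bad, 3))
print(minority_fraction(sources, good_labels))
```

In a scan over candidate cluster numbers, the same silhouette function would be evaluated on the labels produced for each k and the values compared. When two clusters tie within a source, `most_common` picks whichever was seen first, which does not change the fraction.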

This work provides a methodology for obtaining a better categorization of crude oils through cluster analysis, allowing refineries to know how similar crude oils and their sources are. This categorization is useful for improving the formulation of crude blends and crude oil quality control, with the goal of further optimizing refining operations.




Updated: 2021-12-20