Web-based machine learning tool that determines the origin of natural gases
Computers & Geosciences (IF 4.2), Pub Date: 2020-12-01, DOI: 10.1016/j.cageo.2020.104595
John E. Snodgrass, Alexei V. Milkov

Abstract: Investigations into the origin of natural gases traditionally involve manually plotting values of various geochemical parameters on binary gas genetic diagrams and comparing them with empirically defined gas genetic fields. However, these fields overlap considerably, and the accuracy and uncertainty of the derived origin are not quantified. To overcome these issues, we developed a web-based tool, powered by a machine learning model, that determines the origin of natural gases. The large global dataset of natural gases used (27,852 samples) includes 10,937 samples that we manually interpreted and labeled with one of five gas origins (thermogenic, primary microbial from CO2 reduction, primary microbial from methyl-type fermentation, secondary microbial, and abiotic). The supervised machine learning model uses a random forest algorithm to classify natural gas samples based on four features (the geochemical parameters CH4/(C2H6+C3H8), δ13C–CH4, δ2H–CH4 and δ13C–CO2). For samples of unknown origin, the model reports the determined origin together with the model accuracy and a confidence score for each possible origin. The model is deployed on the website www.gasorigin.com with a simple, user-friendly interface. Future developments will incorporate more data and additional geochemical parameters (model features) and will address the determination of post-generation processes.
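As a rough illustration of the inputs and outputs described in the abstract, the sketch below builds the four-feature vector for one sample and turns per-tree class votes into a confidence score for each of the five origins. It is not the authors' implementation: the function names, the sample values, the ten hypothetical tree votes, and the choice of vote share as the confidence score are all assumptions made for illustration only.

```python
from collections import Counter

# The five gas origins named in the abstract.
ORIGINS = (
    "thermogenic",
    "primary microbial (CO2 reduction)",
    "primary microbial (methyl-type fermentation)",
    "secondary microbial",
    "abiotic",
)


def gas_features(ch4, c2h6, c3h8, d13c_ch4, d2h_ch4, d13c_co2):
    """Build the four-feature vector: CH4/(C2H6+C3H8) plus three isotope values."""
    heavier = c2h6 + c3h8
    if heavier <= 0:
        raise ValueError("C2H6 + C3H8 must be positive to form the ratio")
    return (ch4 / heavier, d13c_ch4, d2h_ch4, d13c_co2)


def confidence_from_votes(votes):
    """Convert per-tree class votes into a vote share per origin.

    Assumption: confidence is taken to be the fraction of trees voting for
    each origin; the abstract does not say how its scores are computed.
    """
    if not votes:
        raise ValueError("need at least one vote")
    counts = Counter(votes)
    unknown = set(counts) - set(ORIGINS)
    if unknown:
        raise ValueError(f"unknown origin labels: {sorted(unknown)}")
    return {origin: counts[origin] / len(votes) for origin in ORIGINS}


# Hypothetical sample (concentrations in %, isotope values in per mil).
features = gas_features(
    ch4=90.0, c2h6=6.0, c3h8=3.0,
    d13c_ch4=-45.0, d2h_ch4=-200.0, d13c_co2=-10.0,
)

# Hypothetical votes from a 10-tree forest for that sample.
votes = ["thermogenic"] * 7 + ["secondary microbial"] * 3
scores = confidence_from_votes(votes)
best = max(scores, key=scores.get)
```

Here `features` would be the input row for a trained classifier, and `scores` mirrors the kind of per-origin output the abstract describes: every origin receives a score, and the scores sum to one.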

Updated: 2020-12-01