3D lithological mapping of borehole descriptions using word embeddings
Computers & Geosciences (IF 4.4), Pub Date: 2020-08-01, DOI: 10.1016/j.cageo.2020.104516
Ignacio Fuentes, José Padarian, Takuya Iwanaga, R. Willem Vervoort

Abstract In recent years, the exponential growth in digital data and the expansion of machine learning have fostered the development of new applications in the geosciences. Natural Language Processing (NLP) tackles various issues that arise from using human language data. In this study, NLP is applied to classify and map lithological descriptions in three-dimensional space. The data originate from the Australian Groundwater Explorer dataset of the Bureau of Meteorology, which contains the descriptions and geolocations of bores drilled in New South Wales (NSW), Australia. A GloVe model trained on scientific journal articles and Wikipedia content related to the geosciences was used to obtain embeddings (vectors) from the borehole descriptions. In parallel, and as a baseline, the descriptions were classified by combining regular expressions with expert criteria. The description embeddings were subsequently classified using a multilayer perceptron neural network (MLP), and performance was evaluated using several accuracy metrics. The embeddings were then interpolated by triangulation, and the interpolated embeddings were classified using the trained MLP and compared against a nearest neighbour (NN) interpolation of lithological classes. The descriptions were mapped using 3D voxels. Coupling NLP with supervised classification alternatives and interpolation methods resulted in a reasonable 3D representation of lithologies. This methodology is a first step in demonstrating the applicability of NLP to the geosciences, and it also allows uncertainty to be quantified at the different steps of the process, such as classification and interpolation. Interpolation techniques, although acceptable, might be replaced by machine learning techniques to improve the performance of 3D models.
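
The pipeline described in the abstract (embed each description, interpolate the embeddings in space, then classify the interpolated embeddings per voxel) can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the three-dimensional word vectors stand in for trained GloVe embeddings, a nearest-prototype rule stands in for the trained MLP, inverse-distance weighting stands in for triangulation, and the borehole intervals and voxel centres are invented.

```python
import math

# Hypothetical 3-dimensional word vectors standing in for trained GloVe embeddings.
TOY_VECTORS = {
    "sand": [1.0, 0.0, 0.0],
    "gravel": [0.9, 0.1, 0.0],
    "clay": [0.0, 1.0, 0.0],
    "silty": [0.1, 0.9, 0.0],
    "basalt": [0.0, 0.0, 1.0],
    "rock": [0.0, 0.1, 0.9],
}

# Hypothetical class prototypes; a nearest-prototype rule stands in for the MLP.
CLASS_PROTOTYPES = {
    "sand": [1.0, 0.0, 0.0],
    "clay": [0.0, 1.0, 0.0],
    "rock": [0.0, 0.0, 1.0],
}


def distance(p, q):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def embed_description(text):
    """Mean vector of the known words in a description, or None if none are known."""
    known = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not known:
        return None
    return [sum(column) / len(known) for column in zip(*known)]


def classify(embedding):
    """Return the class whose prototype is closest to the embedding."""
    return min(CLASS_PROTOTYPES, key=lambda c: distance(embedding, CLASS_PROTOTYPES[c]))


def interpolate_embedding(point, samples, power=2.0):
    """Inverse-distance-weighted mean of (coords, embedding) samples at a point."""
    weights = []
    for coords, embedding in samples:
        d = distance(point, coords)
        if d == 0.0:
            return list(embedding)  # the point coincides with a sample
        weights.append(1.0 / d ** power)
    total = sum(weights)
    dim = len(samples[0][1])
    return [
        sum(w * emb[i] for w, (_, emb) in zip(weights, samples)) / total
        for i in range(dim)
    ]


def map_voxels(voxel_centres, intervals):
    """Assign a lithology class to each voxel centre from described intervals."""
    samples = [(xyz, embed_description(text)) for xyz, text in intervals]
    samples = [(xyz, emb) for xyz, emb in samples if emb is not None]
    return {v: classify(interpolate_embedding(v, samples)) for v in voxel_centres}


# Invented borehole intervals: (x, y, z) midpoint and free-text description.
INTERVALS = [
    ((0.0, 0.0, 0.0), "coarse sand gravel"),
    ((10.0, 0.0, 0.0), "silty clay"),
    ((0.0, 0.0, -10.0), "weathered basalt rock"),
]
VOXELS = [(1.0, 0.0, 0.0), (9.0, 0.0, 0.0), (0.0, 0.0, -9.0)]

lithology = map_voxels(VOXELS, INTERVALS)
for voxel, label in lithology.items():
    print(voxel, label)
```

Interpolating the embeddings before classifying, rather than interpolating class labels directly, is the ordering the abstract describes; the nearest neighbour baseline it mentions would instead assign each voxel the class of its closest described interval.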

Updated: 2020-08-01