Uncertainty quantification for data-driven turbulence modelling with Mondrian forests
Journal of Computational Physics (IF 4.1), Pub Date: 2021-01-11, DOI: 10.1016/j.jcp.2021.110116
Ashley Scillitoe, Pranay Seshadri, Mark Girolami

Data-driven turbulence modelling approaches are gaining increasing interest from the CFD community. Such approaches generally aim to improve the modelled Reynolds stresses by leveraging data from high-fidelity turbulence-resolving simulations. However, adding a machine learning (ML) model introduces a new source of uncertainty: the ML model itself. Quantifying this uncertainty is essential, since the predictive capability of a data-driven model diminishes when it predicts physics not seen during training. In this work, we explore the suitability of Mondrian forests (MFs) for data-driven turbulence modelling. MFs are claimed to possess many of the advantages of the commonly used random forest (RF) machine learning algorithm, whilst offering principled uncertainty estimates. An example test case is constructed, with a turbulence anisotropy constant derived from high-fidelity turbulence-resolving simulations. A number of flows at several Reynolds numbers are used for training and testing. MF predictions are found to be superior to those obtained from a linear and a non-linear eddy viscosity model. Shapley values, borrowed from game theory, are used to interpret the MF predictions. Predictive uncertainty is found to be large in regions where the training data is not representative. Additionally, the MF predictive uncertainty is found to correlate more strongly with predictive errors than an a priori statistical distance measure does, which indicates it is a better measure of prediction confidence. The MF predictive uncertainty is also found to be better calibrated and less computationally costly than the uncertainty estimated by applying jackknifing to random forest predictions. Finally, Mondrian forests are used to predict the Reynolds discrepancies in a convergent-divergent channel, which are subsequently propagated through a modified CFD solver. The resulting flowfield predictions are in close agreement with the high-fidelity data.
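The two checks mentioned above, how strongly predictive uncertainty correlates with predictive error and how well calibrated it is, can be illustrated with a minimal sketch. The data here is synthetic and is not taken from the paper: the "model" is assumed to return a Gaussian predictive mean and standard deviation at each point, and the helper names `pearson` and `coverage` are hypothetical. This is one simple way to run such checks, not necessarily the metrics the authors used.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def coverage(y_true, y_mean, y_std, z=1.96):
    """Fraction of true values inside the interval mean +/- z * std."""
    inside = sum(abs(t - m) <= z * s for t, m, s in zip(y_true, y_mean, y_std))
    return inside / len(y_true)


# Synthetic stand-in for model output (assumption, NOT data from the paper):
# every point has a predictive std, and the true value is drawn from a
# Gaussian with exactly that std, i.e. an ideally calibrated model.
rng = random.Random(0)
y_std = [rng.uniform(0.05, 0.5) for _ in range(5000)]
y_mean = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
y_true = [rng.gauss(m, s) for m, s in zip(y_mean, y_std)]

# Check 1: does a larger predicted uncertainty go with a larger absolute error?
abs_err = [abs(t - m) for t, m in zip(y_true, y_mean)]
r = pearson(y_std, abs_err)

# Check 2: do nominal 95% intervals contain about 95% of the true values?
cov95 = coverage(y_true, y_mean, y_std)

print(f"uncertainty/error correlation: {r:.2f}")
print(f"95% interval coverage:         {cov95:.3f}")
```

A well-calibrated model should give a coverage close to the nominal 0.95; a coverage well below that suggests overconfident uncertainty estimates, and one well above suggests overly wide intervals.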
A procedure for sampling the Mondrian forests' uncertainties is introduced. Propagating these samples enables quantification of the uncertainty in quantities of interest, such as velocity or a drag coefficient, that arises from the uncertainty in the Mondrian forests' predictions. This work suggests that uncertainty quantification can be incorporated into existing data-driven turbulence modelling frameworks by replacing random forests with Mondrian forests. This would also open up the possibility of online learning, whereby new training data could be added without having to retrain the Mondrian forests.
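The idea of propagating sampled predictions to a quantity of interest can be sketched as a plain Monte Carlo loop. This is an illustration under stated assumptions, not the sampling procedure from the paper: each mesh point is assumed to have an independent Gaussian predictive distribution (spatial correlation is ignored), the numbers are made up, and `toy_solver` is a hypothetical stand-in for the modified CFD solver.

```python
import random
import statistics


def toy_solver(discrepancies):
    """Hypothetical stand-in for a CFD solve: maps a field of predicted
    discrepancies to one scalar quantity of interest (a drag-like value)."""
    return 1.0 + 0.5 * sum(discrepancies) / len(discrepancies)


def propagate(mean, std, n_samples=2000, seed=0):
    """Draw samples of the predicted field and push each through the solver.

    Returns the sample mean and sample standard deviation of the
    quantity of interest.
    """
    rng = random.Random(seed)
    qoi = []
    for _ in range(n_samples):
        # Assumption: independent Gaussian draw at every mesh point.
        sample = [rng.gauss(m, s) for m, s in zip(mean, std)]
        qoi.append(toy_solver(sample))
    return statistics.mean(qoi), statistics.stdev(qoi)


# Made-up predictive means and standard deviations at four mesh points.
mean = [0.10, 0.20, 0.15, 0.05]
std = [0.02, 0.10, 0.05, 0.01]

qoi_mean, qoi_std = propagate(mean, std)
print(f"quantity of interest: {qoi_mean:.4f} +/- {qoi_std:.4f}")
```

With a real solver each sample costs one full simulation, so the number of samples would in practice be far smaller than the 2000 used here.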




Updated: 2021-01-11