
Computers & Fluids

Volume 214, 15 January 2021, 104770

Multiresolution classification of turbulence features in image data through machine learning

https://doi.org/10.1016/j.compfluid.2020.104770

Highlights

  • Intermediate data products for large simulations allow for meaningful analysis.

  • An image-space descriptor for the detection and extraction of vortex-like features.

  • A complete training and classification system enables low-cost evaluation of data.

  • Vortex features are detected at multiple sizes and scales through machine learning.

Abstract

During large-scale simulations, intermediate data products such as image databases have become popular due to their low relative storage cost and suitability for fast in-situ analysis. Serving as a form of data reduction, these image databases have become increasingly accepted as a basis for data analysis. We present an image-space detection and classification system for extracting vortices at multiple scales through wavelet-based filtering. A custom image-space descriptor is used to encode a large variety of vortex types, and a machine learning system is trained for fast classification of vortex regions. By combining a radial-based histogram descriptor, a bag-of-visual-words feature descriptor, and a support vector machine, we show that we are able to detect and classify vortex features of various sizes and at multiple scales. Once trained, our framework enables the fast extraction of vortices from new, previously unseen image datasets for flow analysis.

Introduction

Today’s supercomputers allow scientists to simulate turbulent phenomena using extremely high-resolution grids, producing massive data sets that make it possible to gain new insights into complex turbulent behavior at multiple scales. Unfortunately, there exists a large disparity between compute (FLOPS) and I/O capabilities. This gap has made it infeasible to save the massive amounts of data generated to non-volatile storage in a reasonable amount of time. These limitations make feature extraction and analysis difficult when performed after a simulation, especially when the temporal resolution of the saved data is low relative to the simulation. It is common to produce simulation-state outputs only at regular intervals of a simulation, and these outputs incur a huge cost. These state files are often the only data points used for data analysis and feature extraction. They can be hundreds of gigabytes in size, requiring large amounts of resources to process after the simulation has completed. Recent advancements in analysis and visualization techniques have introduced cross-domain in-situ methods that run alongside large simulations, producing reduced-scale intermediate data products (images) that are more manageable for feature detection and extraction.

In the classical 3D turbulence picture, unsteady vortices of many sizes appear and interact with each other, generating smaller and smaller vortices, down to the scale where viscous effects dominate. The quantitative description of this process has been one of the central objectives of turbulence research, with many questions still open, for example, related to anomalous scaling exponents and intermittency. To assist with the understanding of vorticity dynamics, vortex extraction in turbulent flows has traditionally been a well-studied subject.

Some of the earliest definitions of vortices were given by Jeong et al. [1], where interactions between coherent structures play a role in the development of vortex dynamics. Subsequently, newer topology-based methods were used to define vortex behavior, as surveyed by Post et al. [2], who primarily categorize feature extraction methods into four types: direct, texture-based, geometric, and feature-based flow visualization.

Several methods have been developed and used to detect and extract vortex structures in 2D flow. The Okubo-Weiss method [3], [4] has been successfully applied to native 2D hydrodynamics and magnetohydrodynamics. The Q-criterion has been used in 3D flows but poses issues when used on 2D datasets: vortices that are not aligned with a camera’s plane-of-view cannot be visually characterized as vortices. While it would be ideal to run many of these detection methods on the original, raw dataset, the size of the output data often requires resources comparable to those on which the simulation was executed. More complex methods become prohibitive due to their overhead requirements and, without a priori knowledge of feature locations, may need to be applied at a global scale, e.g., computing derivatives and tensor products from stored velocity fields. Additionally, our integration with the Cinema framework produces many 2D image sets as lightweight data products rather than 3D data. For these reasons, we have opted to use the Okubo-Weiss method for the 2D training and validation of our classification system.
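For concreteness, the Okubo-Weiss parameter can be computed directly from a 2D velocity slice as W = s_n² + s_s² − ω² (normal strain, shear strain, vorticity), with rotation-dominated regions identified where W is sufficiently negative. The following NumPy sketch is illustrative only; the grid spacing, axis convention, and threshold are assumptions rather than the exact settings used in this work.

```python
import numpy as np

def okubo_weiss(u, v, dx=1.0, dy=1.0):
    """Okubo-Weiss parameter W = s_n^2 + s_s^2 - omega^2 for a 2D velocity field.

    u, v : 2D arrays of x- and y-velocity on a uniform grid (axis 0 = y, axis 1 = x).
    W < 0 marks rotation-dominated regions (vortex cores); W > 0 marks strain-dominated regions.
    """
    dudy, dudx = np.gradient(u, dy, dx)
    dvdy, dvdx = np.gradient(v, dy, dx)
    s_n = dudx - dvdy        # normal strain
    s_s = dvdx + dudy        # shear strain
    omega = dvdx - dudy      # vorticity
    return s_n**2 + s_s**2 - omega**2

# Illustrative usage with a heuristic threshold (the actual cutoff may differ):
# W = okubo_weiss(u_slice, v_slice)
# vortex_mask = W < -0.2 * W.std()
```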

Aside from the connection with turbulence in general, vortex classification might be particularly useful for understanding the large roll-ups and structure of the flow generated by the Rayleigh-Taylor, Richtmyer-Meshkov, and Kelvin-Helmholtz instabilities. Recently, Zhou has provided a comprehensive review of flow, turbulence, and mixing induced by these instabilities [5], [6], [7]. The method proposed here might help extend some of the analysis approaches surveyed in that review. In addition, experimental results are often only available as 2D images. In applications such as inertial confinement fusion (ICF), where the Rayleigh-Taylor/Richtmyer-Meshkov instabilities are important [5], [6], [7], vortex classification may offer a novel tool for flow analysis.

One of the most popular intermediate data products produced today is the Cinema database [8]. Cinema databases most commonly store image-space representations of the data from many different camera perspectives while a simulation is running, allowing instantaneous access to many views of a specific dataset for post-analysis. For exploratory research, it is sometimes difficult to know prior to the simulation what specific features to look for, especially when running many variations of parameters. Furthermore, for massive datasets it is difficult and expensive to save many full-size outputs at fine temporal resolution. The Cinema framework allows fine-temporal-resolution image outputs of large-scale simulations at a significantly reduced cost. Well-defined image-space features make Cinema an ideal choice, but detection methods must be developed. Recent work by Banesh et al. [9] successfully used an edge-based contour eddy tracking method on temporal 2D image data of turbulent ocean currents. Additional tracking and evaluation of these results were done by Gospodnetic et al. [10].
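For context, a Cinema database is typically a directory of images indexed by a CSV file whose columns record simulation and camera parameters. The sketch below shows one plausible way to load such a database; the `data.csv` name and `FILE` column follow the common Cinema Spec D layout, but the exact layout used in this work is an assumption here.

```python
import csv
import os
import imageio.v2 as imageio  # or any image reader

def load_cinema_images(db_path):
    """Load images listed in a Cinema (Spec D style) data.csv index.

    Assumes a 'FILE' column with image paths relative to the database root;
    all other columns are returned as per-image metadata.
    """
    images, metadata = [], []
    with open(os.path.join(db_path, "data.csv"), newline="") as f:
        for row in csv.DictReader(f):
            images.append(imageio.imread(os.path.join(db_path, row["FILE"])))
            metadata.append({k: v for k, v in row.items() if k != "FILE"})
    return images, metadata
```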

Vortices are typically described by their mathematical behavior, but they also have well-defined visual footprints, making them ideal features to detect and extract in image-space environments. Meaningful large-scale extraction of visual features in diverse fields has traditionally been tackled using machine learning algorithms. Rather than requiring discrete definitions of features, machine learning algorithms take multiple sets of inputs with weighted descriptors and separate them into several types, or classes. Work in the turbulence area is not extensive, but there are several representative works related to turbulence-like datasets.

Zhang et al. [11] used several methods to boost detection results in machine learning for vortex detection. By using local vortex detection algorithms, termed weak classifiers, and an expert-in-the-loop approach for labeling results, they showed reduced misclassification rates compared to the component classifiers. Similar to this approach, our framework can support an expert-in-the-loop approach for labeling, but we focus on the Okubo-Weiss classifier for this work. Kwon et al. [12] used a learning algorithm to evaluate and select the best image-space representations for large scientific data and were able to show significant improvements compared to manual selection. For the extraction and classification of vortex features in image space, the use of machine learning could significantly improve the efficiency of the extraction. For example, features that would be left unclassified by traditional methods could be learned and correctly identified using machine learning techniques.

In this paper, we develop a classification system that can automatically identify, describe, and extract features of primary interest in turbulence datasets in image space. The focus is on the extraction of vortices, due to their importance for turbulence research; vortices are also some of the most visually distinct features available. Once trained, our classification system works strictly on image datasets, without the use of the original scalar data components. Our contributions are the following:

  • an image-space descriptor that operates in linear space for the detection and extraction of vortex-like features (a simplified sketch of the radial-histogram idea follows this list);

  • a complete training and classification system that enables the low-cost evaluation of image-space datasets.
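As a rough illustration of the radial-histogram idea behind such a descriptor, one can bin patch intensities by distance from the patch center, producing a rotation-tolerant signature for swirl-like patterns. The sketch below is a simplified stand-in, not the descriptor defined in Section 3; the ring and bin counts are arbitrary.

```python
import numpy as np

def radial_histogram(patch, n_rings=8, n_bins=16):
    """Simplified radial descriptor: one intensity histogram per concentric ring.

    patch : 2D grayscale patch centered on a candidate feature, values in [0, 1].
    Returns a (n_rings * n_bins,) feature vector.
    """
    h, w = patch.shape
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot(yy - (h - 1) / 2.0, xx - (w - 1) / 2.0)
    r_max = r.max() + 1e-9
    feats = []
    for i in range(n_rings):
        ring = (r >= i * r_max / n_rings) & (r < (i + 1) * r_max / n_rings)
        hist, _ = np.histogram(patch[ring], bins=n_bins, range=(0.0, 1.0))
        feats.append(hist / max(hist.sum(), 1))  # normalize each ring's histogram
    return np.concatenate(feats)
```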

Data used in this paper is from the public Johns Hopkins Turbulence Database [13]. While the JHTDB hosts many turbulence datasets, the one used corresponds to a Direct Numerical Simulation (DNS) of homogeneous buoyancy-driven turbulence (HBDT) [14] on a 1024³ periodic grid. This simulation solves the incompressible Navier-Stokes equations for two miscible fluids with different densities, in a triply periodic domain [15], [16], [17], [18]. Both fluids are initialized as random blobs with a characteristic size of about 1/5 of the domain, consistent with the homogeneity assumption. Starting at rest, constant gravitational acceleration causes the fluids to move in opposite directions due to differential buoyancy forces. As turbulence fluctuations are generated and the turbulent kinetic energy increases, features of interest begin to emerge. At the same time, stirring by turbulence increases the rate of molecular mixing. After some time, molecular mixing becomes large enough that the buoyancy forces are overcome by dissipation and turbulence starts to decay. Due to the assistance of the buoyancy forces, this turbulence decay is different from classical decay. Visually, one interesting phenomenon is the number and size of vortices that form along the multi-fluid boundaries, which our system aims to classify.

The remainder of the paper is organized as follows. Section 2 briefly describes concepts related to the proposed method, which is presented in Section 3. Experimental results are described and discussed in Section 4. Section 5 presents the conclusions and directions for future exploration of the topic.

Section snippets

Background

In this section, we present some concepts related to the proposed method.

Method

Fig. 1 presents a brief summary of our pipeline. The offline stage comprises the selection of features, the creation of a dictionary to describe what is and what is not a vortex-like feature, and the final training of a classifier. The classifier is then used to predict the class of previously unseen features. Section 3.1 describes how image features were selected by means of the Okubo-Weiss measure in order to compose a dataset for training. Sections 3.2 and 3.3 discuss, respectively, the…
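As a rough outline of such an offline stage, the sketch below clusters local descriptors into a visual-word dictionary with k-means, encodes each sample as a normalized histogram of word occurrences, and trains a support vector machine for vortex/non-vortex prediction. The library (scikit-learn), cluster count, and kernel choice are assumptions for illustration, not the exact configuration used in this work.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def build_dictionary(descriptor_sets, n_words=64, seed=0):
    """Cluster all local descriptors into a bag-of-visual-words codebook."""
    all_desc = np.vstack(descriptor_sets)
    return KMeans(n_clusters=n_words, random_state=seed, n_init=10).fit(all_desc)

def encode(descriptors, codebook):
    """Normalized histogram of visual-word occurrences for one image region."""
    words = codebook.predict(descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)

# Offline training (labels: 1 = vortex-like region, 0 = non-vortex);
# descriptor_sets[i] is an (n_i, d) array of local descriptors for sample i.
# codebook = build_dictionary(descriptor_sets)
# X = np.array([encode(d, codebook) for d in descriptor_sets])
# clf = SVC(kernel="rbf").fit(X, labels)
#
# Online classification of an unseen region:
# prediction = clf.predict([encode(new_descriptors, codebook)])
```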

Experiments and results

In the following we present experiments and results achieved by our method. The experiments were performed using turbulence data from the Johns Hopkins Turbulence Database [13], which consists of the velocity components at 1015 time instances in a volume of 1024 × 1024 × 1024 voxels. The time instances are separated by a constant time step of 0.04 and cover the initial state, growth, and long-time decay of the turbulence. For training, we used 2D slices from 11 time instances (one slice per time instance)…

Conclusions

As current and future supercomputers produce ever larger amounts of data year-over-year, it has become necessary to use data reduction approaches to alleviate bandwidth limitations and facilitate the study of new simulations. The generation of Cinema image databases has been a recent popular approach capable of producing high-temporal-fidelity snapshots of simulations that would otherwise be skipped by conventional restart state writes. Performing data analysis on these massive restart…

CRediT authorship contribution statement

Jesus Pulido: Conceptualization, Methodology, Software. Ricardo Dutra da Silva: Methodology, Software, Writing - original draft. Daniel Livescu: Data curation, Writing - original draft. Bernd Hamann: Writing - review & editing, Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work has been co-authored by employees of Triad National Security, LLC which operates Los Alamos National Laboratory (LANL) under Contract no. 89233218CNA000001 with the U.S. Department of Energy/National Nuclear Security Administration. D.L. acknowledges funding from the Laboratory Directed Research and Development (LDRD) program at LANL under project 20190059DR.

References (35)

  • Y. Zhou, Rayleigh–Taylor and Richtmyer–Meshkov instability induced flow, turbulence, and mixing. I, Phys Rep (2017)

  • Y. Zhou, Rayleigh–Taylor and Richtmyer–Meshkov instability induced flow, turbulence, and mixing. II, Phys Rep (2017)

  • J. Jeong et al., On the identification of a vortex, J Fluid Mech (1995)

  • F.H. Post et al., The state of the art in flow visualisation: feature extraction and tracking, Comput Graph Forum (2003)

  • B.K. Shivamoggi, G.J.F. Heijst, L.P.J. Kamp, The Okubo-Weiss criteria in two-dimensional hydrodynamic and...

  • Y.L. Chang et al., Analysis of STCC eddies using the Okubo–Weiss parameter on model and satellite data, Ocean Dyn (2014)

  • Y. Zhou et al., Turbulent mixing and transition criteria of flows induced by hydrodynamic instabilities, Phys Plasmas (2019)

  • J. Ahrens et al., An image-based approach to extreme scale in situ visualization and analysis, SC ’14: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (2014)

  • D. Banesh et al., Extracting, visualizing and tracking mesoscale ocean eddies in two-dimensional image sequences using contours and moments, Workshop on Visualisation in Environmental Sciences (EnvirVis) (2017)

  • P. Gospodnetic et al., Ocean current segmentation at different depths and correlation with temperature in a MPAS-Ocean simulation, Workshop on Scientific Visualisation (SciVis), IEEE VIS (2018)

  • L. Zhang et al., Boosting techniques for physics-based vortex detection, Comput Graph Forum (2014)

  • O.H. Kwon, T. Crnovrsanin, K.L. Ma, What would a graph look like in this layout? A machine learning approach to large...

  • Y. Li et al., A public turbulence database cluster and applications to study Lagrangian evolution of velocity increments in turbulence, J Turbul (2008)

  • D. Livescu, C. Canada, K. Kanov, R. Burns, IDIES, J. Pulido, Homogeneous buoyancy driven turbulence data set, 2014, Los...

  • D. Livescu et al., Buoyancy-driven variable-density turbulence, J Fluid Mech (2007)

  • D. Livescu et al., Variable-density mixing in buoyancy-driven turbulence, J Fluid Mech (2008)

  • D. Livescu, Numerical simulations of two-fluid turbulent mixing at large density ratios and applications to the Rayleigh–Taylor instability, Phil Trans R Soc A (2013)