Skip to main content
Log in

Data-based analysis of multimodal partial cavity shedding dynamics

  • Research Article
  • Published:
Experiments in Fluids Aims and scope Submit manuscript

Abstract

Time-resolved X-ray densitometry void fraction measurements and accompanying acoustic emissions have revealed that partial cavity shedding on a hydrofoil can be multimodal, with spontaneous changes in shedding sequence (referred to here as cavitation style) for fixed inlet flow conditions. These spontaneous, intermittent transitions between two physically different shedding styles are examined using data-based image analysis of the cavity flow fields in order to extract the associated physical mechanism leading to style transition. Three data-driven decomposition techniques are compared: proper orthogonal decomposition (POD), dynamic mode decomposition (DMD), and cluster-based reduced-order modeling (CROM), with a primary focus on the latter. The results highlight the utility of CROM over DMD and POD in the context of intermittent event analysis, both in terms of mode interpretability and transition mechanism identification. A frequency-based analysis of the CROM output revealed the existence of a shared harmonic between the different physical cavitation modes which gave further insight to the transition process. The data-based analysis ultimately illuminates the underlying flow mechanism that leads to the style transition, namely the key role of maximum cavity length buildup as it interacts with the vapor cloud collapse downstream of the hydrofoil.

Graphic abstract

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11

Similar content being viewed by others

References

  • Arndt RE, Song C, Kjeldsen M, He J, Keller A (2001) Instability of partial cavitation: a numerical/experimental approach. In: 23rd Symposium on naval hydrodynamics, Val de Reuil

  • Barwey S, Hassanaly M, An Q, Raman V, Steinberg A (2019) Experimental data-based reduced-order model for analysis and prediction of flame transition in gas turbine combustors. Combust Theory Model 23(6):994–1020

    Article  MathSciNet  Google Scholar 

  • Berkooz G, Holmes P, Lumley JL (1993) The proper orthogonal decomposition in the analysis of turbulent flows. Annu Rev Fluid Mech 25(1):539–575

    Article  MathSciNet  Google Scholar 

  • Bhatt A, W J G H, Ceccio S (2018) Spatially resolved x-ray void fraction measurements of a cavitating naca0015 hydrofoil. In: 32nd Symposium on naval hydrodynamics, Hamburg

  • Callenaere M, Franc J-P, Michel J-M, Riondet M (2001) The cavitation instability induced by the development of a re-entrant jet. J Fluid Mech 444:223–256

    Article  Google Scholar 

  • Cao Y, Kaiser E, Borée J, Noack BR, Thomas L, Guilain S (2014) Cluster-based analysis of cycle-to-cycle variations: application to internal combustion engines. Exp Fluids 55(11):1837

    Article  Google Scholar 

  • Chaves H, Knapp M, Kubitzek A, Obermeier F, Schneider T (1995) Experimental study of cavitation in the nozzle hole of diesel injectors using transparent nozzles. SAE Trans 104(3):645–657

    Google Scholar 

  • Constantin P, Foiaş C, Temam R (1985) Attractors representing turbulent flows, vol 314. American Mathematical Society, Providence

    MATH  Google Scholar 

  • Dular M, Bachert B, Stoffel B, Širok B (2004) Relationship between cavitation structures and cavitation damage. Wear 257(11):1176–1184

    Article  Google Scholar 

  • Franc J-P, Michel J-M (2006) Fundamentals of cavitation, vol 76. Springer, Berlin

    MATH  Google Scholar 

  • Ganesh H (2015) Bubbly shock propagation as a cause of sheet to cloud transition of partial cavitation and stationary cavitation bubbles forming on a delta wing vortex, The University of Michigan

  • Ganesh H, Mäkiharju SA, Ceccio SL (2016) Bubbly shock propagation as a mechanism for sheet-to-cloud transition of partial cavities. J Fluid Mech 802:37–78

    Article  MathSciNet  Google Scholar 

  • Ganesh H, Wu J, Ceccio S (2016) Investigation of cavity shedding dynamics on a naca0015 hydrofoil using time resolved x-ray densitometry. In: 31st Symposium on naval hydrodynamics, ONR

  • Gopalan S, Katz J (2000) Flow structure and modeling issues in the closure region of attached cavitation. Phys Fluids 12(4):895–911

    Article  Google Scholar 

  • Hassanaly M, Raman V (2019a) Ensemble-les analysis of perturbation response of turbulent partially-premixed flames. Proc Combust Inst 37(2):2249–2257

    Article  Google Scholar 

  • Hassanaly M, Raman V (2019b) Lyapunov spectrum of forced homogeneous isotropic turbulent flows. Phys Rev Fluids 4(11):114608

    Article  Google Scholar 

  • Kaiser E, Noack BR, Cordier L, Spohn A, Segond M, Abel M, Daviller G, Östh J, Krajnović S, Niven RK (2014) Cluster-based reduced-order modelling of a mixing layer. J Fluid Mech 754:365–414

    Article  Google Scholar 

  • Kerschen G, Golinval J-C, Vakakis AF, Bergman LA (2005) The method of proper orthogonal decomposition for dynamical characterization and order reduction of mechanical systems: an overview. Nonlinear Dyn 41(1–3):147–169

    Article  MathSciNet  Google Scholar 

  • Kjeldsen M, Arndt RE, Effertz M (2000) Spectral characteristics of sheet/cloud cavitation. J Fluids Eng 122(3):481–487

    Article  Google Scholar 

  • Klus S, Nüske F, Koltai P, Wu H, Kevrekidis I, Schütte C, Noé F (2018) Data-driven model reduction and transfer operator approximation. J Nonlinear Sci 28(3):985–1010

    Article  MathSciNet  Google Scholar 

  • Koopman BO (1931) Hamiltonian systems and transformation in hilbert space. Proc Natl Acad Sci U S A 17(5):315

    Article  Google Scholar 

  • Laberteaux K, Ceccio S (2001) Partial cavity flows. part 1. cavities forming on models without spanwise variation. J Fluid Mech 431:1–41

    Article  Google Scholar 

  • Leroux J-B, Coutier-Delgosha O, Astolfi JA (2005) A joint experimental and numerical study of mechanisms associated to instability of partial cavitation on two-dimensional hydrofoil. Phys Fluids 17(5):052101

    Article  Google Scholar 

  • Mäkiharju SA, Gabillet C, Paik B-G, Chang NA, Perlin M, Ceccio SL (2013) Time-resolved two-dimensional x-ray densitometry of a two-phase flow downstream of a ventilated cavity. Exp Fluids 54(7):1561

    Article  Google Scholar 

  • Mezić I (2005) Spectral properties of dynamical systems, model reduction and decompositions. Nonlinear Dyn 41(1–3):309–325

    Article  MathSciNet  Google Scholar 

  • Mezić I (2013) Analysis of fluid flows via spectral properties of the koopman operator. Annu Rev Fluid Mech 45:357–378

    Article  MathSciNet  Google Scholar 

  • Noack BR, Stankiewicz W, Morzyński M, Schmid PJ (2016) Recursive dynamic mode decomposition of transient and post-transient wake flows. J Fluid Mech 809:843–872

    Article  MathSciNet  Google Scholar 

  • Rousseeuw PJ (1987) Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math 20:53–65

    Article  Google Scholar 

  • Rowley CW, Mezić I, Bagheri S, Schlatter P, Henningson DS (2009) Spectral analysis of nonlinear flows. J Fluid Mech 641:115–127

    Article  MathSciNet  Google Scholar 

  • Schmid PJ (2010) Dynamic mode decomposition of numerical and experimental data. J Fluid Mech 656:5–28

    Article  MathSciNet  Google Scholar 

  • Schmidt OT, Schmid PJ (2019) A conditional space-time pod formalism for intermittent and rare events: example of acoustic bursts in turbulent jets. J Fluid Mech. https://doi.org/10.1017/jfm.2019.200

    Article  MathSciNet  MATH  Google Scholar 

  • Sirovich L (1987) Turbulence and the dynamics of coherent structures. I. Coherent structures. Q Appl Math 45(3):561–571

    Article  MathSciNet  Google Scholar 

  • Taira K, Brunton SL, Dawson ST, Rowley CW, Colonius T, McKeon BJ, Schmidt OT, Gordeyev S, Theofilis V, Ukeiley LS (2017) Modal analysis of fluid flows: an overview. Aiaa J 55:4013–4041

    Article  Google Scholar 

  • Tissot G, Cordier L, Benard N, Noack BR (2014) Model reduction using dynamic mode decomposition. Comptes Rendus Mécanique 342(6–7):410–416

    Article  Google Scholar 

  • Towne A, Schmidt OT, Colonius T (2018) Spectral proper orthogonal decomposition and its relationship to dynamic mode decomposition and resolvent analysis. J Fluid Mech 847:821–867

    Article  MathSciNet  Google Scholar 

  • Tu JH, Rowley CW, Luchtenburg DM, Brunton SL, Kutz JN (2013) On dynamic mode decomposition: theory and applications. arXiv preprintarXiv:1312.0041

  • Wu J, Ganesh H, Ceccio S (2019) Multimodal partial cavity shedding on a two-dimensional hydrofoil and its relation to the presence of bubbly shocks. Exp Fluids 60(4):66

    Article  Google Scholar 

Download references

Funding

The funding was grant by Air Force Office of Scientific Research (Grant No. FA9550-16-1-0309).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Shivam Barwey.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1: Proper orthogonal decomposition

POD is described here in the context of the method of snapshots (Sirovich 1987). This procedure outputs an optimal set of N orthogonal spatial modes based on an eigendecomposition of a covariance matrix. If \(\langle {{\mathcal {X}}} \rangle \in {\mathbb {R}}^{N_p}\) denotes the time average of the snapshots, the snapshot fluctuation matrix is \({\tilde{{{\mathcal {X}}}}} = X - \langle {\mathcal {X}} \rangle\). The covariance matrix is then \(C = {\tilde{{\mathcal {X}}}}^{T} {\tilde{{{\mathcal {X}}}}} \in {\mathbb {R}}^{N \times N}\). The eigendecomposition of C is given by

$$\begin{aligned} C \mathbf{a}_p = \lambda _p \mathbf{a}_p, \quad p = 1, \ldots , N. \end{aligned}$$
(7)

The eigenvectors \(\mathbf{a}_p\) of C are the temporal coefficients of the pth spatial mode. To recover the spatial modes from these coefficients, the forward transform is utilized. This is given by

$$\begin{aligned} \mathbf{u}_p = \frac{1}{N}\sum _{i = 1}^{N} a_{p,i} \mathbf{x}_i, \quad p = 1, \ldots , N, \end{aligned}$$
(8)

where \(\mathbf{u}_p\) is the POD mode and \(a_{p,i}\) is the ith component of the vector \(\mathbf{a}_p\). The eigenvalue \(\lambda _p\) represents the energy contribution to the flow from mode \(\mathbf{u}_p\) —the mode corresponding to the highest eigenvalue of C thus captures the most amount of flow energy. In POD-based analysis, assuming that the modes are arranged by their eigenvalues in descending order, the first K modes are kept such that a sufficiently high amount of the flow energy is resolved. The energy captured by the first K modes is easily accessed by the sum of the first K eigenvalues of C. These high-energy modes can then be used in physical analysis, filtering, or reduced-order descriptions. If these K modes are contained in the first K columns of the matrix \(U \in {\mathbb {R}}^{N_p \times K}\), it can be shown that the POD formulation solves the following minimization problem:

$$\begin{aligned} \min _U ||{\tilde{{\mathcal {X}}}} - U U^T {\tilde{{\mathcal {X}}}} ||_F, \quad \text {s.t.} \quad U^T U = I, \end{aligned}$$
(9)

where F is the Frobenius norm and I is the size N identity matrix. In other words, POD provides an orthogonal set of modes in U that optimally reconstructs the snapshot set \({\tilde{{\mathcal {X}}}}\). The reconstruction error vanishes when \(K = N\).

Appendix 2: Dynamic mode decomposition

Consider an arbitrarily defined observable function (which can be scalar or vector-valued) of the state, \(g(\mathbf{x}_s)\). The Koopman operator \({\mathcal {U}}\) of the dynamical system is the linear propagator of any such observable,

$$\begin{aligned} {\mathcal {U}}g(\mathbf{x}_s) = g(\varvec{{\mathcal {F}}}(\mathbf{x}_s)). \end{aligned}$$
(10)

If an observable function \(\phi _j\) satisfies the time evolution given by

$$\begin{aligned} {\mathcal {U}} \phi _j(\mathbf{x}_s) = \lambda _j \phi _j(\mathbf{x}_s), \end{aligned}$$
(11)

then the function \(\phi _j\) is called a Koopman eigenfunction. A primary assumption in DMD is that the Koopman operator can be described purely by a discrete spectrum. That is, any observable is contained within the span of the eigenfunctions of the Koopman operator and can be expressed by the expansion

$$\begin{aligned} g(\mathbf{x}_s) = \sum _{j=1}^{\infty } \phi _j(\mathbf{x}_s) \mathbf{v}_j, \end{aligned}$$
(12)

where \(\mathbf{v}_j \in {\mathbb {R}}^{N_p}\) is called the Koopman mode. If such an expansion is valid, then the evolution of the observable can be expressed directly in terms of the Koopman modes and eigenvalues as

$$\begin{aligned} {{\mathcal {U}}}^{T} g(\mathbf{x}_s) = g(\mathbf{x}_{s+T}) = \sum _{j=1}^{\infty } \lambda _j^T \mathbf{v}_j, \end{aligned}$$
(13)

where the initial condition information \(\phi _j(\mathbf{x}_s)\) has been absorbed into the Koopman mode. The discrete spectrum assumption is valid if the dynamics described by \(\varvec{{\mathcal {F}}}\) is a truly periodic or linear system. However, to extend Koopman analysis to nonlinear systems that are quasi-periodic or chaotic, there is also a continuous portion of the \({\mathcal {U}}\) spectrum that is orthogonal to the space spanned by the eigenfunctions that should be accounted for in the expansion of \(g(\mathbf{x}_s)\). For more on this topic, see Mezić (2005, 2013). The standard DMD algorithms produce only the discrete component.

Using the identity observable function \(g(\mathbf{x}_s) = \mathbf{x}_s\), one can connect the Koopman expansion above to the dynamics of the system \(\varvec{{\mathcal {F}}}\). It can be shown that if \(\varvec{{\mathcal {F}}}\) is linear, and the dynamics is described by a linear propagator \(A: {\mathbb {R}}^{N_p} \mapsto {\mathbb {R}}^{N_p}\) where \(\mathbf{x}_{s+1} = A \mathbf{x}_s\), then the operator A is the Koopman operator and the eigenvectors of A are the Koopman modes \(\mathbf{v}j\) (Mezić 2013; Rowley et al. 2009).

The underlying concept of DMD comes from this connection between the Koopman operator and the linear operator A. Since the actual system \(\varvec{{\mathcal {F}}}\) is in fact nonlinear, no true A exists for the system. In standard DMD algorithms, which deal with the identity observable function, a linear operator \(A_{DMD}\) is instead approximated from the nonlinear data using regression techniques. Once such an \(A_{DMD}\) is recovered, its eigenvalues and eigenvectors provide the DMD temporal coefficients and spatial modes, respectively, which approximate the Koopman eigenvalues and modes from Eq. 13.

If the data in \({\mathcal {X}}\) are arranged in snapshot pairs such that \({{\mathcal {X}}}_1 = [\mathbf{x}_1, \ldots , \mathbf{x}_{N-1}] \in {\mathbb {R}}^{N_p \times (N-1)}\) and \({{\mathcal {X}}}_2 = [\mathbf{x}_2, \ldots , \mathbf{x}_{N}] \in {\mathbb {R}}^{N_p \times (N-1)}\) (i.e., \({{\mathcal {X}}}_2 = \varvec{{\mathcal {F}}}(\mathcal{X}_1\))), then \(A_{DMD}\) can be recovered by finding a linear map between \({{\mathcal {X}}}_1\) and \({{\mathcal {X}}}_2\) in the least squares sense. This is accomplished with the pseudo-inverse of \({{\mathcal {X}}}_1\): \(A_{DMD} = {{\mathcal {X}}}_1^{+} {{\mathcal {X}}}_2\), where \(+\) denotes the pseudo-inverse. The DMD expansion is expressed as

$$\begin{aligned} \mathbf{x}_i = \sum _{p=1}^{N-1} d_{p,i} \mathbf{v}_p, \end{aligned}$$
(14)

where the temporal coefficient \(d_{p,i}\) is the coefficient associated with DMD mode \(\mathbf{v}_p\) for snapshot i. As implied by Eq. 13, the eigenvectors of \(A_{DMD}\) yield the \(\mathbf{v}_p\), and the eigenvalues of provide the constant-frequency temporal coefficients \(d_{p,i}\). Since performing an eigendecomposition on \(A_{DMD}\) can be very inefficient in the case where \(N_p \gg N\), several methods exist to compute the eigendecomposition by instead solving an \((N-1) \times (N-1)\) eigenvalue problem. One such method known as “exact DMD” is used here (Tu et al. 2013).

For physical interpretation, it is not realistic to analyze all DMD modes. However, mode selection in DMD is not as clear-cut as with POD—common metrics are to use either (1) growth rate (keeping the modes which decay the least over successive operations of \(A_{DMD}\)), or (2) mode amplitude (selecting the modes with the highest \(L_2\) norm). Exclusively, using mode amplitude as the criteria for mode selection might result in choosing a mode active for a very short amount of time. Similarly, selecting the mode based on growth factor alone may lead to a mode that is slowly decaying but of minimal amplitude. To address such issues, the method used here combines both mode amplitude and temporal coefficient decay by attributing the energy of mode p as the average energy of \({\mathcal {X}}\) captured by mode p (Tissot et al. 2014). This amounts to the time-averaged energy of the inverse transform due to mode p alone, which is an ensemble average over the snapshots

$$\begin{aligned} E_{j} = \frac{1}{N} \sum _{i=1}^{N} || d_{j,i} \mathbf{v}_j ||_2^2. \end{aligned}$$
(15)

The plot of DMD mode energy versus mode frequency is one way to view the DMD spectrum, and its relative peaks can be used to isolate DMD modes of interest.

Appendix 3: Comparison of the CROM analysis with POD and DMD

The POD/DMD modes are obtained using the methods described in Appendices 1 and 2. Using the CROM labeling procedure described in Sect. 4, the modes for POD and DMD will also be labeled. The first six POD modes and their coefficients are shown in the first two rows of Fig. 12. Their respective energies are indicated by \(\varepsilon\) in the figure. There is a considerable drop-off in energy after the third mode. One labeling pathway is available when considering the POD temporal coefficients among the high-energy modes, i.e., one can begin to deduce which modes correspond to which cavitation style by assessing temporal coefficient activations. For example, a visual assessment of the POD mode 0 coefficient informs that it is likely associated with Style 1 cavitation, as the coefficient amplitude appears to drop as it enters the Style 2 regime. Similarly, for POD mode 2, its coefficient amplitude appears to activate as it enters the Style 2 regime and deactivate as it enters Style 1; POD mode 2 may therefore be more indicative of the Style 2 dynamics. However, to establish the POD mode labels in a more complete manner, frequency assignments are made in a similar fashion as in the CROM procedure using the PSD curves of the associated model coefficients. The frequency peaks are shown in the third row in Fig. 12. The same four peak locations as identified in Fig. 8 appear throughout the various spectra of the POD PSD curves for the first few modes.

A labeling of the first three (high-energy) POD modes using a combination of their PSD peaks and coefficient time series can now be made. Since mode 0 (highest energy) activates considerably at peak 1 with a secondary activation at peak 0, it is assigned to Style 1—this confirms the above-described implications from the corresponding coefficient time series. POD mode 1 activates primarily at peak 2 with a secondary activation at peak 1. Based on the information provided from the CROM procedure in Fig. 9, this POD mode may be attributed to either Style 2 cavitation or a shared state. (The POD mode 1 spectrum is similar to centroids 4 and 5, which are shared, and also to centroid 7, which is Style 2.) The delineation is provided in its temporal coefficient: Since the POD mode 1 coefficient does not appear to deactivate during the entry into either style regime, it is assigned the shared label. Lastly, POD mode 2 activates primarily at peak 2, with a secondary activation at peak 3. Mode 2 is therefore assigned to the Style 2 label, since a significant spike at peak 3 indicates unique Style 2 dynamics as found from the analysis of Fig. 9. As mentioned above, this is evidenced directly by the temporal coefficient magnitude increase in the Style 2 regime. In summary, the concrete assignments are then POD mode 0 \(\rightarrow\) Style 1, POD mode 1 \(\rightarrow\) Shared, and POD mode 2 \(\rightarrow\) Style 2.

The next three POD modes (3, 4, and 5) are less straightforward to label based on their time series and PSD curves alike, as they are an order of magnitude lower in energy. Interestingly, these low-energy modes all observe high PSD concentration at a very low-frequency peak (lower than peak 0). Disregarding this lower peak, POD mode 3 activates at both peaks 1 and 2, while mode 4 activates significantly at peak 3. These modes could be indicative of the lower-energy fluctuations associated with Styles 1 and 2, respectively. The spectrum of POD mode 5 introduces a new peak near St=0.47, but similar to mode 4, observes a highest contribution at peak 3. This may imply the presence of a different dominant harmonic in the Style 2 regime for the low-energy dynamics. Despite this, interpreting the physical significance of the spatial structures for these low-energy modes is difficult. The complexity of these low-energy modes could be problematic with regard to identifying transition-related characteristics, as the physics associated with the transition itself is stored in the low-energy POD modes when considering the whole dataset.

Fig. 12
figure 12

(Top row) First 6 POD modes with energy \(\varepsilon\) indicated in the upper left. Blue is negative, and red is positive. (Middle row) Temporal coefficients with blue vertical lines indicating transition points. (Bottom row) Normalized PSD curves for the temporal coefficients

Similar to a temporal discrete Fourier transform, in DMD each mode is assigned to a unique frequency. It is this one-to-one mapping between DMD mode and frequency that renders the DMD labeling process much less straightforward than CROM and POD counterparts. Based on a priori knowledge of the cavitation dynamics, Style 1 occurs predominantly near \(\hbox {St}=0.15\) and Style 2 near \(\hbox {St} = 0.30\). However, based on the centroid and POD PSD results, it is clearly seen that even in light of these dominant frequencies, the styles are inherently multi-frequency in nature. See, for example, centroid 0 from Fig. 9. This is a Style 1 centroid, which indeed observes a dominant peak at the expected frequency of \(\hbox {St}=0.15\), yet there is another smaller activation at peak 2. The same is true for the Style 2 centroids and corresponding POD modes in Fig. 12. Additionally, the existence of a shared state between the two cavitation styles is problematic with regard to DMD mode labeling, since it occupies two unique peaks in the frequency spectrum (peaks 1 and 2). Thus, a single DMD mode cannot represent the shared state, and one-to-one labels for DMD modes and cavitation styles may be misleading.

With this in mind, the DMD results are shown in Fig. 13. Three relative peaks are seen at St = 0.15, 0.29, and 0.60 (labeled DMD modes 0, 1, and 2, respectively) which line up very closely with peaks 1, 2, and 3 in the CROM PSD curve in Fig. 8 and the POD PSD curves in Fig. 12. Despite the fact that a single frequency cannot capture the cavitation style dynamics in their entirety, the high-energy DMD modes can instead be regarded as primary building blocks for the shedding cycles in each style. For example, DMD mode 0, which is the high-energy mode defined by St=0.15, is regarded as a primary contributor to Style 1 dynamics. Similarly, DMD mode 1 is a primary building block for the Style 2 dynamics, but may also represent a portion of the Style 1 dynamics since the shared harmonic also activates primarily at this frequency. DMD mode 2 is a unique contributor to the Style 2 cycle.

Since the input dataset is real-valued, the DMD modes are complex. The temporal coefficients are also complex—they are constant-frequency sinusoids whose amplitude may either grow or decay with time. Thus, the modes are best visualized by performing an inverse transform and instead visualizing this transform (i.e., mode A is visualized by reconstructing the original data using only mode A). This allows one to visualize the effect of the DMD mode in the context of its full periodic cycle. As such, periodic cycles for the DMD modes found from Fig. 13 are shown in eight-step increments in Fig. 14. Because the magnitude of the DMD modes in relation to the magnitude of the mean flow is quite small, the cycle effects are difficult to interpret physically (this is a shortcoming of DMD in this application). DMD mode 0, however, appears to show the full cavity length fill-up associated with Style 1 dynamics. Modes 1 and 2 may indicate the effects of trailing edge vapor cloud shedding, particularly in Step 5. Since the temporal coefficients (shown in Fig. 13) are powers of complex exponentials, they do not “activate” in time as do the POD temporal coefficients. This is a disadvantage in applications which observe intermittent transitions. A solution to this has been presented in the form of recursive DMD; though this formulation provides a POD-like representation of the coefficients that better capture intermittent transience, the modes are still inherently constant frequency in nature (Noack et al. 2016).

Fig. 13
figure 13

DMD spectrum based on energy defined in Sect. 6. Temporal coefficients for identified modes shown on right

Fig. 14
figure 14

DMD modes from Fig. 13 as periodic cycles over eight timesteps. Mode cycles include the mean flow, which is shown at the bottom-most row

Appendix 4: Choosing \(N_k\)

With the goal of satisfying the conditions posed in the end of Sect. 3, one pathway to obtain a value for \(N_k\) is to assess the cluster time series plots. These are the temporal coefficients in the CROM framework, i.e., they show the evolution of the trajectory in time by associating each snapshot with its closest centroid. This can be thought of as a symbolic representation of the dynamics and can be used to find an effective lower bound on \(N_k\) such that the clustering sufficiently captures both cavitation styles. Cluster time series plots are shown in Fig. 15 for two \(N_k\) values for illustrative purposes. Recall from Sect. 2 that the flow transitions from Style 1 to Style 2 at roughly t = 0.35 s, and back again to Style 1 at t = 0.65 s. These transition regions are highlighted in blue in Fig. 15.

The unsupervised clustering algorithm can obtain these distinctions between macroscopic states in the cavitating flow. Even in the case of extremely low cluster numbers (\(N_k = 3\)), the cluster time series indicates changes in characteristic frequency around \({t} = 0.35\,\hbox {s}\) and \({t} = 0.65\,\hbox {s}\), delineating the cavitation style transitions. The increase in \(N_k\) shows the potential of the clustering algorithm to isolate each cavitation style beyond characteristic frequencies. For example, increasing \(N_k\) from 3 to 12 shows how both cavitation styles occupy distinct regions of the system phase space. Though the isolation of each style becomes more pronounced with even higher \(N_k\) values, for this study, \(N_k = 12\) is used. This is due to the fact that only 787 snapshots are available in the dataset. Since the centroids are regional averages, their statistical uncertainties are related directly to the number of snapshots contained in the corresponding clusters, or the visitation time of the trajectory in the clusters (Barwey et al. 2019). Therefore, increasing cluster number to higher values decreases the model statistical certainty at the expense of a refined representation of the phase space. To ensure the visitation time in each cluster is long enough, the minimum number of snapshots in each cluster was set to 30. This produces an acceptable 12-cluster model which compromises phase space refinement with high-enough cluster visitation time, producing a good solution to the conditions discussed in Sect. 3. The snapshot-cluster visitations are visualized in the cluster partitions, termed Voronoi cells, in the bottom row of Fig. 15. These Voronoi cells are constructed in a two-dimensional POD space for ease of visualization. The red markers indicate cluster centroids, and the gray markers indicate the snapshots.

Fig. 15
figure 15

(Top) Cluster time series plots for \(N_k = 3\) and 12. Approximate points of macroscopic state transition are indicated in by blue lines. (Bottom) Voronoi partitions with respect to the first two POD coefficients. Centroids shown in red, snapshots in gray

A valid concern is the lack of a purely quantitative index for the determination of number of clusters. It should be noted that in any non-hierarchical clustering method such as K-means, the selection of the number of clusters \(N_k\) is always an issue and is an area of active research. However, for the sake of demonstration, the outputs of two common techniques for driving cluster number selection will be shown: (1) the K-means objective function, known as the within-cluster sum of squares (WCSS), and (2) the Silhouette method. First, the WCSS and Silhouette method will be defined. Then, the output of each as a function of \(N_k\) will be shown.


Within-cluster sum of squares: The WCSS is the objective function of K-means and is defined as

$$\begin{aligned} WCSS = \sum _{i=1}^{N_k} \sum _{\mathbf{x} \in {\mathbb {K}}_i} \left\Vert \mathbf{x} - C_i\right\Vert _2^2, \end{aligned}$$
(16)

where \(\mathbf{x}\) is a snapshot belonging to cluster \({\mathbb {K}}_i\), and \(C_i\) is the corresponding ith centroid. The WCSS is essentially a measure of the total variance contained within each cluster, aggregated over all clusters. For a given \(N_k\) and a set of initial centroid locations, the K-means algorithm guarantees a minimization of the WCSS to within a small threshold. For cluster number determination, WCSS can be plotted as a function of the number of clusters \(N_k\). Though the resulting curve is guaranteed to be monotonically decreasing (WCSS converges to zero as \(N_k\) tends to the total number of snapshots), a cluster number which showcases a significant drop-off in WCSS is sought. This cluster number, if it exists, is deemed “optimal” in the sense that it reduces the WCSS by a significant amount relative to the maximum. This is similar to plotting mode energies in POD, where the mode at which a significant energy drop-off occurs is a candidate for the truncation point.


Silhouette score: The silhouette score is a measure for each snapshot in the dataset, contained with the range \([-1,1]\), that summarizes how well the assigned centroid represents its flow structure. High silhouette scores indicate a better centroid representation (or a “good” cluster assignment for that snapshot), so the goal is to find a peak in the curve of average silhouette score over all snapshots N versus cluster number \(N_k\). For the ith snapshot (\(i=1,\ldots ,N\)), the silhouette score is given by

$$\begin{aligned} s(i) = \frac{b(i) - a(i)}{\max {(a(i), b(i))}}. \end{aligned}$$
(17)

In Eq. 17, the quantity a(i) is the average distance between the ith snapshot and all other snapshots in the same cluster—it can be considered as measure for within-cluster variation. The quantity b(i) is the average distance between the ith snapshot and all other snapshots in the neighboring cluster, where the neighboring cluster is defined as the cluster that snapshot i would have been assigned to in the case its existing cluster were to be removed. As explained in Rousseeuw (1987), if s(i) is above zero, the current cluster represents the snapshot better than the neighboring cluster. The opposite is true if s(i) is below zero. Thus, the quantity s(i) is a measure of how well the ith snapshot is represented by the clustering. If the s(i) scores are averaged over all N snapshots, one obtains a scalar metric for the clustering quality. The goal is to choose the \(N_k\) that maximizes this average silhouette score. Complete details can be found in the original work of Rousseeuw (1987).

Figure 16a and b (black curves) show the Silhouette measure and the WCSS values, respectively, as a function of \(N_k\). The colored curves show the effect of POD truncation on the same curves.

Fig. 16
figure 16

a Silhouette score versus \(N_k\), colors correspond to different amounts of retained energy due to POD preprocessing. b Normalized K-means objective (WCSS) versus \(N_k\), colors correspond to different amounts of retained energy due to POD preprocessing. c Cavitation style probabilities for each centroid using \(N_k=10\). Probability threshold indicated by dotted line as in original manuscript. Uncertain clusters circled in red on x-axis. d Same as c, but for \(N_k=12\). e Same as c, but for \(N_k=14\)

The silhouette score (Fig. 16a) is maximized at the smallest tested \(N_k\) of 2. There is no noticeable increase in silhouette score afterward, though there is a smooth decay as cluster number is increased. In other words, the silhouette score guides the user toward very low cluster numbers. It is clear from the cluster time series of Fig. 15 in the original manuscript that \(N_k = 2\) or 3 is not ideal, as it would create too coarse a description of the dynamics and no style separation would be provided. We therefore conclude that the silhouette metric, a popular metric used in data science applications, is not useful here. The WCSS (Fig. 16a) decays with increasing \(N_k\) as expected. However, there is no noticeable drop (or “elbow”) in the curve—the objective function decreases smoothly with increasing \(N_k\). This renders the WCSS useless for guiding cluster number selection in this application.

Figure 16a and b also shows the effect of POD truncation on the silhouette and WCSS measures. The silhouette curve shifts up with decreasing energy retained (corresponding to increasing smoothing or fewer POD modes), and the WCSS score decreases with decreasing energy retained. This implies that the within-cluster variation is reduced with larger amounts of POD truncation, which is expected in the event of reduced spatial fluctuations in the data. Despite this, the curves still do not provide useful information—the silhouette curves still point to lower cluster numbers and the WCSS does not display a clear drop at a particular \(N_k\). Since the POD truncation does not give clear indication through these scalar metrics, it can be concluded to good confidence that reducing dimensionality via POD does not move the dataset toward a clustering-friendly manifold. Additionally, cautionary notes related to POD truncation as a preprocessing tool are provided in Sect. 3.

Ultimately, both silhouette and WCCS approaches were inconclusive and produced disappointing results—they favor very low cluster numbers, which would create too coarse a dynamical description for this application. This is because such scalar metrics are designed for low-dimensional datasets that exhibit true clustering patterns, where the quality of the clustering output can be determined based on label accuracy computed using class assignments known a priori. However, in the application considered here, there is no notion of cluster accuracy because there is no reason to believe a “true” clustering exists. Instead, the focus is on interpretability of the clustering output through the centroids, a facet rarely considered in data science applications. For example, in the context of flow decomposition, the clustering algorithm is not used to “discover” true clusters in a sense. It is instead used to partition the phase space in which the observed system resides, facilitating an interpretable decomposition of the dynamics through the centroids and transition paths. Because of these reasons, the driver for cluster number selection should not be a quantity like the silhouette metric; this is reflected directly in the results in Fig. 16a and b. Instead, the cluster number should be driven by optimizing centroid interpretability in the context of macroscopic flow transitions (i.e., through the inspection of the cluster time series).

In this regard, to better illustrate why the chosen \(N_k=12\) was acceptable, style probabilities from the clustering output (akin to Fig. 6, see Sect. 4 for details) are compared for \(N_k = 10\) (Fig1. 16c), 12 (Fig. 16d), and 14 (Fig. 16e). The output in the \(N_k=10\) case results in more clusters that are uncertain with regard to style representation (circled red on the x-axis). This is an artifact of a coarse dynamical description. On the other hand, the output in the \(N_k=14\) case gives a similar result to the \(N_k=12\) case in that there are two shared centroids. The results for \(N_k=14\) could in fact have been used; however, it simply adds an additional centroid for both styles that provides nothing particularly useful to the analysis. Further, the \(N_k=14\) case results in a minimum cluster population close to 20 instead of 30, which in turn affects the statistical significance of the centroids. From these results, it is apparent that \(N_k=12\) is a good candidate.

Appendix 5: Regarding the number of observed transitions

Since there are only two style transitions observed in the data, one may question the validity of the analysis conducted in Sect. 5 as a whole. More transitions are of course desired, but due to experimental limitations with the void fraction densitometry setup, only a small time window for recording void fraction field measurements was accessible. Further, it was necessary for the transitions to come from the same stream of data at fixed operating conditions (i.e., ergodic, spontaneous transitions). With these constraints, the dataset that most clearly exhibited both Style \(1\,\rightarrow \,2\) and Style \(2\,\rightarrow \,1\) transitions in the short-time window available was chosen. A higher number of transitions for a single recording period was not attainable.

To demonstrate the validity of the method, the analysis was applied to a different dataset (lower cavitation number) which also contains style transitions. This new void fraction dataset was collected at the same sampling frequency, domain region, and time duration as the original. Furthermore, it contains two transitions (Style \(1\, \rightarrow \,2\) and \(2\,\rightarrow \,1\)), similar to the original dataset.

The leftmost plot of Fig. 17 shows the corresponding transition matrix for the new dataset, with style labels outlined in a similar manner as Fig. 10 for the original dataset (for brevity, the centroid labeling step for the new dataset is not shown). The most probable pathways for each style transition (indicated by the respective arrows on the transition matrix figure) are shown in the middle and right of Fig. 17. For both Style \(1\,\rightarrow \,2\) and Style \(2\,\rightarrow \,1\) transitions, the most probable transition pathways from the new dataset are juxtaposed with the same quantities from the original dataset (extracted from Fig. 11). Although slight differences in these key centroids exist (likely due to differences in operating conditions), the transitioning centroids across datasets share similar structure, and as a result, point to similar physical phenomena.

For example, the Style \(1\,\rightarrow \,2\) transition (middle of Fig. 17) using the new dataset highlights to the same pinch-off feature (centroid 10 in new dataset) that evolves into cavity growth with a shed cloud near the trailing edge, a characteristic of the Style 2 regime. The Style \(2\,\rightarrow \,1\) transition (right of Fig. 17) is characterized in by slightly increased cavity length with no shed clouds near the trailing edge, which almost exactly resembles the trends seen in the original data. Ultimately, this observed mechanism consistency produced by the CROM output across different datasets is a form of validation for the analysis conducted in the paper, in light of the fact that only a few transitions are contained within the endpoints of the data.

Fig. 17
figure 17

(Left) Transition matrix using the new dataset (colorbar axis is same as Fig. 13). Blue and green enclosed regions indicate Style 1 and 2 centroids, respectively. Arrows indicate most probable transition paths. (Middle) Comparison of new output to original output for most probable Style 1 to Style 2 transitions. (Right) (Middle) Comparison of new output to original output for most probable Style 2 to Style 1 transitions

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Barwey, S., Ganesh, H., Hassanaly, M. et al. Data-based analysis of multimodal partial cavity shedding dynamics. Exp Fluids 61, 98 (2020). https://doi.org/10.1007/s00348-020-2940-x

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s00348-020-2940-x

Navigation