Elsevier

Ocean Modelling

Volume 163, July 2021, 101819

On the selection of time-varying scenarios of wind and ocean waves: Methodologies and applications in the North Tyrrhenian Sea

https://doi.org/10.1016/j.ocemod.2021.101819

Highlights

  • A methodology for the selection of metocean scenarios is reviewed and discussed.

  • K-means is applied to 40 years of hindcast data off the Ligurian coastline.

  • The assessment of the optimal number of scenarios is crucial though not trivial.

  • The proposed methodology makes it possible to capture the relevant modes of the local wave climate.

Abstract

This paper analyses a methodology to identify sub-series of wave parameters and wind speed able to explain the overall variability of the input dataset. To this end, the K-means clustering technique is applied to a 40-year-long time series of hindcast data off the Genoa coastline (NW Italy). K-means aims to group the data into a reduced number of clusters, represented by as many modes of variability (or “model scenarios”). This work reviews and discusses a methodology to select time-varying model scenarios and assesses the performance of K-means according to two indexes and an increasing number of clusters. These indexes are used to compute the number of clusters best suited for the application at hand, testing different conditions as concerns the variables involved in the analysis and their temporal resolution. Results show that the indexes may not be consistent with each other, and that the number of scenarios to be reasonably employed strongly depends on how the data are initially assembled. Finally, some of the model scenarios selected off Genoa are analysed and discussed in the framework of the local wind-wave climatology.

Introduction

Several environmental processes taking place in the nearshore zone are of foremost interest from both scientific and technical points of view, such as storm surges, coastal circulation, beach evolution and pollutant dispersion, among many others (a few recent examples can be found e.g. in Wu et al., 2018, Briganti et al., 2018, Enrile et al., 2019, de Ruggiero et al., 2020). A detailed characterization of the local climatology of the sea is therefore crucial to plan human activities in coastal areas (Zheng et al., 2020, Losada et al., 2019, Venancio et al., 2020, Khelil et al., 2019) and to handle hazards and emergency situations (Cheung et al., 2003, Samaras et al., 2016, Kerguillec et al., 2019, Mahendra et al., 2011, Silva et al., 2017).

In this respect, the use of numerical models for the simulation and forecasting of marine and coastal hydrodynamics has dramatically increased over the last decades, especially thanks to the exponential growth in the power of modern supercomputers. A wide range of numerical models has been developed for the analysis, simulation and resolution of geophysical fluid dynamics problems, making it possible to describe efficiently the processes driven by the upper-sea physics, for example beach morphodynamics (Larson et al., 1987, Reeve, 2006, Castelle et al., 2015, Vousdoukas et al., 2012, Coco et al., 2014, Roelvink et al., 2009), littoral currents (Watanabe, 1982, Kaliraj et al., 2014, Lee et al., 2014), and ocean wave growth and propagation (Tolman et al., 2016, Booij et al., 1997, López et al., 2015, Browne et al., 2007, Wornom et al., 2001).

The characterization of these phenomena is generally based on huge amounts of information, the processing of which requires high computational power and long computing times, especially if the data at hand come from a climate reanalysis service characterized by fine resolutions in time and space. Nevertheless, top computational performance is not always available, so it may be advisable to reduce the number of environmental conditions retained for further numerical simulations and analysis. The detection of the most significant modes of variability of a time series can help to pursue this goal. Indeed, techniques for the reduction of data dimensionality, and in particular clustering algorithms (Hastie et al., 2009, Wilks, 2011, Anderberg, 2014), make it possible to focus on a limited number of subsets that capture most of the variability of a time series, reducing in turn the computational load required to run a modelling chain. Cluster analysis partitions input data into modes or clusters based on the selected distance measure (e.g. Euclidean, Minkowski, etc.), each cluster representing a group of elements that are similar to each other and dissimilar from the elements of other clusters. More specifically, this methodology aims to simultaneously minimize the distance between members of a given cluster and maximize the distance between the centres, or centroids, of the clusters (Michelangeli et al., 1995, Jain et al., 1999, Coggins et al., 2014, Fučkar et al., 2016). The result of a clustering analysis is a subset of elements that summarizes the initial dataset while maintaining its main properties. Note that in the previous literature the cluster centroids have also been referred to as “model scenarios”; hereinafter, they will be referred to simply as “scenarios” for the sake of brevity. Scenarios can highlight the main characteristics of the dataset under investigation, which may be difficult to appreciate by looking only at its overall distribution, especially in the case of huge multivariate datasets.
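
As a compact restatement of the distance measures and of the within-cluster goal mentioned above, one common way to write them is given below. The symbols (x, y, c_k, C_k, p, K, n) are notation assumed here for illustration and are not taken from the paper.

```latex
% Minkowski distance of order p between two elements x and y with n components
d_p(\mathbf{x}, \mathbf{y}) = \left( \sum_{j=1}^{n} \lvert x_j - y_j \rvert^{p} \right)^{1/p},
\qquad p = 2 \ \text{gives the Euclidean distance}

% Within-cluster criterion: squared distance of each element from the centroid c_k of its own cluster C_k
J(C_1, \dots, C_K) = \sum_{k=1}^{K} \sum_{\mathbf{x}_i \in C_k} d(\mathbf{x}_i, \mathbf{c}_k)^{2}
```

A small value of J means that the elements of each cluster lie close to their centroid, which is the "similar to each other" requirement in the text.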

Several clustering methods exist, and the previous literature presents different applications for the identification of environmental conditions, with specific objectives that concern not only the analysis of geophysical processes but also their numerical simulation. A thorough review of the available techniques can be found in Barbakh et al. (2009), which outlines how, in general, the clustering method should be selected according to the analysis that is meant to be carried out.

As far as met-ocean variables are concerned, Hadzimejlic et al. (2012) introduced a general workflow for clustering atmospheric states through the use of hierarchical methods; such methods were also used by Euán and Sun (2019) to develop a method for the clustering of time series of directional spectra. Camus et al. (2011a) used clustering techniques for the propagation of sea waves in the coastal zone, and Enríquez et al. (2020) applied clustering techniques to characterize the spatial patterns of storm surges off the global coastline. McLaughlin et al. (2003) and Bárcena et al. (2015) performed clustering analysis for the simulation of the three-dimensional hydrodynamics of estuaries, and Núñez et al. (2019) employed clustering analysis to assess the probability of marine litter accumulation in estuaries.

The latter studies are based on the K-means algorithm (MacQueen et al., 1967, Friedman et al., 2001), which is one of the most popular unsupervised clustering techniques; the term “unsupervised” means that the data do not need to be labelled prior to performing the clustering. K-means groups the initial database into K classes (i.e. clusters) and as many centroids (scenarios), each representing the respective class. The use of this particular method makes it possible to select scenarios that, on average, capture the properties of the data belonging to the respective clusters, as K-means aims precisely at minimizing the intra-cluster variance. However, different methods could be employed if a different data selection were meant to be carried out. For instance, the Maximum Dissimilarity Algorithm may be more appropriate than K-means to describe extreme conditions of the input variables (Camus et al., 2011b).
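
The grouping into K classes and K centroids described above can be illustrated with a minimal NumPy sketch of the standard Lloyd iteration for K-means. This is not the authors' implementation: the function names, the optional `init` argument and the toy data are assumptions made here for illustration, and each row of `X` stands for one element of the dataset (for time-varying scenarios, this could be a flattened sub-series of the variables).

```python
import numpy as np


def squared_distances(X, centroids):
    """Squared Euclidean distance from every sample to every centroid."""
    diff = X[:, None, :] - centroids[None, :, :]
    return (diff ** 2).sum(axis=2)


def kmeans(X, k, init=None, n_iter=100, seed=0):
    """Lloyd iteration: alternate nearest-centroid assignment and mean update."""
    X = np.asarray(X, dtype=float)
    if init is None:
        # Assumption: start from k distinct samples drawn at random.
        rng = np.random.default_rng(seed)
        centroids = X[rng.choice(len(X), size=k, replace=False)]
    else:
        centroids = np.asarray(init, dtype=float)
    for _ in range(n_iter):
        labels = squared_distances(X, centroids).argmin(axis=1)
        updated = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # an empty cluster keeps its previous centroid
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break  # centroids stopped moving
        centroids = updated
    # Final assignment, consistent with the returned centroids.
    labels = squared_distances(X, centroids).argmin(axis=1)
    return centroids, labels


# Toy data (hypothetical): two well-separated groups of four points each.
X_demo = np.array(
    [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10], [10, 11], [11, 10], [11, 11]],
    dtype=float,
)
centroids_demo, labels_demo = kmeans(X_demo, k=2, init=X_demo[[0, 4]])
```

The result depends on the starting centroids, since the iteration can settle in a local minimum of the intra-cluster variance; in practice the algorithm is often restarted from several initial guesses and the best partition is kept.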

K-means has the advantage that it is relatively easy to implement and guarantees convergence; on the other hand, one of its drawbacks is that the number of clusters must be chosen manually. The aforementioned works defined the number of clusters according to indexes that assess the goodness of the clustering with respect to the total data. Herein, the number of clusters best suited for the application at hand will be referred to as “optimal”. However, the use of particular indexes to establish the optimal number of scenarios may sometimes be misleading, and the indexes employed are not always consistent with each other. In this respect, the present work reviews a methodology for the selection of time-varying scenarios of met-ocean variables (in this work meant as wind and wave data). K-means analysis is applied here to select groups of climatological scenarios from hindcast data off Genoa (Ligurian coastline, NW Italy), comparing the optimal number of scenarios according to two different indexes: the model efficiency (CE) and the total variance (W2), proposed by Nash and Sutcliffe (1970) and Wilks (2011), respectively. In particular, the analysis is conducted for varying initial conditions of the problem, such as the time resolution of the data and the number of variables employed. Then, some of the clusters selected according to a specific initial set of the hindcast time series are analysed and evaluated in the context of the local climatology of the area.
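
To make the two kinds of index more concrete, the sketch below computes a Nash-Sutcliffe type efficiency and a within-cluster sum of squares for a given partition. The exact definitions of CE and W2 used in the paper are given in its methodology section and are not reproduced in this excerpt, so the formulas here are assumptions: CE is taken in the standard Nash-Sutcliffe form, with each element "simulated" by the centroid of its cluster, and the variance-type index is taken as the sum of squared distances from the centroids. The function names and the toy numbers are hypothetical.

```python
import numpy as np


def nash_sutcliffe(observed, simulated):
    """Standard Nash-Sutcliffe efficiency: 1 - residual sum of squares / spread about the mean."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    residual = ((observed - simulated) ** 2).sum()
    spread = ((observed - observed.mean(axis=0)) ** 2).sum()
    return 1.0 - residual / spread


def within_cluster_sum_of_squares(X, centroids, labels):
    """Sum of squared distances between each element and the centroid of its cluster."""
    X = np.asarray(X, dtype=float)
    return ((X - centroids[labels]) ** 2).sum()


# Toy partition (hypothetical): four one-dimensional elements in two clusters.
X_toy = np.array([[0.0], [2.0], [10.0], [12.0]])
centroids_toy = np.array([[1.0], [11.0]])
labels_toy = np.array([0, 0, 1, 1])

# Each element is represented by the centroid of the cluster it belongs to.
ce_toy = nash_sutcliffe(X_toy, centroids_toy[labels_toy])
w_toy = within_cluster_sum_of_squares(X_toy, centroids_toy, labels_toy)
```

Repeating such a computation for an increasing number of clusters gives one curve per index, from which a candidate "optimal" number of scenarios can be read; the two curves need not point to the same value.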

The paper is structured as follows: Section 2 presents the hindcast data and the methodology used in the study. Section 3 presents and discusses the results, while the conclusions are drawn in Section 4.

Hindcast data

This work took advantage of wave and wind hindcast data off the Genoa coastline, in the North Tyrrhenian Sea (north-west of Italy). An overview of the area is shown in panel A) of Fig. 1, while panel B) shows a close-up of the hindcast node used. In this study, node 230 (i.e. Point_000230) was considered among all the grid nodes in the Mediterranean Sea provided by the hindcast service of the Department of Civil, Chemical and Environmental Engineering of the University of Genoa

Optimal number of clusters due to the selection of the initial dataset

The examples presented in this section make it possible to evaluate how the selection of the starting dataset may affect the number of required scenarios, defined according to CE. As explained in Section 2.4, this index quantifies to what extent the scenarios selected from a training dataset are able to reproduce a dataset used to validate the clustering model. An analysis of a simplified case was first carried out to test the reliability of the algorithms developed. To this end, one year of Hs data were

Final remarks

The selection of significant scenarios of geophysical variables can be helpful for a plethora of applications. As far as met-ocean forcing is concerned, the hydrodynamic modelling of sea states might not be feasible if long-term series of data are to be investigated, mainly due to the computational cost. To overcome such difficulties, it might be convenient to identify a reduced number of representative scenarios through clustering techniques. In this respect, this paper

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This research was developed in the framework of the Interreg Italia-Francia Marittimo projects SPlasH! (Stop alle Plastiche in H2O!, grant number D31I18000620007), GEREMIA (GEstione dei REflui per il MIglioramento delle Acque portuali, grant number D41I18000600005) and SINAPSI (asSIstenza alla Navigazione per l’Accesso ai Porti in Sicurezza, grant number D64I18000160007).

References (59)

  • Khelil N. et al.

    Challenges and opportunities in promoting integrated coastal zone management in Algeria: Demonstration from the Algiers coast

    Ocean Coast. Manage.

    (2019)
  • Lang M. et al.

    Towards operational guidelines for over-threshold modeling

    J. Hydrol.

    (1999)
  • López I. et al.

    Artificial neural networks applied to port operability assessment

    Ocean Eng.

    (2015)
  • Losada I.J. et al.

    A planning strategy for the adaptation of coastal areas to climate change: The Spanish case

    Ocean Coast. Manage.

    (2019)
  • Mahendra R. et al.

    Assessment and management of coastal multi-hazard vulnerability along the Cuddalore–Villupuram, east coast of India using geospatial techniques

    Ocean Coast. Manage.

    (2011)
  • McLaughlin C. et al.

    Rivers, runoff, and reefs

    Glob. Planet. Change

    (2003)
  • Mentaschi L. et al.

    Performance evaluation of WAVEWATCH III in the Mediterranean Sea

    Ocean Model.

    (2015)
  • Nash J.E. et al.

    River flow forecasting through conceptual models part I—A discussion of principles

    J. Hydrol.

    (1970)
  • Núñez P. et al.

    A methodology to assess the probability of marine litter accumulation in estuaries

    Mar. Pollut. Bull.

    (2019)
  • Roelvink D. et al.

    Modelling storm impacts on beaches, dunes and barrier islands

    Coast. Eng.

    (2009)
  • Silva S.F. et al.

    An index-based method for coastal-flood risk assessment in low-lying areas (Costa de Caparica, Portugal)

    Ocean Coast. Manage.

    (2017)
  • Venancio K.K. et al.

    Hydrodynamic modeling with scenario approach in the evaluation of dredging impacts on coastal erosion in Santos (Brazil)

    Ocean Coast. Manage.

    (2020)
  • Wu G. et al.

    Modeling wave effects on storm surge and coastal inundation

    Coast. Eng.

    (2018)
  • Zheng W. et al.

    Beach management strategy for small islands: Case studies of China

    Ocean Coast. Manage.

    (2020)
  • Anderberg M.R.

    Cluster Analysis for Applications: Probability and Mathematical Statistics: A Series of Monographs and Textbooks, Vol. 19

    (2014)
  • Barbakh W.A. et al.

    Review of clustering algorithms

  • Booij N. et al.

    The “SWAN” wave model for shallow water

  • Briganti R. et al.

    Large scale tests on foreshore evolution during storm sequences and the performance of a nearly vertical structure

    Coastal Eng. Proc.

    (2018)
  • Budillon F. et al.

    Sediment transport and deposition during extreme sea storm events at the Salerno Bay (Tyrrhenian Sea): comparison of field data with numerical model results

    Nat. Hazards Earth Syst. Sci.

    (2006)