Zonation and scaling of tropical cyclone hazards based on spatial clustering for coastal China

Original Paper · Natural Hazards

Abstract

Zonation refers to the spatially constrained clustering of objects of interest with location information based on the similarity of their attributes. The results of zonation by clustering are usually relatively homogeneous spatial units in raster or vector formats. Tropical cyclone (TC) hazards, such as TC wind and rainfall, can exhibit significant spatial heterogeneity from coastal to inland areas, and proper spatial zonation can greatly improve the understanding and management of TC risks. Although zonation methods based on expert knowledge, simple statistics or GIS tools have been developed in past studies, challenges remain in selecting representative attribute indicators, choosing clustering algorithms, and fusing multiple indicators into an integrated scaling indicator. In this study, TC hazards are chosen to explore methods for zonation and scaling. First, wind data of 1,256 TCs from 1949 to 2017 and rainfall data of 895 TCs from 1951 to 2014 were collected at a 1-km resolution. The mean, standard deviation, and 200-year return period intensity of wind and rainfall were estimated and used as representative hazard intensity indicators (HIIs) for spatial clustering. Second, the K-means, iterative self-organizing data analysis technique algorithm (ISODATA), mean shift and Gaussian mixture model algorithms were tested for their suitability for natural hazard zonation based on raster data. All four algorithms were found to perform well, with K-means performing best. Third, a hierarchical clustering algorithm was used to cluster the HIIs into polygons at the provincial, city and county levels in China. Finally, the six HIIs were weighted into a single indicator for integrated hazard intensity scaling. The zonation and scaling maps developed in the present study reflect the spatial pattern of TC hazard intensity satisfactorily. In general, the TC hazard scale decreases from the southeast coast to the northwest inland of China. The methods and steps proposed in this study can also be applied to the zonation and scaling of other types of disasters.


Acknowledgements

This work was mainly supported by the National Key Research and Development Program of China (Nos. 2018YFC1508803 and 2017YFA0604903) and the Key Special Project for Introduced Talents Team of Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou) (No. GML2019ZD0601).

Author information

Corresponding author

Correspondence to Weihua Fang.

Ethics declarations

Conflict of interest

The authors declare that they have no competing financial interests or personal relationships that could have influenced the work reported in this paper.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

1.1 A. PCA Whitening

The specific steps are as follows (a short code sketch is given after the list):

(1) The three-dimensional matrix \({Y}_{d\times m\times n}\) composed of the six HIIs was reshaped into the two-dimensional matrix \({X}_{s\times d}\), in which each row is the observation vector of one grid cell and each column is one indicator across all grid cells. Here, \(d\) is the number of indicators, \(m\) and \(n\) are the numbers of rows and columns of the geospatial data, and \(s=m\times n\) is the total number of grid cells.

(2) The covariance matrix \({\Sigma }_{d\times d}\) of the two-dimensional matrix \({X}_{s\times d}\) was calculated, and the eigenvectors and eigenvalues of \({\Sigma }_{d\times d}\) were obtained by eigenvalue decomposition (EVD).

(3) The product of the data matrix \({X}_{s\times d}\) and the eigenvector matrix \({U}_{d\times d}\) was calculated to remove the correlation between features: \({X}_{s\times d}^{\prime}={X}_{s\times d}{U}_{d\times d}\).

(4) Each dimension of \({X}_{s\times d}^{\prime}\) was scaled by Eq. 3 to obtain the matrix \(X_{whiten}^{\prime\prime}\), in which each dimension has unit variance.

$$ X_{whiten,\,i}^{\prime\prime} = \frac{X_{i}^{\prime}}{std\left(X_{i}^{\prime}\right)}\quad \left( i = 1,2, \ldots ,6 \right)$$
(3)

where \(X_{i}^{\prime}\) is column \(i\) of matrix \(X_{s\times d}^{\prime}\), \(std\left(X_{i}^{\prime}\right)\) is the standard deviation of all samples in column \(i\) of \(X_{s\times d}^{\prime}\), and \(X_{whiten,\,i}^{\prime\prime}\) is column \(i\) after PCA whitening.
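
For concreteness, the four steps map directly onto a few lines of NumPy. This is a minimal sketch rather than the authors' code; the function name and the synthetic 100 × 80 raster are illustrative assumptions.

```python
import numpy as np

def pca_whiten(Y):
    """PCA-whiten a (d, m, n) stack of indicator grids, as in steps (1)-(4)."""
    d, m, n = Y.shape
    X = Y.reshape(d, m * n).T            # step (1): s x d, one row per grid cell
    X = X - X.mean(axis=0)               # center the columns before the covariance
    Sigma = np.cov(X, rowvar=False)      # step (2): d x d covariance matrix
    eigvals, U = np.linalg.eigh(Sigma)   # step (2): eigenvalue decomposition (EVD)
    X_dec = X @ U                        # step (3): decorrelate the features
    return X_dec / X_dec.std(axis=0)     # step (4): unit variance per column (Eq. 3)

# Illustrative use with six synthetic indicator grids on a 100 x 80 raster
Y = np.random.default_rng(0).random((6, 100, 80))
X_whiten = pca_whiten(Y)
print(X_whiten.shape)                    # (8000, 6)
```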

1.2 B. K-means

The specific algorithm steps are as follows (a short code sketch is given after the list):

(1) According to prior experience, set the cluster number \(K\) and the maximum iteration number \(T\).

(2) Randomly select \(K\) points from the data set \(F=\{F(x_{1}),F(x_{2}),\ldots,F(x_{n})\}\) as the initial cluster centers \(\{\mu^{1},\mu^{2},\ldots,\mu^{K}\}\), where \(n\) is the number of samples, \(F(x)=(F_{1}(x),F_{2}(x),\ldots,F_{m}(x))\) is the vector of \(m\) feature statistics at point \(x\), and \(m\) is the feature dimension.

(3) For \(t=1,2,\ldots,T\), divide the clusters according to the following steps:

    (a) Initialize the clusters to \(C_{k}=\emptyset\;(k=1,2,\ldots,K)\);

    (b) Using the Euclidean distance formula, calculate the distance \(d_{ik}\) between each sample \(F(x_{i})\;(i=1,2,\ldots,n)\) and each centroid \(\mu^{k}\;(k=1,2,\ldots,K)\), mark \(x_{i}\) with the cluster \(\varphi_{i}\) whose \(d_{ik}\) is smallest, and update \(C_{\varphi_{i}}=C_{\varphi_{i}}\cup\{x_{i}\}\);

    (c) For \(k=1,2,\ldots,K\), recompute the centroids \(\mu^{k}=\frac{1}{|C_{k}|}\sum_{x\in C_{k}}F(x)\) over all samples in \(C_{k}\);

    (d) If the distance between each new center and its previous value is less than the threshold, go to Step (4); otherwise, repeat Step (3).

(4) Output the clusters \(C=\{C_{1},C_{2},\ldots,C_{K}\}\).
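
A compact NumPy sketch of this loop follows; the synthetic data, \(K\), \(T\), and the convergence threshold `tol` are illustrative assumptions, not values from the study.

```python
import numpy as np

def kmeans(F, K, T=100, tol=1e-6, seed=0):
    """Lloyd-style K-means loop mirroring steps (1)-(4) above."""
    rng = np.random.default_rng(seed)
    mu = F[rng.choice(len(F), size=K, replace=False)]   # step (2): initial centers
    for _ in range(T):                                  # step (3)
        d = np.linalg.norm(F[:, None, :] - mu[None, :, :], axis=2)  # (b) distances d_ik
        labels = d.argmin(axis=1)                       # (b) nearest-centroid assignment
        new_mu = np.array([F[labels == k].mean(axis=0) if np.any(labels == k)
                           else mu[k] for k in range(K)])  # (c) recompute centroids
        if np.linalg.norm(new_mu - mu) < tol:           # (d) centers stopped moving
            mu = new_mu
            break
        mu = new_mu
    return labels, mu                                   # step (4)

labels, centers = kmeans(np.random.default_rng(1).random((500, 6)), K=5)
```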

1.3 C. ISODATA

The specific algorithm steps are as follows (a short code sketch is given after the list):

(1) Randomly select \({K}_{0}\) points from the data set \(F=\{F(x_{1}),F(x_{2}),\ldots,F(x_{n})\}\) as the initial cluster centers \(\{\mu^{1},\mu^{2},\ldots,\mu^{{K}_{0}}\}\), and set \(K={K}_{0}\), where \(n\) is the number of samples, \(F(x)=(F_{1}(x),F_{2}(x),\ldots,F_{m}(x))\) is the vector of \(m\) feature statistics at point \(x\), and \(m\) is the feature dimension.

(2) Calculate the distance \({d}_{ik}\) between each sample \(F(x_{i})\;(i=1,2,\ldots,n)\) and each centroid \(\mu^{k}\;(k=1,2,\ldots,K)\), and mark \({x}_{i}\) with the cluster whose \({d}_{ik}\) is smallest.

(3) Check whether the number of samples in each cluster is less than \({N}_{min}\). If so, discard that cluster, set \(K=K-1\), and redistribute its samples to the nearest of the remaining clusters.

(4) For \(k=1,2,\ldots,K\), recompute the centroids \(\mu^{k}=\frac{1}{|{C}_{k}|}\sum_{x\in {C}_{k}}F(x)\) over all samples in \({C}_{k}\).

(5) If \(K\ge 2{K}_{0}\), there are too many clusters; go to the merge operation in Step (8).

(6) If \(K\le \frac{{K}_{0}}{2}\), there are too few clusters; go to the splitting operation in Step (9).

(7) If the maximum number of iterations is reached, terminate the algorithm; otherwise, return to Step (2).

(8) Merge operation:

    (a) Calculate the distances between the centers of all pairs of clusters and store them in the matrix \(D\), where \(D(k,k)=0\);

    (b) Merge any two clusters with \(D(k,{k}^{\prime})<{d}_{min}\;(k\ne {k}^{\prime})\) into a new cluster with center \(\mu^{new}=\frac{1}{{n}_{k}+{n}_{{k}^{\prime}}}({n}_{k}\mu^{k}+{n}_{{k}^{\prime}}\mu^{{k}^{\prime}})\), where \({n}_{k}\) and \({n}_{{k}^{\prime}}\) are the numbers of samples in the two clusters, respectively.

(9) Splitting operation:

    (a) Calculate the standard deviation of each dimension over all samples in each cluster, and take the maximum value \({\sigma}_{max}\) for each cluster;

    (b) If \({\sigma}_{max}\) of a cluster exceeds the splitting threshold and the cluster size satisfies \({n}_{k}\ge 2{N}_{min}\), go to Step (c); otherwise, exit the splitting operation;

    (c) Split this cluster into two clusters with centers \({(\mu^{k})}^{(+)}=\mu^{k}+{\sigma}_{max}\) and \({(\mu^{k})}^{(-)}=\mu^{k}-{\sigma}_{max}\), and set \(K=K+1\).
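
ISODATA is not shipped by common libraries such as scikit-learn, so a condensed NumPy sketch is given below. It merges or splits at most one cluster per iteration and simplifies the bookkeeping of full ISODATA implementations; all thresholds (`N_min`, `d_min`, `sigma_s`) are illustrative assumptions.

```python
import numpy as np

def _assign(F, mu):
    """Nearest-center labels for all samples (step (2))."""
    d = np.linalg.norm(F[:, None, :] - mu[None, :, :], axis=2)
    return d.argmin(axis=1)

def isodata(F, K0=5, N_min=20, d_min=0.1, sigma_s=0.5, max_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    mu = F[rng.choice(len(F), size=K0, replace=False)]        # step (1)
    for _ in range(max_iter):                                 # step (7): iteration cap
        labels = _assign(F, mu)
        # step (3): drop clusters smaller than N_min, then reassign their samples
        keep = [k for k in range(len(mu)) if (labels == k).sum() >= N_min]
        mu = mu[keep]
        labels = _assign(F, mu)
        # step (4): recompute the centroids
        mu = np.array([F[labels == k].mean(axis=0) for k in range(len(mu))])
        K = len(mu)
        if K >= 2 * K0:                                       # steps (5), (8): merge
            D = np.linalg.norm(mu[:, None, :] - mu[None, :, :], axis=2)
            np.fill_diagonal(D, np.inf)
            i, j = np.unravel_index(D.argmin(), D.shape)
            if D[i, j] < d_min:                               # weighted new center
                ni, nj = (labels == i).sum(), (labels == j).sum()
                merged = (ni * mu[i] + nj * mu[j]) / (ni + nj)
                mu = np.vstack([np.delete(mu, [i, j], axis=0), merged])
        elif K <= K0 / 2:                                     # steps (6), (9): split
            sig = np.array([F[labels == k].std(axis=0).max() for k in range(K)])
            k = int(sig.argmax())
            if sig[k] > sigma_s and (labels == k).sum() >= 2 * N_min:
                mu = np.vstack([np.delete(mu, k, axis=0),
                                mu[k] + sig[k], mu[k] - sig[k]])
    return _assign(F, mu), mu

labels, centers = isodata(np.random.default_rng(2).random((800, 6)))
```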

1.4 D. Mean shift

The specific algorithm steps are as follows (a short code sketch is given after the list):

(1) Randomly select an unmarked sample as the core sample.

(2) Find all samples within the bandwidth radius centered at the core sample, record them as the set \(M\), and consider the core sample to belong to cluster \(C\). Add 1 to the visit count of each sample in this circle for cluster \(C\); these counts are used in Step (8).

(3) Taking the core sample as the center, calculate the vector from the core sample to each sample in the set \(M\), and sum these vectors to obtain the shift vector.

(4) Move the core sample along the direction of the shift vector by the distance \(\Vert shift\Vert\).

(5) Repeat Steps (2) through (4) until the shift vector is very small, that is, until convergence. Assign the samples encountered during the iterations to cluster \(C\), and record the core point at this time.

(6) At convergence, if the distance between the core point of the current cluster \(C\) and that of an existing cluster \({C}_{i}\) is less than the threshold, merge cluster \(C\) into cluster \({C}_{i}\); otherwise, take \(C\) as a new cluster and increase the cluster count by 1.

(7) Repeat Steps (1) through (6) until all points have been visited.

(8) Assign each point to the cluster with the highest visit count for that point.
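
Rather than re-implementing the shift-merge-assign traversal, the same procedure is available as scikit-learn's MeanShift. A minimal sketch follows; the bandwidth (the radius of Step (2)) is estimated from the data here, and the synthetic feature matrix is an illustrative assumption.

```python
import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth

F = np.random.default_rng(3).random((300, 6))   # stand-in for the HII feature matrix
bw = estimate_bandwidth(F, quantile=0.2)        # the radius used in steps (2)-(4)
ms = MeanShift(bandwidth=bw, bin_seeding=True)  # bin_seeding: faster, coarser seeding
labels = ms.fit_predict(F)                      # runs the full shift-merge-assign loop
print(len(np.unique(labels)), "clusters found")
```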

1.5 E. Gaussian mixture model

The specific algorithm steps are as follows (a short code sketch is given after the list):

(1) Input the data set \(F=\{F(x_{1}),F(x_{2}),\ldots,F(x_{n})\}\) and the number of Gaussian mixture components \(K\), where \(n\) is the number of samples, \(F(x)=(F_{1}(x),F_{2}(x),\ldots,F_{m}(x))\) is the vector of \(m\) feature statistics at point \(x\), and \(m\) is the feature dimension.

(2) Assume that the samples \(F(x_{i})\;(i=1,2,\ldots,n)\) are independent and identically distributed, with the probability density function of the Gaussian mixture model given by \({p}_{GMM}(F(x_{i});\theta)=\sum_{k=1}^{K}{\alpha}^{k}p(F(x_{i});\mu^{k},{\Sigma}^{k})\), where \(\theta=\{{\alpha}^{1},\ldots,{\alpha}^{K},\mu^{1},\ldots,\mu^{K},{\Sigma}^{1},\ldots,{\Sigma}^{K}\}\) is the set of all parameters, \({\alpha}^{k}\) is the weight of the \(k\)th Gaussian component and satisfies \(\sum_{k=1}^{K}{\alpha}^{k}=1\), \(p\) is the probability density function of the normal distribution, \(\mu^{k}=(\mu_{1}^{k},\ldots,\mu_{d}^{k})^{\prime}\) is the mean vector of the \(k\)th Gaussian component, and \({\Sigma}^{k}\) is its \(d\times d\) covariance matrix.

(3) For \(i=1,2,\ldots,n\), calculate the posterior probability \(\gamma_{ik}={p}_{M}(z_{i}=k\mid x_{i})\;(1\le k\le K)\) that \(x_{i}\) was generated by each mixture component according to the following formula:

$$ p_{M}\left(z_{i}=k\mid x_{i}\right)=\frac{P\left(z_{i}=k\right)\cdot p_{M}\left(x_{i}\mid z_{i}=k\right)}{p_{M}\left(x_{i}\right)}=\frac{\alpha^{k}\cdot p\left(x_{i}\mid\mu^{k},\Sigma^{k}\right)}{\sum_{k^{\prime}=1}^{K}\alpha^{k^{\prime}}\cdot p\left(x_{i}\mid\mu^{k^{\prime}},\Sigma^{k^{\prime}}\right)} $$
(4)

(4) For \(1\le k\le K\):

    (a) Calculate the new mean vector \(\mu^{k\prime}=\frac{\sum_{i=1}^{n}\gamma_{ik}x_{i}}{\sum_{i=1}^{n}\gamma_{ik}}\);

    (b) Calculate the new covariance matrix \({\Sigma}^{k\prime}=\frac{\sum_{i=1}^{n}\gamma_{ik}\left(x_{i}-\mu^{k\prime}\right)\left(x_{i}-\mu^{k\prime}\right)^{T}}{\sum_{i=1}^{n}\gamma_{ik}}\);

    (c) Calculate the new mixture weight \({\alpha}^{k\prime}=\frac{\sum_{i=1}^{n}\gamma_{ik}}{n}\).

(5) Update the model parameters \(\{({\alpha}^{k},\mu^{k},{\Sigma}^{k})\mid 1\le k\le K\}\) to \(\{({\alpha}^{k\prime},\mu^{k\prime},{\Sigma}^{k\prime})\mid 1\le k\le K\}\).

(6) Repeat Steps (3) through (5) until the stopping condition is met.

(7) Initialize \({C}_{k}=\emptyset\;(1\le k\le K)\). For \(i=1,2,\ldots,n\), determine the cluster of \(x_{i}\) as \(\varphi_{i}=\mathop{\arg\max}\limits_{k\in\{1,2,\ldots,K\}}\gamma_{ik}\), and assign \(x_{i}\) to the corresponding cluster: \({C}_{\varphi_{i}}={C}_{\varphi_{i}}\cup\{x_{i}\}\).

(8) Output the clusters \(C=\{{C}_{1},{C}_{2},\ldots,{C}_{K}\}\).
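
The EM loop above is what scikit-learn's GaussianMixture runs internally. A minimal sketch follows, with \(K\) and the synthetic data as illustrative assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

F = np.random.default_rng(4).random((500, 6))    # stand-in for the whitened HIIs
gmm = GaussianMixture(n_components=5,            # K mixture components
                      covariance_type="full",    # one full d x d Sigma^k per component
                      max_iter=100,              # EM iterations: steps (3)-(6)
                      random_state=0)
labels = gmm.fit_predict(F)                      # step (7): argmax_k of gamma_ik
gamma = gmm.predict_proba(F)                     # posterior responsibilities (Eq. 4)
print(gmm.weights_.round(3))                     # alpha^k, summing to 1
```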

1.6 F. Hierarchical clustering

The specific algorithm steps are as follows (a short code sketch is given after the list):

(1) Input the data set \(F=\{F(x_{1}),F(x_{2}),\ldots,F(x_{n})\}\), the cluster linkage algorithm, the distance measurement function, and the number of clusters \(K\), where \(n\) is the number of samples, \(F(x)=(F_{1}(x),F_{2}(x),\ldots,F_{m}(x))\) is the vector of \(m\) feature statistics at point \(x\), and \(m\) is the feature dimension.

(2) Initialize each sample as its own cluster: \({C}_{i}=\{F(x_{i})\}\;(i=1,2,\ldots,n)\).

(3) Calculate the distance between each pair of clusters using the average link and the Euclidean distance. The average link is the average distance between the samples in one cluster and the samples in the other:

$$ d\left(u,v\right)=\sum_{ij}\frac{dist\left(u_{i},v_{j}\right)}{\left|u\right|\cdot\left|v\right|} $$
(5)

where \(u\) and \(v\) are two clusters, \(\left|u\right|\) and \(\left|v\right|\) are the numbers of samples in the two clusters, \(u_{i}\) is any sample in cluster \(u\), and \(v_{j}\) is any sample in cluster \(v\).

(4) Find the two closest clusters and merge them.

(5) Repeat Steps (3) and (4) until only \(K\) clusters remain (equivalently, build the full dendrogram and cut it at \(K\) clusters).

(6) Output the clusters \(C=\{{C}_{1},{C}_{2},\ldots,{C}_{K}\}\).
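
The average-link agglomeration of Eq. 5 is implemented in SciPy. A minimal sketch follows; the cut at \(K=5\) and the toy data are illustrative assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

F = np.random.default_rng(5).random((200, 6))         # stand-in feature matrix
Z = linkage(F, method="average", metric="euclidean")  # average link of Eq. 5
labels = fcluster(Z, t=5, criterion="maxclust")       # cut the tree into K = 5 clusters
print(np.unique(labels))                              # cluster ids 1..5
```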

Cite this article

Fang, W., Zhang, H. Zonation and scaling of tropical cyclone hazards based on spatial clustering for coastal China. Nat Hazards 109, 1271–1295 (2021). https://doi.org/10.1007/s11069-021-04878-4
