Abstract
Zonation refers to the spatially constrained clustering of objects of interest, with location information, based on the similarity of their attributes; the results are usually relatively homogeneous spatial units in raster or vector format. The spatial distribution of tropical cyclone (TC) hazards, such as TC wind and rainfall, can be strongly heterogeneous from coastal to inland areas, and proper spatial zonation can greatly improve the understanding and management of TC risks. Although zonation methods based on expert knowledge, simple statistics, or GIS tools have been developed in past studies, challenges remain in selecting representative attribute indicators, choosing clustering algorithms, and fusing multiple indicators into an integrated scaling indicator. In this study, TC hazards are chosen to explore methods for zonation and scaling. First, wind data for 1,256 TCs from 1949 to 2017 and rainfall data for 895 TCs from 1951 to 2014 were collected at 1-km resolution. The mean, the standard deviation, and the 200-year return period intensity of wind and rainfall were estimated and used as representative hazard intensity indicators (HIIs) for spatial clustering. Second, K-means, the iterative self-organizing data analysis technique algorithm (ISODATA), mean shift, and the Gaussian mixture model were tested for their suitability for natural hazard zonation based on raster data. All four algorithms performed well, with K-means ranking best. Third, a hierarchical clustering algorithm was used to cluster the HIIs into polygons at the provincial, city, and county levels in China. Finally, the six HIIs were weighted into a single indicator for integrated hazard intensity scaling. The zonation and scaling maps developed in the present study reflect the spatial pattern of TC hazard intensity satisfactorily.
In general, the TC hazard scale decreases from the southeast coast to the northwest inland of China. The methods and steps proposed in this study can also be applied to the zonation and scaling of other types of disasters.
Acknowledgements
This work is mainly supported by the National Key Research and Development Program of China (No. 2018YFC1508803), the National Key Research and Development Program of China (No. 2017YFA0604903), and the Key Special Project for Introduced Talents Team of Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou) (No. GML2019ZD0601).
Ethics declarations
Conflict of interest
The authors declare that they have no competing financial interests or personal relationships in this paper.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
1.1 A. PCA Whitening
The specific steps are as follows:

(1) The three-dimensional matrix \({Y}_{d\times m\times n}\) composed of the six HIIs was reshaped into the two-dimensional matrix \({X}_{s\times d}\). Each row of the reshaped data is the observation of one grid, and each column is one indicator across all grids, where \(d\) is the number of indicators, \(m\) and \(n\) are the rows and columns of the geospatial data, and \(s=m\times n\) is the total number of grids.

(2) The covariance matrix \({\Sigma }_{d\times d}\) of the two-dimensional matrix \({X}_{s\times d}\) was calculated, and its eigenvectors and eigenvalues were obtained by eigenvalue decomposition (EVD).

(3) The product of the data matrix \({X}_{s\times d}\) and the eigenvector matrix \({U}_{d\times d}\) was calculated to remove the correlation between features: \({X}_{s\times d}^{\prime}={X}_{s\times d}{U}_{d\times d}\).

(4) Each dimension of \({X}_{s\times d}^{\prime}\) was scaled by Eq. 3 to obtain a matrix \(X_{whiten,\,i}^{\prime\prime}\) in which each dimension has unit variance:

$$ X_{whiten,\,i}^{\prime\prime} = \frac{X_{i}^{\prime}}{std(X_{i}^{\prime})}\quad \left( i = 1,2, \ldots ,6 \right) \tag{3}$$

where \(X_{i}^{\prime}\) is all samples in column \(i\) of matrix \(X_{s\times d}^{\prime}\), \(std(X_{i}^{\prime})\) is the standard deviation of all samples in column \(i\) of matrix \(X_{s\times d}^{\prime}\), and \(X_{whiten,\,i}^{\prime\prime}\) is all samples in column \(i\) after PCA whitening.
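The steps above can be sketched in NumPy as follows. The function name `pca_whiten`, the toy 3-indicator raster, and the explicit centering of each indicator before computing the covariance are illustrative assumptions, not details given in the text:

```python
import numpy as np

def pca_whiten(Y):
    """PCA-whiten a d x m x n stack of indicator rasters.

    Y: array of shape (d, m, n) -- d indicators on an m x n grid.
    Returns X'' of shape (m*n, d): decorrelated columns with unit variance.
    """
    d = Y.shape[0]
    X = Y.reshape(d, -1).T                    # step 1: reshape to (s, d), s = m*n
    Xc = X - X.mean(axis=0)                   # center each indicator (assumed)
    cov = np.cov(Xc, rowvar=False)            # step 2: d x d covariance matrix
    _, U = np.linalg.eigh(cov)                # step 2: eigenvalue decomposition (EVD)
    Xp = Xc @ U                               # step 3: rotation removes correlation
    return Xp / Xp.std(axis=0, ddof=1)        # step 4 (Eq. 3): unit variance per column

# toy example: 3 correlated indicators on a 4 x 5 grid
rng = np.random.default_rng(0)
Y = rng.normal(size=(3, 4, 5))
Xw = pca_whiten(Y)
cov_w = np.cov(Xw, rowvar=False)
print(np.allclose(cov_w, np.eye(3)))          # True: identity covariance after whitening
```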
1.2 B. K-means
The specific algorithm steps are as follows:

(1) According to prior experience, set the cluster number \(K\) and the maximum number of iterations \(T\).

(2) Randomly select \(K\) points from the data set \(F=\{F(x_{1}),F(x_{2}),\ldots,F(x_{n})\}\) as the initial centers. The center vector of each cluster is \(\{\mu^{1},\mu^{2},\ldots,\mu^{K}\}\), where \(n\) is the number of samples, \(F=\{F_{1}(x),F_{2}(x),\ldots,F_{m}(x)\}\) is the \(m\) feature statistics at point \(x\), and \(m\) is the feature dimension.

(3) For \(t=1,2,\ldots,T\), divide the clusters according to the following steps:

(a) Initialize the clusters to \(C_{k}=\emptyset\;(k=1,2,\ldots,K)\);

(b) Using the Euclidean distance formula, calculate the distance \(d_{ik}\) between each sample \(F(x_{i})\;(i=1,2,\ldots,n)\) and each centroid vector \(\mu^{k}\;(k=1,2,\ldots,K)\), assign \(x_{i}\) to the cluster \(\varphi_{i}\) with the smallest \(d_{ik}\), and update \(C_{\varphi_{i}}=C_{\varphi_{i}}\cup\{x_{i}\}\);

(c) For \(k=1,2,\ldots,K\), recompute the centroids \(\mu^{k}=\frac{1}{|C_{k}|}\sum_{x\in C_{k}}F(x)\) over all samples in \(C_{k}\);

(d) If the distance between each new center and its previous position is less than the threshold, go to Step 4; otherwise, repeat Step 3.

(4) Output the clusters \(C=\{C_{1},C_{2},\ldots,C_{K}\}\).
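The steps above can be sketched in NumPy as follows; the function signature, the convergence tolerance `tol`, and the two-blob toy data are illustrative assumptions:

```python
import numpy as np

def kmeans(F, K, T=100, tol=1e-6, seed=0):
    """Plain K-means over the steps above. F: (n, m) samples; returns labels, centers."""
    rng = np.random.default_rng(seed)
    mu = F[rng.choice(len(F), K, replace=False)]            # step 2: random initial centers
    for _ in range(T):                                      # step 3: at most T iterations
        d = np.linalg.norm(F[:, None] - mu[None], axis=2)   # 3b: (n, K) Euclidean distances
        labels = d.argmin(axis=1)                           # 3b: nearest-centroid assignment
        new_mu = np.array([F[labels == k].mean(axis=0) if np.any(labels == k) else mu[k]
                           for k in range(K)])              # 3c: recompute centroids
        moved = np.linalg.norm(new_mu - mu)
        mu = new_mu
        if moved < tol:                                     # 3d: centers stopped moving
            break
    return labels, mu

# toy example: two well-separated blobs are recovered as two clusters
rng = np.random.default_rng(1)
F = np.vstack([rng.normal(0, 0.2, (50, 2)), rng.normal(5, 0.2, (50, 2))])
labels, mu = kmeans(F, K=2)
```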
1.3 C. ISODATA
The specific algorithm steps are as follows:

(1) Randomly select \(K_{0}\) points from the data set \(F=\{F(x_{1}),F(x_{2}),\ldots,F(x_{n})\}\) as the initial centers, and set \(K=K_{0}\). The center vector of each cluster is \(\{\mu^{1},\mu^{2},\ldots,\mu^{K_{0}}\}\), where \(n\) is the number of samples, \(F=\{F_{1}(x),F_{2}(x),\ldots,F_{m}(x)\}\) is the \(m\) feature statistics at point \(x\), and \(m\) is the feature dimension.

(2) Calculate the distance \(d_{ik}\) between each sample \(F(x_{i})\;(i=1,2,\ldots,n)\) and each centroid vector \(\mu^{k}\;(k=1,2,\ldots,K)\), and assign \(x_{i}\) to the cluster with the smallest \(d_{ik}\).

(3) Check whether the number of samples in each cluster is less than \(N_{min}\). If so, discard that cluster, set \(K=K-1\), and redistribute its samples to the nearest of the remaining clusters.

(4) For \(k=1,2,\ldots,K\), calculate the centroids \(\mu^{k}=\frac{1}{|C_{k}|}\sum_{x\in C_{k}}F(x)\) over all samples in \(C_{k}\).

(5) If \(K\ge 2K_{0}\), there are too many clusters; go to the merge operation in Step 8.

(6) If \(K\le \frac{K_{0}}{2}\), there are too few clusters; go to the splitting operation in Step 9.

(7) If the maximum number of iterations is reached, terminate the algorithm; otherwise, return to Step 2.

(8) Merge operation:

(a) Calculate the distances between the centers of all clusters and express them as a matrix \(D\), where \(D(k,k)=0\);

(b) Merge any two clusters with \(D(k,k^{\prime})<d_{min}\;(k\ne k^{\prime})\) into a new cluster whose center is \(\mu^{new}=\frac{1}{n_{k}+n_{k^{\prime}}}(n_{k}\mu^{k}+n_{k^{\prime}}\mu^{k^{\prime}})\), where \(n_{k}\) and \(n_{k^{\prime}}\) are the numbers of samples in the two clusters, and set \(K=K-1\).

(9) Splitting operation:

(a) Calculate the standard deviation of each dimension of all samples in each cluster and select the maximum \(\sigma_{max}\) of each cluster;

(b) If \(\sigma_{max}>Sigma\) for a cluster and its sample size satisfies \(n_{k}\ge 2N_{min}\), go to Step c; otherwise, exit the splitting operation;

(c) Split the cluster into two clusters, set \(K=K+1\), \((\mu^{k})^{(+)}=\mu^{k}+\sigma_{max}\), and \((\mu^{k})^{(-)}=\mu^{k}-\sigma_{max}\).
1.4 D. Mean shift
The specific algorithm steps are as follows:

(1) Randomly select an unmarked data sample as the core sample.

(2) Find all samples within the bandwidth radius of the core sample, record them as the set \(M\), and consider them to belong to cluster \(C\). Increase by 1 the visit count of each sample in this circle for cluster \(C\); this count is used in Step 8.

(3) Taking the core sample as the center, calculate the vector from the core sample to each sample in the set \(M\), and sum these vectors to obtain the offset vector.

(4) Move the core sample along the direction of the offset vector by the distance \(\|shift\|\).

(5) Repeat Steps 2 through 4 until the offset vector is very small, that is, until convergence; the samples encountered during the iterations are assigned to cluster \(C\), and the core point at this time is recorded.

(6) If, at convergence, the distance between the core sample of the current cluster \(C\) and the core sample of an existing cluster \(C_{i}\) is less than the threshold, merge \(C\) into \(C_{i}\); otherwise, take \(C\) as a new cluster.

(7) Repeat Steps 1 through 6 until all points have been visited.

(8) Assign each point to the cluster that visited it most frequently.
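A minimal flat-kernel sketch of the procedure above is given below. It runs the shift iteration from every sample and merges converged modes that lie within half a bandwidth of each other; the per-sample iteration and the merge threshold are simplifying assumptions, and the visit-frequency bookkeeping of Steps 2 and 8 is omitted for brevity:

```python
import numpy as np

def mean_shift(F, bandwidth, tol=1e-4):
    """Flat-kernel mean shift: shift each sample to its neighborhood mean until
    convergence, then merge nearby modes into clusters."""
    modes = F.astype(float).copy()
    for i in range(len(modes)):
        x = modes[i]
        while True:
            M = F[np.linalg.norm(F - x, axis=1) <= bandwidth]  # step 2: neighborhood M
            shift = M.mean(axis=0) - x                         # step 3: offset vector
            x = x + shift                                      # step 4: move the core point
            if np.linalg.norm(shift) < tol:                    # step 5: converged
                break
        modes[i] = x
    centers, labels = [], np.empty(len(F), dtype=int)
    for i, x in enumerate(modes):                              # step 6: merge nearby modes
        for c, ctr in enumerate(centers):
            if np.linalg.norm(x - ctr) < bandwidth / 2:
                labels[i] = c
                break
        else:
            centers.append(x)
            labels[i] = len(centers) - 1
    return labels, np.array(centers)

# toy example: two blobs, bandwidth wide enough to cover each blob
rng = np.random.default_rng(3)
F = np.vstack([rng.normal(0, 0.2, (40, 2)), rng.normal(5, 0.2, (40, 2))])
labels, centers = mean_shift(F, bandwidth=1.0)
```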
1.5 E. Gaussian mixture model
(1) Input the data set \(F=\{F(x_{1}),F(x_{2}),\ldots,F(x_{n})\}\) and the number of Gaussian components \(K\), where \(n\) is the number of samples, \(F=\{F_{1}(x),F_{2}(x),\ldots,F_{m}(x)\}\) is the \(m\) feature statistics at point \(x\), and \(m\) is the feature dimension.

(2) Assume that the samples \(F(x_{i})\;(i=1,2,\ldots,n)\) are independent and identically distributed and that the probability density function of the Gaussian mixture model is \(p_{GMM}(F(x_{i});\theta)=\sum_{k=1}^{K}\alpha^{k}p(F(x_{i});\mu^{k},\Sigma^{k})\), where \(\theta=\{\alpha^{1},\ldots,\alpha^{K},\mu^{1},\ldots,\mu^{K},\Sigma^{1},\ldots,\Sigma^{K}\}\) is the set of all parameters, \(\alpha^{k}\) is the weight of the \(k\)-th Gaussian distribution, satisfying \(\sum_{k=1}^{K}\alpha^{k}=1\), \(p\) is the probability density function of the normal distribution, \(\mu^{k}=(\mu_{1}^{k},\ldots,\mu_{m}^{k})^{\prime}\) is the mean vector of the \(k\)-th Gaussian distribution, and \(\Sigma^{k}\) is the \(m\times m\) covariance matrix.

(3) For \(i=1,2,\ldots,n\), calculate the posterior probability \(\gamma_{ik}=p_{M}(z_{i}=k|x_{i})\;(1\le k\le K)\) that \(x_{i}\) was generated by each mixture component according to the following formula:

$$ p_{M}(z_{i}=k|x_{i})=\frac{P(z_{i}=k)\cdot p_{M}(x_{i}|z_{i}=k)}{p_{M}(x_{i})}=\frac{\alpha^{k}\cdot p(x_{i}|\mu^{k},\Sigma^{k})}{\sum_{l=1}^{K}\alpha^{l}\cdot p(x_{i}|\mu^{l},\Sigma^{l})} \tag{4}$$

(4) For \(1\le k\le K\):

(a) Calculate the new mean vector \(\mu^{k\prime}=\frac{\sum_{i=1}^{n}\gamma_{ik}x_{i}}{\sum_{i=1}^{n}\gamma_{ik}}\);

(b) Calculate the new covariance matrix \(\Sigma^{k\prime}=\frac{\sum_{i=1}^{n}\gamma_{ik}(x_{i}-\mu^{k\prime})(x_{i}-\mu^{k\prime})^{T}}{\sum_{i=1}^{n}\gamma_{ik}}\);

(c) Calculate the new mixture weight \(\alpha^{k\prime}=\frac{\sum_{i=1}^{n}\gamma_{ik}}{n}\).

(5) Update the model parameters \(\{(\alpha^{k},\mu^{k},\Sigma^{k})|1\le k\le K\}\) to \(\{(\alpha^{k\prime},\mu^{k\prime},\Sigma^{k\prime})|1\le k\le K\}\).

(6) Repeat Steps 3 through 5 until the stopping condition is met.

(7) Initialize \(C_{k}=\emptyset\;(1\le k\le K)\). For \(i=1,2,\ldots,n\), determine the cluster of \(x_{i}\) as \(\varphi_{i}=\mathop{\arg\max}\limits_{k\in\{1,2,\ldots,K\}}\gamma_{ik}\), and divide \(x_{i}\) into the corresponding cluster: \(C_{\varphi_{i}}=C_{\varphi_{i}}\cup\{x_{i}\}\).

(8) Output the clusters \(C=\{C_{1},C_{2},\ldots,C_{K}\}\).
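The EM iteration of Steps 3 through 7 can be sketched in NumPy as follows. The farthest-point initialization of the means, the small diagonal ridge on the covariances, and the fixed iteration count are added stabilizers, not part of the steps in the text:

```python
import numpy as np

def gmm_em(F, K, iters=50, seed=0):
    """EM for a Gaussian mixture (Steps 3-5), then hard assignment (Step 7)."""
    rng = np.random.default_rng(seed)
    n, d = F.shape
    alpha = np.full(K, 1.0 / K)                            # mixture weights alpha^k
    mu = [F[rng.integers(n)]]                              # farthest-point init (assumed)
    for _ in range(K - 1):
        dist = np.min([np.linalg.norm(F - c, axis=1) for c in mu], axis=0)
        mu.append(F[dist.argmax()])
    mu = np.array(mu, dtype=float)
    Sigma = np.array([np.cov(F, rowvar=False) + 1e-6 * np.eye(d) for _ in range(K)])
    for _ in range(iters):
        dens = np.empty((n, K))                            # E-step: numerators of Eq. 4
        for k in range(K):
            diff = F - mu[k]
            inv, det = np.linalg.inv(Sigma[k]), np.linalg.det(Sigma[k])
            maha = np.einsum('ij,jk,ik->i', diff, inv, diff)   # Mahalanobis distances
            dens[:, k] = alpha[k] * np.exp(-0.5 * maha) / np.sqrt((2 * np.pi) ** d * det)
        gamma = dens / dens.sum(axis=1, keepdims=True)     # posteriors gamma_ik (Eq. 4)
        Nk = gamma.sum(axis=0)                             # M-step: Steps 4a-4c
        mu = (gamma.T @ F) / Nk[:, None]
        for k in range(K):
            diff = F - mu[k]
            Sigma[k] = (gamma[:, k, None] * diff).T @ diff / Nk[k] + 1e-6 * np.eye(d)
        alpha = Nk / n
    return gamma.argmax(axis=1), mu                        # Step 7: hard labels

# toy example: two Gaussian blobs
rng = np.random.default_rng(4)
F = np.vstack([rng.normal(0, 0.2, (40, 2)), rng.normal(5, 0.2, (40, 2))])
labels, mu = gmm_em(F, K=2)
```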
1.6 F. Hierarchical clustering
The specific algorithm steps are as follows:

(1) Input the data set \(F=\{F(x_{1}),F(x_{2}),\ldots,F(x_{n})\}\), the linkage method, the distance measurement function, and the number of clusters \(K\), where \(n\) is the number of samples, \(F=\{F_{1}(x),F_{2}(x),\ldots,F_{m}(x)\}\) is the \(m\) feature statistics at point \(x\), and \(m\) is the feature dimension.

(2) Initialize each sample as its own cluster: \(C_{i}=\{F(x_{i})\}\;(i=1,2,\ldots,n)\).

(3) Calculate the distance between each pair of clusters according to the average link and the Euclidean distance formula. The average link is the average distance between the samples in one cluster and the samples in another cluster:

$$ d(u,v)=\sum_{ij}\frac{dist(u_{i},v_{j})}{|u|\,|v|} \tag{5}$$

where \(u\) and \(v\) are two clusters, \(|u|\) and \(|v|\) are the numbers of samples in the two clusters, \(u_{i}\) is any sample in cluster \(u\), and \(v_{j}\) is any sample in cluster \(v\).

(4) Find the two closest clusters and merge them.

(5) Repeat Steps 3 and 4 until all samples are agglomerated into one cluster; the \(K\) output clusters are obtained by cutting the resulting tree at the appropriate level.

(6) Output the clusters \(C=\{C_{1},C_{2},\ldots,C_{K}\}\).
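The agglomeration with average linkage (Eq. 5) can be sketched as follows; for simplicity the sketch stops merging once \(K\) clusters remain rather than building the full tree and cutting it, and the quadratic-time pairwise search is an illustrative shortcut:

```python
import numpy as np

def average_link(F, K):
    """Agglomerative clustering with average linkage: start from singleton
    clusters and repeatedly merge the closest pair until K clusters remain."""
    clusters = [[i] for i in range(len(F))]                # step 2: singletons
    while len(clusters) > K:
        best, pair = np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # step 3 (Eq. 5): average pairwise Euclidean distance
                d = np.mean([np.linalg.norm(F[i] - F[j])
                             for i in clusters[a] for j in clusters[b]])
                if d < best:
                    best, pair = d, (a, b)
        a, b = pair                                        # step 4: merge closest pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = np.empty(len(F), dtype=int)
    for c, members in enumerate(clusters):                 # step 6: output clusters
        labels[members] = c
    return labels

# toy example: two blobs agglomerate into two clusters
rng = np.random.default_rng(5)
F = np.vstack([rng.normal(0, 0.2, (15, 2)), rng.normal(5, 0.2, (15, 2))])
labels = average_link(F, K=2)
```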
Fang, W., Zhang, H. Zonation and scaling of tropical cyclone hazards based on spatial clustering for coastal China. Nat Hazards 109, 1271–1295 (2021). https://doi.org/10.1007/s11069-021-04878-4