Introduction

Solar energy is a renewable resource and plays a major role among alternative energy sources. For any study of solar energy, information on solar radiation at a given geographical location is essential (Bakirci 2009, 2015). Several radiation parameters, such as global, diffuse and direct radiation, are used in solar energy applications (Hussain et al. 1999). Solar energy in Algeria is abundant throughout the year; the average sunshine duration is 3000 h/year, and the average energy received is 1700 kWh/m2/year in the North and 2650 kWh/m2/year in the South (Boudghene Stambouli 2011; Bouchouicha et al. 2015). Many studies worldwide have estimated diffuse solar radiation from available data, including sunshine duration (Sabbagh et al. 1977; Iqbal 1979; Erbs et al. 1982; De Miguel et al. 2001; Paliatsos et al. 2003; Li et al. 2012).

Recent studies have used empirical models based on mathematical functions to estimate diffuse radiation from the clearness index and sunshine ratio; such models play an essential role where the required measuring equipment is not installed. Many authors have used linear, nonlinear and polynomial regression models to correlate the diffusion coefficient or diffuse fraction with the sunshine ratio and/or clearness index (Orgill and Hollands 1977; Spencer 1982; Reindl et al. 1990; Lam and Li 1996; Hua et al. 2002; Soares et al. 2004). Kuo et al. (2014) studied two years of global and diffuse radiation data from Tainan, Taiwan, compared their proposed models with fourteen models available in the literature, and concluded that the proposed piece-wise linear models perform well in predicting the diffuse fraction. Liu et al. (2017) developed four models using global solar radiation and sunshine duration data in China; the statistical indexes showed that the cubic models performed best in each radiation zone. For Algeria, Mecibah et al. (2014) proposed quadratic and cubic sunshine-based models. Bailek et al. (2017) reviewed thirty-five proposed correlations and compared them with irradiation measured in the Algerian Big South (Adrar region); they concluded that a second-order polynomial model of the diffuse fraction can estimate the monthly average daily diffuse irradiation on a horizontal surface.

The main objective of this study is to develop and compare empirical models for estimating the monthly mean diffuse solar radiation on a horizontal surface from the clearness index and sunshine ratio.

Methodology

Solar radiation data

Data on horizontal global solar radiation, diffuse solar radiation and sunshine duration at the Tamanrasset station for 1995 to 2017 were obtained from the National Meteorological Office of Algeria. The geographical information of the station is given in Table 1. Tamanrasset is located in the southeastern region of Algeria (Fig. 1). It has a hot desert climate (Köppen climate classification BWh), with very hot summers and mild winters. Rainfall is very low throughout the year, although occasional rain falls in late summer from the northern extension of the Intertropical Convergence Zone.

Table 1 Geographic information and data record period of Tamanrasset station

Fig. 1 Location of Tamanrasset station

Proposed models

In the current study, regression analysis is used to build the proposed models, where the predictand is the diffuse fraction (kd) or the diffusion coefficient (KD) and the predictors are the sunshine ratio (St) and the clearness index (Kt). Thus, forty models of three types can be defined for each of the diffuse fraction and the diffusion coefficient. The three types can be written as:

$$ \left( {k_{\text{d}} = \frac{{H_{\text{d}} }}{H},K_{\text{D}} = \frac{{H_{\text{d}} }}{{H_{0} }}} \right) \approx f(S_{t} ) $$
(1)
$$ \left( {k_{\text{d}} = \frac{{H_{\text{d}} }}{H},K_{\text{D}} = \frac{{H_{\text{d}} }}{{H_{0} }}} \right) \approx f(K_{t} ) $$
(2)
$$ \left( {k_{\text{d}} = \frac{{H_{\text{d}} }}{H},K_{\text{D}} = \frac{{H_{\text{d}} }}{{H_{0} }}} \right) \approx f(S_{t} ,K_{t} ) $$
(3)

where H0, H, and Hd are the monthly mean daily extraterrestrial solar radiation, global solar radiation and diffuse solar radiation on a horizontal surface, respectively. The clearness index and sunshine ratio are defined as

$$ K_{t} = \frac{H}{{H_{0} }} $$
(4)
$$ S_{t} = \frac{S}{{S_{0} }} $$
(5)

where S and S0 are the sunshine duration and the maximum possible sunshine duration, respectively.
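The four ratios defined above can be computed directly from the monthly mean daily values. The following is a minimal sketch; the function name and the numbers in the usage lines are illustrative assumptions, not station data. Note that the definitions imply KD = kd × Kt, which gives a quick consistency check.

```python
def diffuse_ratios(H, Hd, H0, S, S0):
    """Return (kd, KD, Kt, St) from monthly mean daily values.

    H, Hd and H0 must share one energy unit; S and S0 are in hours.
    """
    kd = Hd / H    # diffuse fraction
    KD = Hd / H0   # diffusion coefficient
    Kt = H / H0    # clearness index, Eq. (4)
    St = S / S0    # sunshine ratio, Eq. (5)
    return kd, KD, Kt, St


if __name__ == "__main__":
    # Illustrative values only (hypothetical, not measured data).
    kd, KD, Kt, St = diffuse_ratios(H=20.0, Hd=5.0, H0=40.0, S=9.0, S0=12.0)
    print(kd, KD, Kt, St)
```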

The monthly average daily extraterrestrial solar radiation on a horizontal surface is calculated from the following equation (Klein 1977):

$$ H_{0} = \frac{24}{\pi }H_{\text{sc}} \left( {1 + 0.033\cos \left( {\frac{360}{365}n} \right)} \right)\left( {\cos \varphi \cos \delta \sin \omega_{\text{s}} + \frac{\pi }{180}\omega_{\text{s}} \sin \varphi \sin \delta } \right) $$
(6)

where Hsc is the solar constant, n is the day number of the year, φ is the latitude of the location, δ is the declination angle and ωs is the sunset hour angle, with all angles in degrees. δ and ωs are defined as:

$$ \delta = 23.45\sin \left[ {360\frac{{\left( {n + 284} \right)}}{365}} \right] $$
(7)
$$ \omega_{\text{s}} = \cos^{ - 1} \left[ { - \tan \varphi \tan \delta } \right] $$
(8)

The maximum possible sunshine duration (\( S_{0} \)) is calculated as:

$$ S_{0} = \frac{2}{15}\omega_{\text{s}} $$
(9)
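Equations (6) to (9) can be evaluated as in the following sketch. The solar constant value of 1367 W/m2 is an assumption, since the text does not state the value used; with it, H0 comes out in Wh/m2/day. The function names are hypothetical, and the latitude in the usage lines is only an approximate value for Tamanrasset.

```python
import math

H_SC = 1367.0  # assumed solar constant in W/m2 (not stated in the text)


def declination_deg(n):
    """Declination angle in degrees for day number n, Eq. (7)."""
    return 23.45 * math.sin(math.radians(360.0 * (n + 284) / 365.0))


def sunset_hour_angle_deg(lat_deg, decl_deg):
    """Sunset hour angle in degrees, Eq. (8)."""
    x = -math.tan(math.radians(lat_deg)) * math.tan(math.radians(decl_deg))
    x = max(-1.0, min(1.0, x))  # guard against polar day/night and rounding
    return math.degrees(math.acos(x))


def max_sunshine_hours(ws_deg):
    """Maximum possible sunshine duration S0 in hours, Eq. (9)."""
    return 2.0 / 15.0 * ws_deg


def extraterrestrial_daily(n, lat_deg, h_sc=H_SC):
    """Daily extraterrestrial radiation on a horizontal surface, Eq. (6).

    With h_sc in W/m2 the result is in Wh/m2/day.
    """
    decl_deg = declination_deg(n)
    ws_deg = sunset_hour_angle_deg(lat_deg, decl_deg)
    phi = math.radians(lat_deg)
    delta = math.radians(decl_deg)
    ws = math.radians(ws_deg)
    eccentricity = 1.0 + 0.033 * math.cos(math.radians(360.0 * n / 365.0))
    geometry = (math.cos(phi) * math.cos(delta) * math.sin(ws)
                + math.pi / 180.0 * ws_deg * math.sin(phi) * math.sin(delta))
    return 24.0 / math.pi * h_sc * eccentricity * geometry


if __name__ == "__main__":
    lat = 22.8  # approximate latitude of Tamanrasset in degrees (assumption)
    n = 172     # a day close to the June solstice
    ws = sunset_hour_angle_deg(lat, declination_deg(n))
    print(max_sunshine_hours(ws), extraterrestrial_daily(n, lat))
```

For monthly means, a common choice is to evaluate these functions on one representative day per month; which days to use is left to the reader, as the text does not specify them.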

The forty proposed models for each of the diffuse fraction and the diffusion coefficient are presented in Table 2.

Table 2 (a) Various forms of correlations for diffuse fraction and (b) various forms of correlations for diffusion coefficient
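As an illustration of the regression step, the sketch below fits a polynomial in one predictor and a linear model in both predictors by least squares with NumPy. The cubic form in St and the linear form in St and Kt are assumed examples of the model types, not necessarily the exact entries of Table 2, and the data in the usage lines are synthetic.

```python
import numpy as np


def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit; coefficients are returned highest power first."""
    return np.polyfit(x, y, degree)


def fit_two_predictor_linear(st, kt, y):
    """Fit y = a + b*St + c*Kt by ordinary least squares; returns [a, b, c]."""
    st = np.asarray(st, dtype=float)
    kt = np.asarray(kt, dtype=float)
    design = np.column_stack([np.ones_like(st), st, kt])
    coef, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return coef


if __name__ == "__main__":
    # Synthetic data for demonstration only (not station measurements).
    rng = np.random.default_rng(0)
    st = rng.uniform(0.5, 0.95, 60)
    kt = rng.uniform(0.5, 0.8, 60)
    kd = 0.9 - 0.5 * st - 0.3 * kt + rng.normal(0.0, 0.01, 60)

    cubic = fit_polynomial(st, kd, 3)        # kd as a cubic in St
    kd_hat = np.polyval(cubic, st)           # estimates from the cubic model
    linear = fit_two_predictor_linear(st, kt, kd)
    print(cubic, linear)
```

The same two functions apply unchanged when KD replaces kd as the predictand.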

Statistical evaluation

In this study, nine statistical indicators were used to evaluate the proposed models: mean bias error (MBE), mean absolute error (MAE), mean absolute relative error (MARE), mean absolute percentage error (MAPE), root mean squared error (RMSE), root mean squared relative error (RMSRE), relative root mean squared error (RRMSE), coefficient of determination (R2) and t-statistic (t-stat). These indicators are defined as:

$$ {\text{MBE}} = \frac{1}{n}\sum\limits_{i = 1}^{n} {\left( {H_{i,o} - H_{i,m} } \right)} $$
(10)
$$ {\text{MAE}} = \frac{1}{n}\sum\limits_{i = 1}^{n} {\left| {H_{i,m} - H_{i,o} } \right|} $$
(11)
$$ {\text{MARE}} = \frac{1}{n}\sum\limits_{i = 1}^{n} {\left| {\frac{{H_{i,m} - H_{i,o} }}{{H_{i,m} }}} \right|} $$
(12)
$$ {\text{MAPE}} = \frac{1}{n}\sum\limits_{i = 1}^{n} {\left| {\frac{{H_{i,m} - H_{i,o} }}{{H_{i,o} }}} \right|} *100 $$
(13)
$$ {\text{RMSE}} = \sqrt {\frac{1}{n}\sum\limits_{i = 1}^{n} {\left( {H_{i,m} - H_{i,o} } \right)^{2} } } $$
(14)
$$ {\text{RMSRE}} = \sqrt {\frac{1}{n}\sum\limits_{i = 1}^{n} {\left( {\frac{{H_{i,m} - H_{i,o} }}{{H_{i,m} }}} \right)^{2} } } $$
(15)
$$ {\text{RRMSE}} = \frac{{\sqrt {\frac{1}{n}\sum\nolimits_{i = 1}^{n} {\left( {H_{i,m} - H_{i,o} } \right)^{2} } } }}{{\frac{1}{n}\sum\nolimits_{i = 1}^{n} {H_{i,m} } }}*100 $$
(16)
$$ R^{2} = 1 - \frac{{\sum\nolimits_{i = 1}^{n} {\left( {H_{i,m} - H_{i,o} } \right)^{2} } }}{{\sum\nolimits_{i = 1}^{n} {\left( {H_{i,m} - \overline{{H_{m} }} } \right)^{2} } }} $$
(17)
$$ t{\text{ - stat}} = \sqrt {\frac{{(n - 1){\text{MBE}}^{2} }}{{{\text{RMSE}}^{2} - {\text{MBE}}^{2} }}} $$
(18)

where n is the number of data points, Hi,m is the ith estimated value, Hi,o is the ith observed value and \( \overline{{H_{m} }} \) is the mean of the estimated values.
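The nine indicators can be computed together as in the following sketch, which follows the subscript convention above (m for estimated, o for observed). RRMSE is computed as RMSE divided by the mean and R2 with squared deviations in the denominator, which are the usual definitions. The function name and the returned dictionary layout are assumptions.

```python
import numpy as np


def indicators(est, obs):
    """Return the nine statistical indicators for estimated and observed values."""
    est = np.asarray(est, dtype=float)
    obs = np.asarray(obs, dtype=float)
    n = est.size
    err = est - obs

    mbe = np.mean(obs - est)                    # Eq. (10), sign as written
    mae = np.mean(np.abs(err))                  # Eq. (11)
    mare = np.mean(np.abs(err / est))           # Eq. (12)
    mape = np.mean(np.abs(err / obs)) * 100.0   # Eq. (13), in percent
    rmse = np.sqrt(np.mean(err**2))             # Eq. (14)
    rmsre = np.sqrt(np.mean((err / est)**2))    # Eq. (15)
    rrmse = rmse / np.mean(est) * 100.0         # Eq. (16), in percent
    r2 = 1.0 - np.sum(err**2) / np.sum((est - est.mean())**2)  # Eq. (17)
    # Eq. (18); undefined when every error is identical (RMSE^2 == MBE^2).
    t_stat = np.sqrt((n - 1) * mbe**2 / (rmse**2 - mbe**2))

    return {"MBE": mbe, "MAE": mae, "MARE": mare, "MAPE": mape,
            "RMSE": rmse, "RMSRE": rmsre, "RRMSE": rrmse,
            "R2": r2, "t-stat": t_stat}


if __name__ == "__main__":
    # Illustrative values only (hypothetical, not measured data).
    result = indicators([0.30, 0.25, 0.40, 0.35], [0.31, 0.24, 0.42, 0.33])
    for name, value in result.items():
        print(name, value)
```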

Results and discussion

Descriptive statistics of the solar radiation data, including the minimum, maximum, mean, standard deviation and coefficient of variation, are shown in Table 3. The mean values of global and diffuse solar radiation are 2320 and 704.2 MJ/m2 day, with standard deviations of 394.4 and 282.35 MJ/m2 day, respectively. The coefficients of variation of H and Hd are 17% and 40%, respectively. The diffuse fraction and diffusion coefficient ranged from 0.105 to 0.577 and from 0.030 to 0.121, with means of 0.298 and 0.074, respectively. High values of kd and KD are observed from April to September (Fig. 2).

Fig. 2 Box plot of diffuse fraction (top left), diffusion coefficient (top right), sunshine ratio (bottom left) and clearness index (bottom right)

Table 3 Descriptive statistics of diffuse solar radiation

The values of sunshine ratio and clearness index varied from 0.205 to 0.298 and from 7.105 to 16.401, with means of 0.251 and 12.826, respectively; high values of both predictors (St and Kt) are observed from January to December (Fig. 2). Figure 3 shows a significant negative correlation between kd and St (−0.859), kd and Kt (−0.722), KD and St (−0.825), and KD and Kt (−0.577).

Fig. 3 Scatter plots of diffuse fraction (kd) and diffusion coefficient (KD) against sunshine ratio (top) and clearness index (bottom)

The statistical indicators of the different models are given in Table 4. For higher modeling accuracy, the MBE, MAE, MARE, MAPE, RMSE, RMSRE, RRMSE and t-stat indicators should be as close to zero as possible, whereas the coefficient of determination (R2) should be as close to 1 as possible. The indicators show that the estimated values of kd and KD are in good agreement with the measured values for most models. The MBE values range from −1.68e−17 to 9.84e−18 for kd and from −1.81e−18 to 1.06e−18 for KD. These values are negligibly small and take both signs, which indicates that the proposed models carry essentially no systematic over- or underestimation. The t-stat values obtained for all proposed models are well below the critical value. In terms of MBE and t-stat, models M40 for kd and M65 for KD show the lowest errors. However, in terms of MAE, MARE, MAPE, RMSE, RMSRE and RRMSE, the cubic models (M23 for kd and M63 for KD) are the most accurate, since they give the lowest values of all six indicators. R2 is also highest for M23 (kd) and M63 (KD) among all the proposed models, indicating that the cubic equations give the closest agreement between estimated and observed values.

Table 4 Results of statistical indicators for all proposed models

The statistical indicators show that the diffuse fraction and diffusion coefficient values estimated by the different proposed models are close to each other. Since no single model is favored by all the statistical indicators, a combined indicator that allows a comparative ranking of the proposed models is needed. We therefore used the Global Performance Indicator (GPI), which combines all the statistical indicators into a single score (Said and Dickey 1983; Despotovic et al. 2015; Jamil and Abid 2018):

$$ {\text{GPI}}_{i} = \sum\limits_{j = 1}^{9} {\alpha_{j} } \left( {y_{j} - y_{ij} } \right) $$
(19)

where αj is the weight factor of indicator j, equal to 1 for all indicators except the coefficient of determination (R2), for which it equals −1; yj is the median of the scaled values of indicator j, and yij is the scaled value of indicator j for model i. A higher GPI indicates a more accurate model.
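Equation (19) can be sketched as follows. The text does not specify how the indicators are scaled, so min-max scaling of each indicator to the range 0 to 1 across models is an assumption here; the function name and the `r2_column` argument are also hypothetical. Whether signed indicators such as MBE enter as absolute values is left to the caller.

```python
import numpy as np


def gpi(scores, r2_column):
    """Global Performance Indicator for each model, Eq. (19).

    scores    : array of shape (n_models, n_indicators)
    r2_column : index of the R2 column, which receives weight -1
    """
    scores = np.asarray(scores, dtype=float)
    lo = scores.min(axis=0)
    hi = scores.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # avoid division by zero
    scaled = (scores - lo) / span            # assumed min-max scaling
    median = np.median(scaled, axis=0)       # y_j in Eq. (19)
    alpha = np.ones(scores.shape[1])
    alpha[r2_column] = -1.0
    return ((median - scaled) * alpha).sum(axis=1)


if __name__ == "__main__":
    # Three hypothetical models scored on RMSE (column 0) and R2 (column 1).
    table = [[0.10, 0.90],
             [0.20, 0.80],
             [0.30, 0.70]]
    values = gpi(table, r2_column=1)
    ranking = np.argsort(-values)            # best model first
    print(values, ranking)
```

With this sign convention, a model with low errors and a high R2 receives a positive contribution from every term, so sorting by descending GPI gives the ranking.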

The GPI values and the ranking of the models are shown in Table 5. The GPI of the proposed models ranged from −4.803 to 0.309 for kd and from −5.70 to 0.23 for KD. The highest GPI identifies the best performing model. From Table 5, 33% and 43% of the models attain a positive GPI for kd and KD, respectively, and the remaining models have a negative GPI. The maximum GPI is observed for model 23 (GPI = 0.309) among the kd models and for model 63 (GPI = 0.230) among the KD models. Thus, it can be inferred that the cubic model best estimates the diffuse fraction and diffusion coefficient for the study area (Figs. 4, 5).

Table 5 The global performance indicator (GPI) and ranking for all proposed models

Fig. 4 Diagnostic plots of the best model of kd (M23)

Fig. 5 Diagnostic plots of the best model of KD (M63)

Conclusion

In this study, solar radiation data were used to evaluate the diffuse fraction and diffusion coefficient at Tamanrasset station, Algeria, using the sunshine ratio and clearness index as predictors. The results show that high values of kd and KD are observed from April to September, whereas those of St and Kt are observed from January to December. Significant negative correlations were found between kd and St, kd and Kt, KD and St, and KD and Kt. Forty models were proposed to estimate each of the diffuse fraction and the diffusion coefficient. Based on the values of the statistical indicators and the GPI, the best models for the diffuse fraction and diffusion coefficient are models 23 and 63, respectively.