Abstract

This study aimed to evaluate the water quality and pollution sources in Sapanca Lake and its tributaries by applying multivariate statistical techniques to physicochemical parameters and toxic metals. For this purpose, principal component analysis (PCA) and the absolute principal component score-multiple linear regression (APCS-MLR) model were employed. The seasonal pollution sources of physicochemical parameters and toxic metals were investigated using data from 22 sampling points collected between 2015 and 2017. PCA was applied to the seasonal datasets, and six varimax factors explaining 84%, 80%, 76%, and 79% of the total variance in winter, spring, summer, and autumn, respectively, were extracted. The extracted factors were analyzed using the APCS-MLR model to apportion the pollution sources affecting physicochemical parameters and toxic metals. The results show that the natural soil structure, municipal-industrial wastewater, agricultural-atmospheric runoff, highways, and seasonal effects are the major pollution sources for toxic metals and physicochemical parameters. The contribution of each pollutant source to toxic metals and physicochemical parameters was calculated and verified against the measured concentrations. Consequently, multivariate statistical techniques are useful for characterizing the interrelationships among physicochemical parameters and toxic metals and for assessing the seasonal impact of pollutant sources in the basin. This study also provides a basis for designing measurement programs, determining pollution sources, and supporting sustainable watershed management for other water resources.

1. Introduction

Heavy metals can be classified as toxic elements (Cd, Co, Cr, Ni, Zn, etc.) and metalloids (As, Se, etc.). They are very dangerous because they are toxic, persistent, and bioaccumulative in living organisms under long-term exposure, even at low concentrations [1, 2]. Heavy metals accumulate in living organisms at increasing concentrations and have a great impact on the ecosystem. The main natural sources of heavy metals in aquatic environments are the geological erosion of minerals and soil leaching, while the anthropogenic sources arise from traffic, industry, and agriculture [3, 4]. Heavy metal pollution of aquatic ecosystems is increasing because of agricultural runoff and industrial and domestic sewage discharge.

Lakes and rivers have always played an important role in supplying fresh water for human beings. Among surface waters, lakes are the most sensitive to pollution. The quality of water sources is essential for the health of human beings, animals, and plants [5]. Monitoring and evaluating water quality is therefore of the highest priority in water resource management. Many studies have been published on the evaluation of heavy metal pollution in different water sources in Turkey and throughout the world [6–9].

Different monitoring programs and statistical analysis techniques are used to determine pollution sources and assess the quality of water resources. Multivariate statistical techniques such as cluster analysis (CA), principal component analysis (PCA), factor analysis (FA), and the absolute principal component score-multiple linear regression (APCS-MLR) model have been widely accepted as efficient tools for analyzing and interpreting data. For complex data matrices, these methods are used (1) to better understand the water quality and ecological status of aquatic systems and (2) to determine the changes caused by natural and anthropogenic factors depending on seasonality [10–12]. In particular, the APCS-MLR model can offer quantitative information about the contribution of each pollution source type to each water quality variable [13]. According to previous studies, statistical analysis techniques offer effective and reliable methods for the management of water resources and the assessment of water quality [12, 14–16].

In this study, the water quality in Sapanca Lake and its tributaries was interpreted in terms of the pollution sources of physicochemical parameters and toxic metals (Al, As, Ba, Cd, Co, Cr, Cu, Fe, Ni, Pb, and Zn) using PCA and the APCS-MLR model. The contribution of each pollution source to each toxic metal was calculated and verified against the measured toxic metal concentrations. The main pollution sources for toxic metals in the Sapanca Lake Basin were determined.

2. Materials and Methods

2.1. Study Area

The Sapanca Basin is located between 40°41′N and 40°44′N and between 30°09′E and 30°20′E in Sakarya, in the Marmara Region of Turkey. Sapanca Lake plays an important role in the Marmara Region because it supplies freshwater for human consumption, agricultural needs, and industrial and recreational purposes in Sakarya Province and Izmit Province. In addition, the E-80 TEM Anatolian (Ankara–Istanbul) highway, which runs alongside Sapanca Lake, is one of Turkey’s busiest motorways. The lake is a partially deep ecosystem in a tectonic depression located between Izmit Bay and the Adapazarı Plain. The surface area of the lake is 45 km², its mean depth is 28.5 m, its maximum depth is 60 m, its length is 16 km, and its width is 5.5 km.

2.2. Sample Collection

Water samples were collected from 22 monitoring points between the years of 2015 and 2016 in Sapanca Lake. The monitoring points were identified to represent the entire lake basin (Figure 1). Water samples were obtained from the lake and streams, respectively, based on TS 6291 January 1989 “Water Quality—Sampling Part 4: Sampling Rules from Lakes and Ponds” and TS EN ISO 5667-6 March 2008 “Water Quality—Sampling—Part 6: Guide to Sampling from Rivers and Streams” [17, 18].

2.3. Analysis and Quality Control

Water samples were analyzed according to Standard Method 3030E; samples collected in PE storage containers were kept refrigerated at 4°C until analysis by ICP-MS (inductively coupled plasma-mass spectrometry). All plastic and glass materials used in the sampling process were soaked in a 10% v/v HNO3 solution overnight and then rinsed with distilled water. All acids and reagents used for analysis were of analytical grade and purchased from Merck & Co., Inc., Turkey. MERCK Suprapur® HNO3 and deionized water were used for the preparation and preservation of the solutions. To ensure the quality of the analysis, laboratory quality assurance and control methods (reagent blanks, calibration with standards, and analysis of replicates) were applied. All analyses were carried out in triplicate, and the results were expressed as means. Quality assurance was further checked with a standard reference material (ERM-CA011 Hard Drinking Water-Metals), and a recovery above 95% was obtained for each heavy metal. Therefore, the average value of each water sample was used for further interpretation. Heavy metals (Al, As, Ba, Cd, Co, Cr, Cu, Fe, Ni, Pb, and Zn) were determined by ICP-MS. Water temperature (T), electrical conductivity (EC), pH, total dissolved solids (TDS), salinity (SAL), and dissolved oxygen (DO) of the lake water were measured in situ using a YSI Professional Plus field meter. Suspended solids (SS) and total organic carbon (TOC) analyses were performed according to Standard Method 2540D and Standard Method 5310B, respectively. Basic statistics of the analysis results are given in Table 1. Standard reference materials were used for calibration and quality assurance for each sampling and analysis. The same laboratory equipment was used for all samples to minimize irregularities due to sampling and measurement changes.
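The triplicate averaging and recovery check described above can be illustrated with a minimal sketch. The function name and all numerical values below are hypothetical and chosen only for illustration; they are not measurements from this study, and the authors' actual calculation procedure is not specified in the text.

```python
from statistics import mean

def percent_recovery(replicates, certified_value):
    """Return the mean of replicate readings and the percent recovery
    relative to the certified value of a reference material."""
    measured = mean(replicates)
    return measured, 100.0 * measured / certified_value

# Hypothetical triplicate readings and certified value (ug/L), illustration only.
replicates_ug_per_l = [4.9, 5.0, 4.8]
certified_ug_per_l = 5.0

measured, recovery = percent_recovery(replicates_ug_per_l, certified_ug_per_l)
print(f"mean = {measured:.2f} ug/L, recovery = {recovery:.1f}%")
```

Under the acceptance criterion mentioned in the text, a result would be retained when the recovery exceeds 95%.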

2.4. Multivariate Statistical Methods

Principal component analysis (PCA) is a factor analysis (FA) method that reduces the dimensionality of a dataset while retaining as much of the variability among the variables as possible [19]. In PCA, this variability is captured by extracting principal components. PCA begins with a covariance matrix, which describes the dispersion of the original variables, and extracts its eigenvalues and eigenvectors. An eigenvector is a list of coefficients that multiply the original correlated variables to obtain new uncorrelated variables, the principal components, which are weighted linear combinations of the original variables [20]. PCA was applied to the dataset to compare the compositional patterns between the examined water systems and to identify the factors that influence them [21].
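The steps outlined above (standardization, eigendecomposition, and varimax rotation of the retained components) can be sketched as follows. This is a minimal illustration with NumPy, assuming z-scored variables so that the covariance matrix equals the correlation matrix. The function names and the synthetic data are assumptions introduced here, and the statistical software actually used by the authors is not stated in the text.

```python
import numpy as np

def pca_loadings(X, n_components):
    """PCA on standardized data (rows = samples, columns = variables).
    Returns z-scores, component loadings, and percent variance explained."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # z-score each variable
    R = np.corrcoef(Z, rowvar=False)                    # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)                # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]    # keep the largest ones
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs * np.sqrt(eigvals)               # variable-component correlations
    explained = 100.0 * eigvals / R.shape[0]            # percent of total variance
    return Z, loadings, explained

def varimax(L, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a loading matrix (variables x components)."""
    p, k = L.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        Lr = L @ rotation
        col_ss = (Lr ** 2).sum(axis=0)                  # column sums of squares
        U, s, Vt = np.linalg.svd(L.T @ (Lr ** 3 - Lr @ np.diag(col_ss) / p))
        rotation = U @ Vt
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):       # no further improvement
            break
        criterion = new_criterion
    return L @ rotation

# Synthetic stand-in data: 22 samples x 6 variables (not the study's dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(22, 6))
X[:, 1] += X[:, 0]                                      # induce some correlation

Z, L, explained = pca_loadings(X, n_components=3)
L_rot = varimax(L)
print(np.round(explained, 1))
```

Because the rotation is orthogonal, the communality of each variable (the row sum of squared loadings) is unchanged; only the distribution of the variance among the retained components changes, which is what makes the rotated factors easier to interpret.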

APCS-MLR (absolute principal component score-multiple linear regression) is a model of multiple linear regression based on the factor scores derived from principal component analysis. It has also been widely used in recent years in order to identify sources that cause water pollution [14, 21]. The APCS-MLR model equation is shown as

The regression is used to determine the linear contribution of each pollutant source j to the concentration of each pollutant parameter at the examined measuring point. Zjk is the normalized concentration of the variables; the remaining terms are the number of pollutant sources, the factor loadings, the factor scores, and the contribution of source component j in observation k. The normalized concentrations obtained with the above equation are transformed into absolute, nonnormalized component scores [22].
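For reference, the APCS-MLR model is commonly written in the literature in the form below. The symbols are assumptions introduced here for illustration and may not match the authors' notation: C_j is the measured concentration of variable j, \bar{C}_j and \sigma_j are its mean and standard deviation, S_p is the factor score of source p, b_{pj} is the regression coefficient of source p for variable j, and b_{0j} is the intercept, usually interpreted as the contribution of unidentified sources.

```latex
% Normalized concentration of variable j in observation k
Z_{jk} = \frac{C_{jk} - \bar{C}_j}{\sigma_j}

% Artificial sample with zero concentration for every variable
(Z_0)_j = \frac{0 - \bar{C}_j}{\sigma_j}

% Absolute principal component score of source p in observation k
\mathrm{APCS}_{pk} = S_{pk} - (S_0)_p

% Multiple linear regression of each concentration on the APCS
C_{jk} = b_{0j} + \sum_{p=1}^{n} b_{pj}\,\mathrm{APCS}_{pk}
```

In this formulation, the mean contribution of source p to variable j is b_{pj} multiplied by the mean of APCS_p over all observations.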

3. Results and Discussion

3.1. Principal Component Analysis

The physicochemical parameters and toxic metal results obtained from Sapanca Lake and its tributaries between 2015 and 2017 were statistically evaluated. To determine the seasonal behavior of the physicochemical parameters and their interrelationships with toxic metals, the dataset was split into four datasets: winter, spring, summer, and autumn. PCA/FA was applied to each seasonal dataset, and each was evaluated separately.

In the analysis for winter, 19 parameters formed six PCs and explained 84.087% of the total variance (Table 2). The first component (PC1), which explained the highest variance of 27.838%, had strong positive loadings on Cd, Co, Cu, and Pb, moderate loadings on Cr, and negative loadings on pH. The second component (PC2), which accounted for 20.824% of total variance, had strong positive loadings on TDS, SAL, EC, and Ba. The third component (PC3) constituted 13.585% of total variance and had strong positive loadings on Al, Fe, and Ni. The fourth (PC4), fifth (PC5), and sixth (PC6) components explained 8.598%, 6.990%, and 6.252% of total variance, respectively, and had moderate loadings on T, TOC, and As and strong loadings on Zn and SS. In the spring season, six PCs explained 80.388% of the total variance (Table 2). PC1 constituted 27.838% of total variance and had strong positive loadings on TDS, SAL, and EC. PC2, which explained 18.463% of total variance, had strong positive loadings on Zn, Cu, SS, and Ba. PC3 accounted for 13.464% of total variance and had positive loadings on pH and Cd and negative loadings on Co. PC4, PC5, and PC6 constituted 11.080%, 9.586%, and 8.376% of total variance, respectively, with moderate loadings on Cr and Ni (PC4); T, Al, Fe, and Pb (PC5); and As and TOC (PC6).

In the summer season, 19 parameters formed six PCs and explained 76.048% of the total variance (Table 2). PC1 accounted for 17.892% of total variance and had positive loadings on Zn, Ni, Cu, and Cr. PC2 constituted 13.807% of total variance and had positive loadings on T and negative loadings on Cd and Pb. PC3 formed 13.373% of total variance and had loadings on Ba, SAL, and EC. PC4, PC5, and PC6 accounted for 12.946%, 9.284%, and 8.347% of total variance, respectively. PC4 had positive loadings on pH and As and negative loadings on SS. PC5 had positive loadings on Al and Co, and PC6 had positive loadings on TDS, TOC, and Fe. In the autumn season, six PCs explained 79.956% of the total variance (Table 2). PC1, which constituted the highest variance, had strong positive loadings on TDS, EC, and SAL, moderate loadings on Ba, and negative loadings on pH and As. PC2 accounted for 12.901% of total variance and had positive loadings on T. PC3 formed 12.548% of total variance and had positive loadings on SS, Al, Co, and Zn. PC4, PC5, and PC6 explained 12.546%, 10.202%, and 9.916% of total variance, respectively. PC4 had strong positive loadings on Cd and Pb. PC5 had positive loadings on Cu, TOC, and Fe, and PC6 had positive loadings on Ni and Cr.
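The relation between the per-component percentages and the cumulative variance can be illustrated with the winter figures quoted in this section. The short sketch below only accumulates the six reported values; the variable names are introduced here for illustration.

```python
# Percent of total variance explained by PC1..PC6 in winter, as quoted in the text.
winter_pc_variance = [27.838, 20.824, 13.585, 8.598, 6.990, 6.252]

cumulative = []
running_total = 0.0
for share in winter_pc_variance:
    running_total += share
    cumulative.append(round(running_total, 3))   # cumulative percent after each PC

print(cumulative)
```

The last cumulative value corresponds to the 84.087% of total variance reported for the six winter components.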

3.2. Source Apportionment

APCS-MLR was used to determine the linear contribution of each pollution source component to the concentration of each contaminant measured. APCSs were obtained using the factor scores derived from the PCA/FA. The MLR model was then fitted to the concentrations of the variables using these scores as predictors. Pollutant sources and the contribution of each pollutant source to each variable were determined according to the coefficients at the 95% confidence level. The tables indicating the contribution of each pollutant source to each parameter and the figures showing the percentage contributions of the sources are given below separately for winter, spring, summer, and autumn.
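The procedure described above can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the authors' implementation: it uses unrotated principal component scores rather than varimax-rotated factor scores, it omits the significance screening of coefficients, and the function name and synthetic concentrations are hypothetical.

```python
import numpy as np

def apcs_mlr(C, n_sources):
    """Simplified APCS-MLR on a concentration matrix C (samples x variables).
    Returns the regression intercepts (unidentified sources) and the mean
    contribution of each source to each variable, in concentration units."""
    mean, std = C.mean(axis=0), C.std(axis=0, ddof=1)
    Z = (C - mean) / std                                  # normalized concentrations
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    top = np.argsort(eigvals)[::-1][:n_sources]
    W = eigvecs[:, top] / np.sqrt(eigvals[top])           # score coefficients
    scores = Z @ W                                        # standardized component scores
    z0 = (0.0 - mean) / std                               # artificial zero-concentration sample
    apcs = scores - z0 @ W                                # absolute principal component scores
    A = np.column_stack([np.ones(len(C)), apcs])          # design matrix with intercept
    coef, *_ = np.linalg.lstsq(A, C, rcond=None)          # one regression per variable
    intercept = coef[0]
    source_mean = coef[1:] * apcs.mean(axis=0)[:, None]   # shape: sources x variables
    return intercept, source_mean

# Synthetic positive "concentrations": 40 samples x 5 variables (illustration only).
rng = np.random.default_rng(42)
C = rng.gamma(shape=2.0, scale=1.0, size=(40, 5))

intercept, source_mean = apcs_mlr(C, n_sources=3)
percent = 100.0 * source_mean / C.mean(axis=0)            # percent contribution per source
print(np.round(percent, 1))
```

Because each regression includes an intercept, the intercept plus the summed mean source contributions reproduces the mean measured concentration of every variable, which is the basis for comparing calculated and measured values. Individual contributions can be negative in such a model, and how negative values are handled is a separate modeling choice.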

The possible sources affecting toxic metals and physical parameters in winter are given in Table 3 and Figure 2. APCS-MLR modeling showed that most of the parameters are affected by sources identified as lithogenic, industrial, domestic, seasonal, traffic, and agricultural. Unidentified sources of toxic metals and physical parameters affecting lake water quality were defined as “other sources.” Most metals are likely to enter the water bound to suspended solids during rainy periods, which means that metals were transported in the winter months as the surface flow within the basin increased [4, 23, 24].

S1. Industrial sources contributed heavy metals through anthropogenic activities in the Sapanca Lake Basin. This source stems from facilities operating in the region and from activities such as fuel (or coal) use, galvanizing, lead paint use, and metal production [25]. Anthropogenic activities are recognized as a characteristic source of toxic metals in water, sediment, and soil. Toxic metals such as Cd, Co, Cr, Cu, Ni, and Pb are thought to originate from metal-processing and spare part-manufacturing companies. In addition, industrial enterprises operating in the region consume large amounts of fuel [14]. It is thought that the effluents of industrial activities also affect the water temperature, especially in the winter months.

S2. Lithogenic sources reflect the mineral structure of the region’s soil. Ba, Fe, TDS, SAL, and EC come from the soil structure during winter. These metals are usually found in the components of the earth’s crust. In the winter months, their values in the lake increase due to precipitation and infiltration [26].

S3. Traffic and lithogenic sources are explained by the presence of the highways and the mineral structure of the region. The TEM Anatolian highway (E-80) and the railway pass south of Lake Sapanca, and the D100 (E-5) passes to the north. The E-5 road and the TEM Anatolian highway are the most important pollutant sources of Ba and Ni, which derive from the combustion of fuel oil and from brake linings on these roads [27, 28]. The others (Al and Fe) stem from the soil structure [29].

S4. Domestic and industrial pollution sources are identified mainly by TOC. TOC provides specific information on the types and sources of organic loads in the water. TOC, COD (chemical oxygen demand), and BOD5 (biological oxygen demand), together or individually, are important sum parameters for assessing the organic load of water [30]. This source is influenced by domestic and industrial organic pollution in the surrounding area.

S5. Agricultural activities such as the use of fertilizers are generally the major source of As, Cr, Ni, and Zn. The long-term use of fertilizers has caused serious nonpoint pollution through irrigation return flow and precipitation in the water body at the edge of agricultural land [31]. In particular, Zn moves from the soil to the water body as a result of agricultural runoff in winter.

S6. Since suspended solids and dissolved and particulate metals enter the lake with the surface runoff caused by increased precipitation in the winter months, S6 can be defined as a seasonal source.

Possible sources affecting toxic metals and physical parameters in spring are shown in Table 4 and Figure 3.

S1. The lithogenic source reflects the mineral structure of the soil in the lake basin. TDS and EC result from soil material transported by surface runoff.

S2. Traffic sources are explained by roadside dust and the brake linings of vehicles. Pollutants such as Ba and Zn that originate from the roads and accumulate in the atmosphere can reach the lake with runoff, and they can also be transported with SS.

S3. Industrial and agricultural sources are characterized by high loads of Pb, Cd, Co, Ni, and Zn, and especially by pH. These metals are mainly due to industrial activities such as metal processing and paint production and to agricultural activities, including the use of pesticides [25, 32].

S4. Cr is generally caused by industrial activities such as metal and chemical production [33].

S5. Lithogenic and traffic sources stem from the mineral structure of the region’s soil and from vehicular activity, leading to high concentrations of Al and Fe and of Pb and Zn, respectively.

S6. Domestic sources are explained by TOC in spring because of the residential areas in the region.

Possible pollution sources of toxic metals and physical parameters in summer are shown in Table 5 and Figure 4. Since rainfall is low in summer, point sources are generally dominant. Nonpoint pollution in the summer months usually arises from irrigation return flow.

S1. Industrial sources are explained by Cd, Co, Cr, Cu, Ni, Pb, and Zn, which result from anthropogenic activities in the Sapanca Lake Basin, especially in summer.

S2. The seasonal source is caused by water temperature.

S3. Traffic sources are explained by Ba and EC, which are due to roadside dust.

S4. Domestic and agricultural sources are characterized by TOC, As, and SS. They originate from irrigation return flow and wastewater discharge.

S5. The lithogenic source is generally caused by the soil structure and includes Al.

S6. Lithogenic and traffic sources are due to soil and the fuel consumption of vehicles.

Possible pollution sources of toxic metals and physical parameters in autumn are shown in Table 6 and Figure 5.

S1. Traffic and lithogenic sources account for As, Ba, EC, TDS, SAL, and pH in the lake water. As and Ba mostly depend on the fuel consumption of vehicles. In autumn, As and Ba are caused by crustal components and by traffic sources including fuel consumption, respectively. EC, TDS, and SAL reflect the mineral structure of the lake water.

S2. The seasonal source is caused by water temperature.

S3. Lithogenic and agricultural sources are generally explained by the soil structure, which includes Al, and by pesticides, which include Zn and Co.

S4. Traffic sources are explained by Pb, As, Cd, and Ba because of the presence of vehicles.

S5. The lithogenic source, characterized by Al and TOC, is due to the soil structure.

S6. Industrial and agricultural sources are identified by Cr, Ni, and Zn, as has been reported by Song et al. [11, 34].

Ba was identified as a parameter that significantly affects the water quality in the basin. Ba was also found to be highly correlated with TDS, EC, SS, and SAL. Therefore, it can be stated that Ba originates from the traffic load on both sides of Lake Sapanca and reaches the lake with the surface flow. The relationship of the other toxic metals with the physicochemical parameters was found to be weak. In addition, all toxic metals are negatively correlated with pH. The negative and positive correlations arise because the metals reach surface waters from point or nonpoint pollutant sources depending on the season.
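Correlations of this kind are typically quantified with a Pearson correlation matrix. The sketch below shows the calculation with NumPy on synthetic values that are constructed so that Ba co-varies with TDS; the numbers are hypothetical, are not the study's data, and do not reproduce its coefficients.

```python
import numpy as np

# Synthetic illustration: 30 hypothetical samples of TDS, Ba, and pH.
rng = np.random.default_rng(1)
tds = rng.normal(200.0, 20.0, size=30)
ba = 0.1 * tds + rng.normal(0.0, 1.0, size=30)   # built to correlate with TDS
ph = rng.normal(7.5, 0.3, size=30)               # independent of the other two

labels = ["Ba", "TDS", "pH"]
r = np.corrcoef(np.vstack([ba, tds, ph]))        # rows are variables

for name, row in zip(labels, r):
    print(name, np.round(row, 2))
```

Each off-diagonal entry is the Pearson coefficient between two parameters; values near +1 or -1 indicate strong positive or negative linear association, and values near 0 indicate a weak one.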

4. Conclusion and Management Approach

In this study, multivariate statistical methods were used to interpret the correlations among physicochemical parameters and toxic metals in the Sapanca Lake Basin and the contribution of pollution sources to them. The PCA/FA method was used to identify the factors responsible for toxic metal pollution and physicochemical effects in the basin. The APCS-MLR model made it possible to identify the pollution sources contributing to the toxic metal pollution of the lake. The main sources affecting toxic metals and physicochemical parameters in the lake water were determined to be lithogenic sources, agricultural activities, industrial and domestic wastewater, the traffic load around the lake, and seasonal change. The contribution of each source to each toxic metal and physicochemical parameter was calculated and verified against the measured values. Because the dataset was interpreted seasonally, the seasonal effects of point and nonpoint loads were also revealed more clearly.

Seasonal determination and evaluation of pollution sources are very important for the management of lakes. Seasonal approaches to pollutants indicate whether the pollution sources are point and/or nonpoint sources. Moreover, these approaches help the responsible institutions to control pollution sources and intervene at the source of pollutants. This study determined that toxic metal contamination from nonpoint pollution sources is more common in winter and spring, depending on precipitation. Ba, which is conveyed to the lake by the surface flow, reflects the traffic load on both sides of Lake Sapanca during these seasons. The toxic metals contained in soil (Fe and Al) and in pesticides (Zn and Co) flow into the lake via surface runoff. It was also found that nonpoint pollution during summer and autumn stemmed from agricultural runoff via irrigation return flow. In particular, this study showed that responsible institutions and municipalities should focus on traffic and agricultural pollutants for the sustainable management of sources and the protection of Sapanca Lake. The excessive use of pesticides can be prevented by including farmer training programs in the local government action plans for the lake. To minimize the toxic metal effect caused by the traffic load, surface flows can be controlled by building drainage channels along the roads bordering the lake.

This study demonstrates that multivariate statistical techniques can be used to interpret large datasets, determine the parameters responsible for contamination, and evaluate basin-based pollutants. In lakes that are used for drinking water supply, such as Sapanca Lake, controlling pollution at the source, protecting the existing water quality, and safely passing the resource on to future generations are important within the scope of sustainable water management.

Data Availability

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the Scientific and Technological Research Council of Turkey (TUBITAK-3001 project) under project no. 115Y357, and the authors thank http://www.acedemicproofreading.com for proofreading.