Evaluation of Regression Analysis and Neural Networks to Predict Total Suspended Solids in Water Bodies from Unmanned Aerial Vehicle Images

Sustainability
Article

Tainá T. Guimarães, Maurício R. Veronez *, Emilie C. Koste, Eniuce M. Souza, Diego Brum, Luiz Gonzaga Jr. and Frederico F. Mauad

1 Graduate Programme in Environmental Engineering Sciences, São Carlos Engineering School, University of São Paulo, São Carlos 13566-590, Brazil; [email protected] (T.T.G.); [email protected] (F.F.M.)
2 Advanced Visualization & Geoinformatics Lab (VizLab), Unisinos University, São Leopoldo 93022-750, Brazil; [email protected] (E.C.K.); [email protected] (E.M.S.); [email protected] (D.B.); [email protected] (L.G.J.)
3 Graduate Programme in Applied Computing, Unisinos University, São Leopoldo 93022-750, Brazil
4 Graduate Programme in Biology, Unisinos University, São Leopoldo 93022-750, Brazil
* Correspondence: [email protected]; Tel.: +55-51-3591-1100 (ext. 1619)

Received: 14 March 2019; Accepted: 22 April 2019; Published: 5 May 2019

Abstract: The concentration of suspended solids in water is one of the quality parameters that can be recovered using remote sensing data. This paper investigates the data obtained using a sensor coupled to an unmanned aerial vehicle (UAV) in order to estimate the concentration of suspended solids in a lake in southern Brazil based on the relation of spectral images and limnological data. The water samples underwent laboratory analysis to determine the concentration of total suspended solids (TSS). The images obtained using the UAV were orthorectified and georeferenced so that the values of the near-infrared, green, and blue channels could be collected at each sampling point and related to the laboratory data. The prediction of the TSS concentration was performed using regression analysis and artificial neural networks. The obtained results were important for two main reasons.
First, although regression methods have been used in remote sensing applications, they may not be adequate to capture the linear and/or non-linear relationships of interest. Second, results show that the integration of UAV in the mapping of water bodies together with the application of neural networks in the data analysis is a promising approach to predict TSS as well as their temporal and spatial variations.

Keywords: suspended solids; unmanned aerial vehicle; spectral imaging; artificial neural networks

Sustainability 2019, 11, 2580; doi:10.3390/su11092580; www.mdpi.com/journal/Sustainability

1. Introduction

The typical methodology for investigating water quality involves collecting water samples directly from various locations and analyzing them in the laboratory. While this method may result in accurate assessments of water quality over limited areas, it is time consuming, expensive, and difficult to apply in large areas. Moreover, because the results are point measurements, they do not necessarily reflect the quality of the whole site [1,2].

Alternatives to in situ monitoring of water quality in lakes, dikes, and reservoirs can be obtained by means of remote sensing techniques. Such an application is only possible due to the presence of optically active components in the water. These substances can be identified via sensor systems because their presence in a water body results in different absorption and backscattering patterns of the incident light, which are characteristic of each component. Among the parameters of water quality, suspended inorganic sediments, organic chlorophyll-a, and dissolved organic material are the main agents of absorption and scattering of electromagnetic radiation in a water body [3,4].

It should be noted that these components are directly related to the quality of the aquatic ecosystem and its surroundings. For example, total suspended solids (TSS), which represents the total amount of inorganic or organic particles drifting or floating in water [5], may be related to water pollution, since these solids can transport and store various pollutants, as well as to erosive processes in a river basin (resulting in silting of major rivers and reservoirs) [4]. TSS concentration is often related to total primary production, heavy-metal and micro-pollutant flows, and in many turbid regions, is directly linked to sediment transport problems and the light available for primary production [6].

An indirect measurement of TSS in water bodies via remote sensing can compensate for deficiencies in manual water quality monitoring by being fast and allowing for continuous monitoring of large areas [2,7,8]. Most of the studies published on TSS prediction from remote sensing involve the use of spectral data retrieved from satellite images. Because of its medium spatial resolution (30 m), Landsat is one of the most commonly used satellites in remote sensing studies of water, as found in Qun et al. [2], Kong et al. [8], Din et al. [9], and Amanollahi et al. [10]. Song et al. [6] tested the images of the medium spatial resolution IRS-P6 (Indian Remote Sensing Satellite) as well. Besides these, Wang et al. [7], Breuning et al. [11], and Moridnejad et al. [12] used MODIS (Moderate Resolution Imaging Spectroradiometer) satellite images and Campbell et al. [3] those of the MERIS (Medium Resolution Imaging Spectrometer), with these having low spatial resolutions (250 and 300 m, respectively) and being limited to large areas of water.

Thus, although remote sensing serves as a powerful technique for monitoring environmental and seasonal changes, and its ability to remotely monitor water resources has increased in recent decades because of the improved quality and availability of satellite imagery data [13], the analysis of small water bodies may not be adequate due to the medium image resolution of the most usual commercial satellites [1]. In this case, the use of aerial images obtained by unmanned aerial vehicles (UAVs) for monitoring small bodies of water has presented good results and is promising for producing greater detail due to high spatial resolution and the possibility of constant monitoring [14,15].

Although some applications of UAVs for monitoring water quality parameters, such as chlorophyll-a [1,15–17], organic matter [18], and suspended solids [1,18–20], have been demonstrated in the literature, there are still few studies focused on this application. For suspended solids monitoring, for example, Veronez et al. [18] and Sáenz et al. [19] used regression analyses between TSS values measured in the laboratory and the UAV responses in the visible and near infrared (NIR) regions to generate their prediction models. Sáenz et al. [19] explored relations between individual bands and combinations between them (such as NIR-red), and Veronez et al. [18] chose to relate TSS to vegetation indexes such as the normalized difference vegetation index (NDVI) and the normalized difference water index (NDWI). Although both mentioned studies have shown positive results, the modeling of these parameters in complex environments is not always possible through regression analysis.
Therefore, given the limitations of techniques such as regression analysis, there is a clear need for research on and improvement of inland water monitoring techniques that take advantage of the technologies developed and available on the market. Among the modern techniques that can support the monitoring of waters via remote sensing is artificial intelligence through the use of neural networks.

Approaches involving neural networks are promising in the area of remote sensing and the development of water quality models because they can be more sensitive and robust than other traditional regression techniques, with the ability to capture both linear and non-linear relationships between the involved parameters [8,12,13,21]. However, results presented in the literature on artificial neural network (ANN) approaches in water bodies use mainly satellite imagery of low [12] to medium [8,13,21] spatial resolution.

No papers were found that included the application of artificial neural networks to the analysis of high spatial resolution images obtained using UAVs, and therefore, this study intends to fill this gap. The aim of this article was to use remote sensing technologies to evaluate water quality, identifying an alternative method for monitoring and quantifying the concentration of suspended solids in water, through the correlation between UAV images and limnological data using regression analysis (RA) and artificial neural networks (ANN). Furthermore, this study aims to contribute to the development of temporal and spatial water quality monitoring techniques through modern remote sensing tools and artificial intelligence.
The manuscript is structured as follows: Section 2 contains the information about the field site, the acquisition of the data, and its subsequent analyses; in Section 3, we present and discuss the results of the research on the concentration of TSS, regression models, and ANN; and finally, Section 4 presents our conclusions regarding the study, with indications of its importance and its continuity.

2. Materials and Methods

The method that we are proposing can be structured according to the following steps: GNSS (Global Navigation Satellite System) data acquisition, water sampling and laboratory analysis, overflight with the UAV and processing of the images, extraction of values from the UAV images, regression analysis, and training and testing of the ANN. The flowchart of the proposed method is depicted in Figure 1 and detailed in the following subsections.

Figure 1. Flowchart of the proposed method.

2.1. Field Site

The adopted study site was the lake on the Unisinos University campus, located in the state of Rio Grande do Sul, southern Brazil (Figure 2). The lake is artificial, has an area of approximately 0.025 km² and a maximum depth of 4 m. Although small, it is located at the lowest altitude of the campus, and because it is formed from rainwater drainage collected at the university, it contains several inorganic and organic compounds found in the form of suspended solids or organic matter from rainwater runoff [18].

The lake and its surroundings also function as an ecosystem for several species of animals, such as ducks, geese, and several other birds, as well as a great diversity of fish. Because it is a university campus, the area has several buildings, paved areas, and a large circulation of people and cars.

However, the campus also has several vegetated areas, mainly around the lake, as can be seen in Figure 2.

Figure 2. Location of the study site.

Studies addressing the applicability of remote sensing in the monitoring of water bodies have already been developed in this area. Guimarães et al. [17] used spectral data collected in the field and UAV images to model the chlorophyll-a concentration in the environment. Veronez et al. [18], based on UAV images, applied neural networks to estimate Landsat 8 OLI satellite bands and correlated this with data on suspended solids and dissolved organic matter.
Studies including the characterization of this lake, the behavior of the limnological variables, and their relationships with the remote sensing variables are important as they serve as pilot studies to be applied in larger water bodies.

2.2. Data Acquisition

We performed two field samplings, in March 2016 and March 2017, during the transition period between summer and fall. Each collection was carried out in a single day and we ensured that the climatic conditions of both days were similar. The average temperatures were between 22 and 24 °C, with winds at a speed of 0.4 m s⁻¹ (southeast direction) and no precipitation events on the days of the collections.

On the same days, the UAV overflew the area and in situ collection of water samples occurred such that the two pieces of information could be compared as being representative of the same conditions of the lake. Besides, possible temporal variations from one year to another can be evaluated for the collected data and compared to those predicted by the analyzed RA and ANN methods.
We selected 21 sample points, as shown in Figure 3, that were spatially distributed over the lake such that surface water samples (up to 0.5 m) were collected for the laboratory determination of suspended solids using the gravimetric method described in the Standard Methods for the Examination of Water and Wastewater [22].
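For readers unfamiliar with the gravimetric determination cited above, it reduces to a mass difference divided by the filtered volume. The symbols below are notation assumed here for illustration and are not taken from the paper: m_f is the dried mass of the filter plus retained residue, m_i is the mass of the clean filter, and V is the filtered sample volume.

```latex
\mathrm{TSS}\ (\mathrm{mg/L}) = \frac{m_f - m_i}{V},
\qquad m_f,\ m_i \ \text{in mg}, \quad V \ \text{in L}
```

As an illustrative calculation with made-up numbers, a residue of 1.5 mg retained from 0.1 L of sample corresponds to 1.5 / 0.1 = 15 mg/L.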

Figure 3. Sample points used in the survey.

The UAV used to take the images was the SenseFly, Swinglet CAM model (SenseFly Parrot Group, Cheseaux-sur-Lausanne, Switzerland). It was coupled to a Canon ELPH 110HS (Canon U.S.A., Inc., New York, NY, United States) camera with a 16-megapixel resolution and was factory-modified to capture the NIR band instead of the red band. Thus, mapping was in three distinct channels, namely: near infrared (NIR), green (G), and blue (B).

As well as the sampling points for water collection, in the field we also established and tracked six ground control points (GCPs), through the GNSS (Global Navigation Satellite System), based on the RTK (real time kinematic) method, located in the area of coverage of the flight such that later their positions were used in the georeferencing of the images obtained.
The images obtained using the UAV were processed using the PIX4D software, version 2.1 (Pix4D S.A., Lausanne, Switzerland), in which the images were orthorectified and georeferenced. We adopted SIRGAS 2000 (Geocentric Reference System for the Americas) as the reference system, in the UTM (Universal Transverse Mercator) 22S projection zone. We generated orthophotos with a pixel size of 5 cm × 5 cm.

2.3. Data Analysis

In order to relate the water quality data to those obtained via remote sensing, we plotted the points where samples were collected on the orthophotos generated by the UAV overflight and extracted the pixel values at each point for the NIR, G, and B channels. We emphasize that among the collected values of the two years (n = 42), four were disregarded because the points were located in a shaded area of the image. Thus, a sample of 38 points was considered for the analysis.
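The extraction of pixel values at each sampling point can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' workflow: it assumes a north-up orthophoto already loaded as a NumPy array with channels ordered NIR, G, B, and the raster origin, the helper names `world_to_pixel` and `sample_bands`, and the synthetic raster are all hypothetical. Only the 5 cm pixel size comes from the text.

```python
import numpy as np

PIXEL_SIZE_M = 0.05  # orthophoto ground resolution reported in the text (5 cm)


def world_to_pixel(easting, northing, origin_e, origin_n, pixel_size=PIXEL_SIZE_M):
    """Map projected (UTM) coordinates to (row, col) in a north-up raster.

    origin_e / origin_n are the coordinates of the raster's upper-left corner.
    They are hypothetical here; a real orthophoto carries them in its metadata.
    """
    col = int(np.floor((easting - origin_e) / pixel_size))
    row = int(np.floor((origin_n - northing) / pixel_size))
    return row, col


def sample_bands(ortho, points, origin_e, origin_n, pixel_size=PIXEL_SIZE_M):
    """Return an (n_points, 3) array with the NIR, G, B values at each point.

    ortho is assumed to be a (rows, cols, 3) array ordered NIR, G, B.
    Points outside the raster raise an error instead of wrapping around.
    """
    rows, cols, _ = ortho.shape
    values = []
    for easting, northing in points:
        r, c = world_to_pixel(easting, northing, origin_e, origin_n, pixel_size)
        if not (0 <= r < rows and 0 <= c < cols):
            raise ValueError(f"point ({easting}, {northing}) lies outside the orthophoto")
        values.append(ortho[r, c, :])
    return np.asarray(values, dtype=float)


# Synthetic 4 x 4 raster and two made-up sampling points, for illustration only.
ortho = np.arange(4 * 4 * 3, dtype=float).reshape(4, 4, 3)
origin_e, origin_n = 1000.0, 2000.0
points = [(1000.02, 1999.98), (1000.12, 1999.88)]
samples = sample_bands(ortho, points, origin_e, origin_n)
```

Points that fall in shaded parts of the image, such as the four discarded in the study, would have to be filtered out before modeling; the sketch does not attempt to detect them.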
We used this data to predict the suspended solids concentration in Lake Unisinos using linear and non-linear regression analysis (RA) and artificial intelligence through an artificial neural network (ANN). The aim of this step was to identify a model to quantify the concentrations of suspended solids present in the water using the information obtained through remote sensing. We evaluated their performances using the following statistical metrics: coefficient of determination (R²) and root mean square error (RMSE).

Linear and non-linear regression models were investigated. The considered non-linear functions were exponential, logarithmic, quadratic, and power (range from −1 to 1). Knowing that different concentrations of suspended solids present different responses for each wavelength, to predict TSS in RA models, we considered as independent variables each channel individually (NIR, G, and B) and the operations of bands (sum, subtraction, and ratios) to highlight the spectral characteristics of the compounds [6–8,13,21]. Thus, besides the simple regressions with the covariates included individually, multiple regressions were considered with two or more independent variables combined, taking care to avoid dependence among the covariates.

Considering the sample size of this experiment, all 38 observations were used for RA modeling because of the probabilistic assumptions of this class of models. After adjustment, the usual residual verifications related to the distribution (Gaussian), independence, and homoscedasticity were checked. If the estimated model was adequate, a cross-validation step could be performed where one observation at a time was left out of the adjustment for comparison or a sample part was reserved when the sample size was large.

In the TSS prediction from the ANN, which is a distribution-free method, the neural network modeling considered two processing steps, the first being the training of the network, and the second being its subsequent testing with a data set different from the first stage. In this study, we used 80% of the data collected for ANN training and 20% for testing, which were randomly defined [23].
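The regression step and the two metrics described in this subsection can be sketched as follows. This is an illustration under assumptions rather than the authors' procedure: the exponential and power forms are fitted here by least squares on log-transformed data, which the paper does not state, the band values and TSS are synthetic, and all function names are hypothetical.

```python
import numpy as np


def r2_rmse(y_true, y_pred):
    """Coefficient of determination and root mean square error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot, np.sqrt(np.mean(residuals ** 2))


def fit_linear(x, y):
    """y = a + b * x."""
    slope, intercept = np.polyfit(x, y, 1)
    return lambda x_new: intercept + slope * np.asarray(x_new, dtype=float)


def fit_exponential(x, y):
    """y = a * exp(b * x), fitted on log(y); requires y > 0 (assumed approach)."""
    b, log_a = np.polyfit(x, np.log(y), 1)
    return lambda x_new: np.exp(log_a) * np.exp(b * np.asarray(x_new, dtype=float))


def fit_power(x, y):
    """y = a * x**b, fitted on log-log data; requires x > 0 and y > 0 (assumed approach)."""
    b, log_a = np.polyfit(np.log(x), np.log(y), 1)
    return lambda x_new: np.exp(log_a) * np.asarray(x_new, dtype=float) ** b


def leave_one_out_rmse(fit, x, y):
    """Refit with one observation left out at a time and score the held-out point."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    errors = []
    for i in range(len(x)):
        keep = np.arange(len(x)) != i
        model = fit(x[keep], y[keep])
        errors.append(y[i] - model(x[i]))
    return np.sqrt(np.mean(np.square(errors)))


# Synthetic stand-ins for the 38 sampled points (not the study data).
rng = np.random.default_rng(0)
nir = rng.uniform(60.0, 200.0, 38)
green = rng.uniform(60.0, 200.0, 38)
tss = 8.0 + 0.04 * nir + rng.normal(0.0, 1.0, 38)

predictors = {"NIR": nir, "G/NIR": green / nir}  # single band and a band ratio
fitters = {"linear": fit_linear, "exponential": fit_exponential, "power": fit_power}
results = {}
for var_name, x in predictors.items():
    for fn_name, fit in fitters.items():
        r2, rmse = r2_rmse(tss, fit(x, tss)(x))
        results[(var_name, fn_name)] = (r2, rmse, leave_one_out_rmse(fit, x, tss))
```

Each entry of `results` holds the in-sample R² and RMSE plus a leave-one-out RMSE, which mirrors the cross-validation option mentioned in the text. Residual diagnostics (normality, independence, homoscedasticity) are not covered by this sketch.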
As the objective of the method was to create an ANN capable of recovering the concentration of suspended solids in the water from the bands of the modified Canon sensor incorporated into a UAV, we considered the normalized values of the NIR, G, and B channels as inputs to the ANN, and the TSS concentration at the same sampling point as the output. We used a feed-forward network trained with backpropagation, which is commonly used in remote sensing studies [8,10,21].

During the training phase, several tests were carried out in order to obtain the best ANN topology applicable to this study, choosing the network that provided the highest correlation coefficient and the lowest mean square error during training and testing. We tested different numbers of neurons (from 5 to 20) in one single hidden layer, as well as three activation functions (sigmoid, tangent, and linear), and the number of training cycles.

3. Results and Discussion

The results of the laboratory analyses were satisfactory for the research and compatible with prior knowledge of the water quality in the study area and analysis of the spatial behavior of these parameters, which would later be compared with the UAV images. Table 1 shows the descriptive statistics of the total suspended solids (TSS) analyzed in this research for March 2016 and 2017.

Table 1. Descriptive statistics of TSS.

Parameter                      Value (mg/L)
                               2016     2017     All
Average                        16.27    13.65    14.93
Standard deviation             2.97     3.29     3.07
Minimum value                  11.67    9.33     9.33
Maximum value                  23.75    20.00    23.75
Median                         15.33    12.68    14.87
Coefficient of variation (%)   19.77    23.65    23.02

We observed from the analysis of the data presented in Table 1 that the characteristics of the study lake were not the same between the two collections. This was also confirmed by the Wilcoxon test at a 95% confidence level. There was a decrease in the concentration of suspended solids from 2016 to 2017, which can be seen in the averages, medians, maximum, and minimum values of Table 1.

This difference in TSS concentration, although small, can be justified because although it did not rain on the days of sampling in 2016 and 2017, there were rainfall events in the week before the collection of 2016 (85 mm according to the experimental climatological station located at Unisinos University), which did not occur in 2017. Allen et al. [24] point out that in impermeable urban areas, the flow of rainwater over the ground collects the pollutants and sediments from these surfaces, which are transported to the nearest waterways. As the lake receives the drainage of rainwater from the university campus, it is expected that in rainy periods, various compounds will be carried into it, increasing the concentration of suspended solids, for example.
As initial cartographic products, obtained via overflying with the UAV in 2016 and 2017, and by processing the images, we have the orthophotos of the area, as shown in Figure 4.

Figure 4. Orthophotos generated by the overfly with UAV in March 2016 and 2017.

The simple and multiple linear and non-linear RA described in the previous section were evaluated; however, most of the results were unsatisfactory. Table 2 shows the best results that we obtained in these analyses. Although not shown, residual analyses were performed to check the error assumptions.

Table 2. Best results of the RA Model.

Variables          Function          R²     RMSE
NIR                Power             0.20   3.05
G/NIR              Exponential       0.19   3.07
G                  Linear            0.16   3.08
B                  Linear            0.13   3.14
B and G/NIR        Multiple Linear   0.20   3.00
G/NIR and NIR/B    Multiple Linear   0.18   3.05
G and B            Multiple Linear   0.16   3.08

According to Table 2, the best adjustments of the simple regression analyses were for the NIR and G/NIR variables, agreeing with Song et al. [6] and Amanollahi et al. [10], although both studies obtained better results than ours (0.7 for Amanollahi et al. [10] and above 0.9 for Song et al. [6]). Also, a combination of B and G/NIR was the best result for the multiple linear regressions.

8 Sustainability 2019 11 , 2580 8 of 13 , 2 values Although the regression models in Table 2 showed statistical significance, the low R indicate that the RA models were not ideal for TSS recovery in the study area. This result can be explained by the optical complexity of the study waters such that the relations between the bands of the collected images and the concentration of TSS could not be explained by traditional regression techniques. ective. Kong et al. [ ] emphasize that To improve the accuracy of TSS predictions, ANN can be e 8 ff ANN models establish di ff erent weights for each input in the network and thus take full advantage of ff the characteristics of TSS included in the di erent bands. erent topologies. In Table 3, the results ff We performed several trainings of neural networks with di 2 include a coe ffi cient of determination (R ) greater than 0.5 in the training step and their respective topologies, activation functions, number of epochs, and time of training are presented. 2 > 0.5. Table 3. Results of the ANN training for R a c 2 b (s) Time Topology Epochs RMSE R Activation Function Tangent / 3-5-1 100 1 0.60 2.11 Linear 3-5-1 / Linear 400 4 0.57 2.11 Sigmoid Tangent Tangent 300 5 0.84 1.33 3-7-1 / Tangent / Tangent 300 4 0.50 2.32 3-10-1 3-10-1 Tangent Linear 400 5 0.50 2.30 / 3-10-1 / Linear 100 1 0.64 2.05 Sigmoid 3-10-1 Sigmoid / Linear 400 5 0.53 2.19 3-12-1 Tangent 200 3 0.65 2.02 / Tangent / 0.58 500 7 Tangent 2.10 3-15-1 Tangent Tangent Linear 400 5 0.53 2.21 3-15-1 / Sigmoid / Linear 300 4 0.50 2.27 3-17-1 Tangent Linear 100 1 0.55 2.20 / 3-20-1 Sigmoid / Linear 200 2 0.50 2.30 3-20-1 b a c with “input (3)-neurons-output (1)”. / output layer. Hidden layer Where the computer used had the following ® configuration: processor—Intel ™ i3-4005U CPU @ 1.70 GHz × 4, memory—4GB DDR3 1600MHz RAM. 
Core According to Table 3, the topology in which we obtained the smallest RMSE and the highest determination coe cient was 3-7-1, with the tangent function as the activation function, and with 300 ffi training cycles. Thus, the ANN adopted was a feed-forward backpropagation type, with three input layers (NIR, G, and B), seven neurons in a single hidden layer, and one output (TSS). As usual, the training processing time depended on the number of epochs and was not a problem in our experiment because of the reduced sample size. The results that we found in the training and testing steps for this best ANN are presented in Table 4. The graph presented in Figure 5 demonstrates the comparison between the data measured in the laboratory and those estimated through ANN. 2 The ANN training stage resulted in an R of 0.84 and RMSE of 1.33, while during testing, these values were 0.57 and 2.97, respectively. As shown in Table 4 and Figure 5, considering all the data 2 collected in this study as inputs to the ANN, the R was 0.75 and the RMSE was 1.81. As expected, the results showed a significant improvement in the prediction of suspended solids data in the study area through the use of ANN in place of the simple and multiple linear and non-linear investigated RA. Table 4. Results of the ANN. 2 n RMSE Steps R Training 30 0.84 1.33 Testing 8 0.57 2.97 All 38 0.75 1.81
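As an illustration of the selected 3-7-1 topology and of the two metrics reported in Tables 3 and 4, the following minimal Python sketch shows one forward pass through a network with three inputs, seven hidden neurons, and one output, together with RMSE and R² functions. It is not the implementation used in this study. The sketch assumes that "tangent" means the hyperbolic tangent, that inputs and output are scaled to the range of that function, and that R² is computed as 1 − SS_res/SS_tot (the definition is not stated in this section). The weights are random placeholders and all function names are hypothetical.

```python
import math
import random


def forward_3_7_1(sample, w_hidden, b_hidden, w_out, b_out):
    """One forward pass through a 3-7-1 network with tanh in both layers.

    sample:   [nir, g, b] values, assumed to be already scaled.
    w_hidden: 7 rows of 3 weights; b_hidden: 7 biases.
    w_out:    7 weights; b_out: a single bias.
    Returns the scaled network output (inside the open interval -1..1).
    """
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, sample)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return math.tanh(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical, untrained weights, used only to exercise the functions.
rng = random.Random(0)
w_hidden = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(7)]
b_hidden = [rng.uniform(-1, 1) for _ in range(7)]
w_out = [rng.uniform(-1, 1) for _ in range(7)]
b_out = rng.uniform(-1, 1)

# Illustrative scaled band values (NIR, G, B), not data from this study.
scaled_output = forward_3_7_1([0.2, -0.1, 0.4], w_hidden, b_hidden, w_out, b_out)
```

Because the output neuron uses a bounded activation, the TSS targets would have to be rescaled before training and the predictions mapped back to mg/L afterwards; that step is omitted from the sketch.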

Figure 5. Comparison between TSS measurements and estimated values from the ANN model in training (black) and testing (red) set using March 2016 (circular shape) and 2017 (triangular shape) samplings.

Although several studies show good results using regression methods to predict TSS [2,7,11,15–19], others, such as Song et al. [6], Amanollahi et al. [10], Moridnejad et al. [12], and Wu et al. [25], compared the two methodologies (RA and ANN) and obtained better prediction quality with an ANN, signaling the capacity of neural networks to model more complex and non-linear relations between the parameters. Only Kong et al. [8] reported that an ANN did not present better results than regression methods for TSS predictions in their area of study.

Din et al. [9] used statistical correlation analysis only as a support for choosing the ideal bands of the Landsat 8 OLI satellite for an ANN input. The authors then decided to also include the short-wave infrared bands (SWIR-1 and SWIR-2) as inputs, which is not common in papers about ANN for predicting water quality parameters, since usually only the visible and near infrared regions are exploited [6,8,10,12,13,21,25,26].

Although the aforementioned approaches from the literature are similar to ours in comparing RA and ANN for the prediction of TSS, our results differ and stand out because of the high spatial resolution of the UAV images used, in comparison to the low or medium spatial resolutions of the satellite images of other studies. Thus, our method gives more geographically accurate TSS predictions because of the small pixel size of the UAV images (5 cm in comparison to 30 m for Landsat, for example) and generates high-quality, high-resolution TSS monitoring maps.

Finally, the ANN model was used to predict the TSS concentration for the whole lake using the NIR, G, and B variables for the 2016 and 2017 UAV images. The generated TSS maps for Lake Unisinos are shown in Figure 6.

Figure 6. Maps of predicted TSS based on the ANN model for March 2016 and 2017.

Analyzing Figure 6, we noticed higher concentrations of suspended solids in the 2016 sampling compared to the 2017 one, a situation that was already indicated in Table 1. Besides, the data set presented a statistically significant difference between the two years, and the difference in spatial distribution also became evident in Figure 6. The highest concentrations of TSS in 2016 were in the lower central region of the lake, whereas in 2017 they were near the lower-right margin. In 2017, a large part of the center of the lake shows the minimum TSS values. This difference in spatial distribution, mainly the large TSS concentration in 2016, is consistent with the in situ collected water samples and is also explained by the rain that occurred in the week before the field collection in 2016. Figure 6 also shows that TSS concentrations were in the same range (9.33 to 23.75 mg/L) for both years. To verify whether the statistical characteristics of the predicted data remain close to the observed ones, Figure 7 shows the box plot of TSS concentrations for both observed and predicted values in March 2016 and 2017.

Figure 7. Box plot of TSS concentrations for observed and predicted values in 2016 and 2017.

From Figure 7, we can see the similarity between the observed and predicted distributions, even though this was not a large sample. From the Wilcoxon test, which is adequate for asymmetric distributions, the null hypothesis of equal medians was not rejected at a 5% significance level (p-value = 0.74). From the same test, the statistically significant difference between the years already seen for the observed TSS values was maintained for the predicted TSS (p-value = 0.0007). The observed averages of 16.27 and 13.65 (Table 1) became 15.72 and 12.51 for the predicted TSS in March 2016 and 2017. The coefficient of variation also remained similar, 17.51% and 22.56%, which are close to the 19.77% and 23.65% presented in Table 1.

Although the results of this study confirm the viability of predicting the concentration of TSS from remote sensing data and ANN, we emphasize that, because this is a new methodology still under development, it has some limitations that should be considered. For example, since each water body has its own characteristics (hydraulic, physical, chemical, and biological), which are related to its surroundings and the region's climate, the model proposed in this study was trained for the conditions of the study area. Thus, it is necessary to develop regional models adapted to the area of interest. Other authors, like Kong et al. [8] and Chen et al. [26], also point out the absence of a standard model for different regions.

In relation to temporal variation, the field samplings were carried out in March of the two years, and therefore the seasonal variation of TSS (not considered in this study) may mean that a single model trained with data from only one season is not sufficient to predict values throughout the year. Besides the seasonal variation of TSS, other changes can occur in natural environments over the years. Once the environmental characteristics are modified, it is not possible at this time to affirm the capability of the trained neural network to predict data in the long term. Thus, the monitoring of TSS from remote sensing does not rule out laboratory analyses from time to time. For instance, if the predictions exhibit unexpected behavior, such as a growth trend, new TSS and spectral data may be collected to check whether it is a real change in the TSS or whether the neural network needs to be updated for current conditions.
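The periodic check described above can reuse the summary statistics applied to Figure 7. The following minimal Python sketch, which is not the analysis code of this study, computes the coefficient of variation and a two-sided Wilcoxon rank-sum p-value with the normal approximation for a new batch of laboratory and predicted values. Several points are assumptions: the Wilcoxon variant used in this study is not specified in this section, so the rank-sum form for independent samples is assumed; the sample standard deviation is assumed for the coefficient of variation; the function names are hypothetical; and the numbers are illustrative values, not measurements.

```python
import math
import statistics


def coefficient_of_variation(values):
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def rank_sum_p_value(sample_a, sample_b):
    """Two-sided Wilcoxon rank-sum p-value using the normal approximation.

    Tied values receive their average rank; no tie or continuity
    correction is applied to the variance, so this is only a rough check.
    """
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        rank_sum_a += average_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n_a, n_b = len(sample_a), len(sample_b)
    expected = n_a * (n_a + n_b + 1) / 2.0
    std_dev = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    z = (rank_sum_a - expected) / std_dev
    return math.erfc(abs(z) / math.sqrt(2.0))


# Illustrative values in mg/L, not data from this study.
new_lab_tss = [14.2, 15.8, 13.1, 16.4, 15.0, 14.7]
ann_predicted_tss = [14.9, 15.1, 13.8, 15.9, 14.4, 15.3]

cv_lab = coefficient_of_variation(new_lab_tss)
cv_ann = coefficient_of_variation(ann_predicted_tss)
p_value = rank_sum_p_value(new_lab_tss, ann_predicted_tss)
needs_review = p_value < 0.05  # 5% significance level, as in the text
```

With samples as small as those in this study, the normal approximation is coarse, and an exact test from a statistics package would be the safer choice for a real decision about retraining.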
Finally, although studies such as this one serve as pilot studies to be applied in larger water bodies, we emphasize that adaptations need to be made for this to occur, because when flying a UAV over lakes and dams, large homogeneous areas make it difficult to generate orthophotos and other products derived from the Structure from Motion (SfM) technique. One way to minimize this problem would be to perform high-altitude flights, which facilitate the identification of homologous points in the images for generating the orthophoto but result in a loss of image resolution.

The presented limitations indicate that this research needs to be continued. Nevertheless, what we have demonstrated in this article should encourage replications of this method in other water bodies so that more communities benefit from our positive results. This can be done through area flyovers with UAVs carrying RGB and NIR cameras, correct processing of the acquired images, reliable collection of water quality data, and the establishment of an ANN with the ideal parameters for the prediction of interest, which can be TSS, as in this study, or, for example, chlorophyll-a or organic matter. The prediction of TSS in water bodies from images acquired using a UAV and processed via an ANN should benefit managers, professionals, and researchers linked to the management and control of water resources by presenting a method for the dynamic and spatial monitoring of water quality problems, such as the presence of suspended solids.

4. Conclusions

The use of UAVs in the mapping of water quality is shown to be a promising tool because it alleviates issues found in the usual in situ monitoring, such as the insufficiency of data and high time and money costs, and in modeling via orbital remote sensing, such as the low spectral and temporal resolutions.
Through analysis of the response that the sensor on board the UAV collected in the visible and near infrared regions, it was possible to model the concentration of optically active compounds, such as suspended solids, and generate maps that allowed for their temporal monitoring and spatial analysis at the study site.

We emphasize the applicability of artificial intelligence through artificial neural networks to meet the need for modeling suspended solids in complex aquatic environments, where more simplistic analyses, such as the regression models presented in this study, may not be sufficient. The use of an ANN instead of RA significantly improved the quality of the results from the generated models, where R² values rose from 0.20 (RA) to 0.75 (ANN).

However, although the model presented could accurately predict suspended solids concentrations compatible with the statistical features of the in situ observed values, its use is limited to the study area where the ANN was trained and calibrated, and adaptations are required for use in other environments.

The presented results are important for two main reasons. First, although regression methods have been used in remote sensing applications, they may not be adequate for capturing the linear and/or non-linear relationships of interest. Second, they show that the use of UAVs in the mapping of water bodies, together with the application of neural networks in the analysis of the results obtained, is a promising approach and has the potential to assist in monitoring the quality of these environments.

Thus, we intend to continue monitoring the total suspended solids concentrations in Lake Unisinos by performing new overflights with a UAV in the region and simulating the data collected with the neural network. We also emphasize the need to continue the research in order to improve the generated model, as well as to consider the interference of other optically active compounds, such as chlorophyll and organic matter, in the spectral response of water and, consequently, in the neural network generated.

Author Contributions: T.T.G., E.C.K., and D.B. were responsible for collecting and processing the images obtained from the UAV. T.T.G. and E.C.K. were responsible for water collection and for the laboratory analysis. T.T.G., M.R.V., and L.G.J. implemented the artificial neural network. T.T.G., E.M.S., D.B., and L.G.J. were responsible for analyzing the data. T.T.G., M.R.V., and E.M.S. wrote the paper. M.R.V. and F.F.M. reviewed the paper. All the authors have read and approved the final version of the paper.

Acknowledgments: M.R.V. and F.F.M. thank the Brazilian Council for Scientific and Technological Development (CNPq) for the research grant. T.T.G. thanks DS/CAPES for the financial support of the MSc scholarship. D.B. thanks CAPES for the financial support of the MSc scholarship.

Conflicts of Interest: The authors declare no conflict of interest.

References

1. Kageyama, Y.; Takahashi, J.; Nishida, M.; Kobori, B.; Nagamoto, D. Analysis of Water Quality in Miharu Dam Reservoir, Japan, using UAV Data. IEEJ Trans. 2016, 11 (Suppl. 1), S183–S185.
2. Qun, M.; Tan, X.; Liu, Z.; Liu, C.; Li, Q. Monitoring Chlorophyll-a and Suspended Substance in Nansi Lake, China through Remote Sensing Technology. In Proceedings of the 2008 International Workshop on Education Technology and Training 2008 International Workshop on Geoscience and Remote Sensing, Shanghai, China, 21–22 December 2008; Volume 2, pp. 348–351.
3. Campbell, G.; Phinn, S.R.; Dekker, A.G.; Brando, V.E. Remote sensing of water quality in an Australian tropical freshwater impoundment using matrix inversion and MERIS images. Remote Sens. Environ. 2011, 115, 2402–2414.
4. Jensen, J.R. Sensoriamento Remoto do Ambiente: Uma Perspectiva em Recursos Terrestres; Parêntese: São José dos Campos, Brasil, 2011.
5. United States Environmental Protection Agency (EPA). Modeling Total Suspended Solids (TSS) Concentrations in Narragansett Bay; Office of Research and Development, National Health and Environmental Effects Research Laboratory: Narragansett, RI, USA, 2016.
6. Song, K.; Li, L.; Wang, Z.; Liu, D.; Zhang, B.; Xu, J.; Du, J.; Li, L.; Li, S.; Wang, Y. Retrieval of total suspended matter (TSM) and chlorophyll-a (Chl-a) concentration from remote-sensing data for drinking water resources. Environ. Monit. Assess. 2012, 184, 1449–1470.
7. Wang, J.; Tian, Q. Estimation of total suspended solids concentration by hyperspectral remote sensing in Liaodong Bay. Indian J. Mar. Sci. 2015, 44, 1137–1144.
8. Kong, J.; Shan, Z.; Chen, Y.; Yang, J.; Hu, Y.; Wang, L. Assessment of remote-sensing retrieval models for suspended sediment concentration in the Gulf of Bohai. Int. J. Remote Sens. 2018, 40, 2324–2432.
9. Din, E.S.; Zhang, Y.; Suliman, A. Mapping concentrations of surface water quality parameters using a novel remote sensing and artificial intelligence framework. Int. J. Remote Sens. 2017, 38, 1023–1042.
10. Amanollahi, J.; Kaboodvandpour, S.; Majidi, H. Evaluating the accuracy of ANN and LR models to estimate the water quality in Zarivar International Wetland, Iran. Nat. Hazards 2017, 85, 1511–1527.

11. Breuning, F.M.; Pereira Filho, W.; Galvão, L.S.; Wachholz, F.; Cardoso, M.A.G. Dynamics of limnological parameters in reservoirs: A case study in South Brazil using remote sensing and meteorological data. Sci. Total Environ. 2017, 574, 253–263.
12. Moridnejad, A.; Abdollahi, H.; Alavipanah, S.K.; Samani, J.M.V.; Moridnejad, O.; Karimi, N. Applying artificial neural networks to estimate suspended sediment concentrations along the southern coast of the Caspian Sea using MODIS images. Arab. J. Geosci. 2015, 8, 891–901.
13. Peterson, K.T.; Sagan, V.; Sidike, P.; Cox, A.L.; Martinez, M. Suspended Sediment Concentration Estimation from Landsat Imagery along the Lower Missouri and Middle Mississippi Rivers Using an Extreme Learning Machine. Remote Sens. 2018, 10, 1503.
14. Gago, J.; Douthe, C.; Coopman, R.E.; Gallego, P.P.; Ribas-Carbo, M.; Flexas, J.; Escalona, J.; Medrano, H. UAVs challenge to assess water stress for sustainable agriculture. Agric. Water Manag. 2015, 153, 9–19.
15. Cândido, A.K.; Paranhos Filho, A.C.; Haupenthal, M.R.; da Silva, N.M.; de Sousa Correa, J.; Ribeiro, M.L. Water Quality and Chlorophyll Measurement Through Vegetation Indices Generated from Orbital and Suborbital Images. Water. Air. Soil Pollut. 2016, 227, 224.
16. Su, T.-C.; Chou, H.-T. Application of Multispectral Sensors Carried on Unmanned Aerial Vehicle (UAV) to Trophic State Mapping of Small Reservoirs: A Case Study of Tain-Pu Reservoir in Kinmen, Taiwan. Remote Sens. 2015, 7, 10078–10097.
17. Guimarães, T.; Veronez, M.; Koste, E.; Gonzaga, L.; Bordin, F.; Inocencio, L.; Larocca, A.; de Oliveira, M.; Vitti, D.; Mauad, F. An Alternative Method of Spatial Autocorrelation for Chlorophyll Detection in Water Bodies Using Remote Sensing. Sustainability 2017, 9, 416.
18. Guimarães, T.T.; Koste, E.C.; Silva, J.M.; Souza, L.V.; Oliverio, W.F.M.; Veronez, M.R.; Kupssinskü, L.S.; Jardim, R.S.; Koch, I.É.; Souza, J.G.; et al. Proposal of a Method to Determine the Correlation between Total Suspended Solids and Dissolved Organic Matter in Water Bodies from Spectral Imaging and Artificial Neural Networks. Sensors 2018, 18, 159.
19. Sáenz, N.A.; Paez, D.E.; Arango, C. Local algorithm for monitoring total suspended sediments in micro-watersheds using drones and remote sensing applications. Case study: Teusacá River, La Calera, Colombia. Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci. 2015, XL-1/W4, 159–165.
20. Roig, H.L.; Ferreira, A.M.R.; Menezes, P.H.B.J.; Marotta, G.S. Uso de câmeras de baixo custo acopladas a veículos aéreos leves no estudo do aporte de sedimentos no Lago Paranoá. In Anais XVI Simpósio Brasileiro de Sensoriamento Remoto (SBSR); INPE: Foz do Iguaçu, Brasil, 2013; pp. 9332–9339.
21. Guo, Q.; Wu, X.; Bing, Q.; Pan, Y.; Wang, Z.; Fu, Y.; Wang, D.; Liu, J. Study on Retrieval of Chlorophyll-a Concentration Based on Landsat OLI Imagery in the Haihe River, China. Sustainability 2016, 8, 758.
22. American Public Health Association (APHA). Standard Methods for Examination of Water and Wastewater; APHA: Washington, DC, USA, 1995.
23. Haykin, S.S. Neural Networks and Learning Machines, 3rd ed.; Pearson: Upper Saddle River, NJ, USA, 2009.
24. Allen, D.; Arthur, S.; Haynes, H.; Olive, V. Multiple rainfall event pollution transport by sustainable drainage systems: The fate of fine sediment pollution. Int. J. Environ. Sci. Technol. 2017, 14, 639–652.
25. Wu, J.L.; Ho, C.R.; Huang, C.C.; Srivastav, A.L.; Tzeng, J.H.; Lin, Y.T. Hyperspectral Sensing for Turbid Water Quality Monitoring in Freshwater Rivers: Empirical Relationship between Reflectance and Turbidity and Total Solids. Sensors 2014, 14, 22670–22688.
26. Chen, C.J.; Zhu, W.; Tian, Y.Q.; Yu, Q.; Zheng, Y.; Huang, L. Remote estimation of colored dissolved organic matter and chlorophyll-a in Lake Huron using Sentinel-2 measurements. J. Appl. Remote Sens. 2017, 11, 036007.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
