Reservoir water quality is important for water quality management downstream. A hierarchical approach is developed to select monitoring locations in a form that satisfies the objectives of the stakeholders who make the final decisions. First, a CE-QUAL-W2 model is applied to simulate water quality variables in the reservoir over a long period using a set of historic data. Second, transinformation entropy theory is used to quantify mutual information among a set of monitoring stations for each water quality variable. Then, a non-dominated sorting genetic algorithm-based model is developed for multi-objective optimization of the water quality monitoring network. Finally, a social choice method is applied to the identified non-dominated solutions to reach a strategy that is a compromise among stakeholders. The variations of the water quality variables at different depths and in different seasons are investigated. The proposed approach is illustrated for Karkheh Reservoir in Iran. The number of optimized monitoring stations is the same for all seasons (three out of 22 potential stations) using different social choice methods. The results show the appropriate performance of the proposed methodology for optimization of reservoir water quality monitoring stations.

## INTRODUCTION

Sampling locations are designed to monitor reservoir water quality. The design of sampling locations is difficult due to a wide range of water quality variables that can be used to present water quality, the temporal and spatial characteristics of sampling, and the duration and objectives of sampling (Harmancioglu *et al.* 1999). For optimization of a water quality monitoring program, various regulations and the physical and geographical properties of reservoirs must be considered in formulating this non-linear, complex problem (Behmel *et al.* 2016).

Studies of the placement of monitoring stations in a reservoir are limited in the literature, although similar studies can be found for identifying locations to monitor water quality in river and groundwater resources. Lee & Kwon (2009) and Lee *et al.* (2011) proposed methodologies using statistical analysis to reduce redundant sampling locations in a reservoir. Based on information theory, Lee *et al.* (2014) proposed an approach for optimization of water quality monitoring stations in a reservoir. They used water quality data at similar depths of Lake Yongdam. Instead of using an optimization approach, they tested all possible combinations of stations to find the optimal solution. These approaches are not suitable for large reservoirs, in which the number of potential sampling locations increases significantly. Yenilmez *et al.* (2015) proposed a methodology to cope with this problem, using kernel density estimation and ordinary kriging to optimize the number of monitoring stations in the Porsuk dam reservoir for a single water quality variable. They used dissolved oxygen (DO) at the surface layer of the reservoir. Their approach neglects the effect of reservoir depth on water quality.

In general, a series of data can be translated into information using the Shannon theory (Harmancioglu 1981). The entropy quantities include marginal entropy, joint entropy, conditional entropy and transinformation. Information theory has been used to assess groundwater-monitoring networks: Mogheir & Singh (2002), Mogheir *et al.* (2009), Masoumi & Kerachian (2010), and Mondal & Singh (2012) evaluated groundwater-monitoring networks using information theory. This theory has also been used to evaluate the effectiveness of monitoring stations in rivers. Ozkul *et al.* (2000), Karamouz *et al.* (2009), Mahjouri & Kerachian (2011), and Memarzadeh *et al.* (2013) employed entropy theory to assess water quality monitoring networks in rivers. To the best of our knowledge, only Lee *et al.* (2014) have used entropy theory to identify water quality monitoring locations in a reservoir; however, as mentioned earlier, the effect of reservoir depth on water quality was not considered. They modeled water quality using observed data collected at nine locations at the surface of the reservoir. These studies considered the technical aspects of reservoir water quality and did not involve stakeholders and decision makers in evaluating the plausibility and practicality of a design to monitor reservoir water quality.

Stakeholder engagement can make a water resources system more sustainable, as stakeholders can contribute their expert knowledge during the design process (Loucks *et al.* 2005). To address the choices of decision makers, which are most likely conflicting, a set of equations is formulated to mathematically capture each choice as an objective. Each equation determines a way to evaluate the goodness of a solution based on the stakeholder's choice. Using a heuristic approach, a set of non-dominated solutions is identified to represent the relationship among objectives. Each solution on the non-dominated front provides the best value for an objective given the constraints of the other objectives. To find the best solution among a set of non-dominated solutions, a social choice method can be used to rank solutions based on the preferences of decision makers. Social choice theory is a useful method to encompass the preferences of engaged stakeholders over the available options (Sheikhmohammady & Madani 2008).

In this study, a new methodology is developed to identify a set of locations to monitor water quality in a reservoir. The methodology contains two main steps. First, a non-dominated sorting genetic algorithm (NSGA-II)-based optimization model is applied to find a set of non-dominated solutions. Second, a social choice theory-based approach is used to identify the solution with the highest rank based on the preferences of participating stakeholders. The contribution of this study is the linkage between the technical and social aspects of this problem. This study introduces and evaluates a new approach to include the decision-makers’ cooperation in the design process. In addition, this study considers the effect of reservoir depth on water quality. The presented optimization model is employed to identify monitoring locations for water quality sampling from the reservoir of the Karkheh Dam in Iran.

## MODEL FRAMEWORK

*et al.* 2008), this model is able to simulate the water quality of reservoirs with an accuracy that is comparable to the WASP model (Supplementary Material Table S1, available with the online version of this paper). The geometric data, meteorological parameters, hydraulic coefficients, inflow/outflow data, and flow sources and sinks are inputs into the CE-QUAL-W2 model. Once the CE-QUAL-W2 model is calibrated and validated using the historic dataset, it is used to simulate reservoir water quality for each water quality variable. The transinformation entropy quantity (*Te*(*x*,*y*)) is calculated for each pair of potential monitoring stations, represented as station *x* and station *y*, as follows:

$$Te(x,y) = \sum_{i}\sum_{j} p(x_i, y_j)\,\log\frac{p(x_i, y_j)}{p(x_i)\,p(y_j)} \qquad (1)$$

where *p*(*x*_{i}) and *p*(*y*_{j}) are the occurrence probabilities of *x*_{i} and *y*_{j}, respectively, and *p*(*x*_{i}, *y*_{j}) is the joint probability of *x*_{i} and *y*_{j} (Mogheir *et al.* 2004a, 2004b). A *T–D* curve is obtained by plotting the transinformation quantities (*T*) with respect to the spatial distances (*D*) among each pair of monitoring stations.
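As an illustration of how the transinformation between two stations could be estimated from simulated series, the following is a minimal sketch that discretizes each series into equal-width bins and uses relative frequencies as probabilities. The function name `transinformation`, the number of bins, and the use of the natural logarithm are assumptions made for this example; they are not taken from the study.

```python
import math
from collections import Counter

def transinformation(xs, ys, bins=5):
    """Histogram-based estimate of Te(x, y), in nats, for two equally long series."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("series must be non-empty and equally long")

    def discretize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant series
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = discretize(xs), discretize(ys)
    n = len(xs)
    px, py, pxy = Counter(bx), Counter(by), Counter(zip(bx, by))

    te = 0.0
    for (i, j), count in pxy.items():
        p_joint = count / n
        te += p_joint * math.log(p_joint / ((px[i] / n) * (py[j] / n)))
    return te

# Hypothetical series: identical stations share all information,
# independent stations share none.
same = transinformation([0, 1, 2, 3], [0, 1, 2, 3], bins=4)
independent = transinformation([0, 0, 1, 1], [0, 1, 0, 1], bins=2)
```

With real data, the estimate depends on the binning choice, so the number of bins would need to be chosen with care.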

### Stakeholders

Three stakeholders, i.e. the Ministry of Energy (MOE), the Department of Environment (DOE), and the Regional Water Authority (RWA), are involved in the operation of a reservoir in Iran. The MOE provides financial support and balances the budget while the DOE is responsible for water quality assessment and the RWA aims to increase the efficiency of the monitoring network by minimizing the redundant information among stations. The objectives of these stakeholders are summarized as follows:

The MOE objective is to minimize the cost.

The DOE objective is to maximize the spatial coverage of the water quality monitoring network.

The RWA objective is to minimize the redundant information among water quality monitoring stations.

Considering the utilities of these stakeholders and the *T–D* curves, a multi-objective optimization model is formulated and solved by a conventional heuristic algorithm to find a set of non-dominated solutions. Due to the complexity of this problem and the large size of the decision space, the multi-objective algorithm likely converges to a set of non-dominated solutions rather than the true Pareto front; therefore, the term 'non-dominated' is used instead of 'Pareto front'. In this study, the formulated problem is solved using NSGA-II (Deb *et al.* 2002).
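The notion of a non-dominated solution can be made concrete with a small sketch. It assumes that all three normalized objectives are to be minimized and uses hypothetical objective vectors. It shows only the dominance filter, not the full NSGA-II procedure (no ranking into successive fronts, crowding distance, or genetic operators).

```python
def dominates(a, b):
    """True if objective vector a dominates b (every objective is minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def non_dominated(solutions):
    """Keep the solutions that no other solution dominates."""
    return [s for s in solutions if not any(dominates(t, s) for t in solutions)]

# Hypothetical (y1, y2, y3) vectors for three candidate station layouts.
candidates = [(0.2, 0.5, 0.4), (0.3, 0.6, 0.5), (0.1, 0.9, 0.4)]
front = non_dominated(candidates)
```

Here the second candidate is worse than the first in every objective and is dropped, while the first and third trade off against each other and both remain.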

A preference matrix is developed by ranking the *m* non-dominated solutions according to the preferences of the three stakeholders. The preference matrix is a 3 × *m* matrix, as demonstrated in Table 1, in which *m* is the number of non-dominated solutions. Finally, the best solution is determined by the social choice method, which selects a solution based on its popularity among the stakeholders.

Stakeholder | Preference rank: 1, 2, 3, …, *m* |
---|---|
MOE | Each of *m* solutions in order of priority for MOE |
DOE | Each of *m* solutions in order of priority for DOE |
RWA | Each of *m* solutions in order of priority for RWA |

### Multi-objective optimization model: NSGA-II

*y*_{1}, *y*_{2}, and *y*_{3}: normalized objectives between 0 and 1,

*n*_{opt}: number of optimized stations,

*n*_{p}: number of potential stations,

*n*_{min}: minimum number of stations,

*Q*: number of water quality variables,

*γ*_{q}: importance coefficient for water quality variable *q* in a monitoring program, *q* = 1, …, *Q*,

*c*_{i}: binary variable (*c*_{i} is 1 if there is a monitoring station at potential station *i*; otherwise, it is 0),

*Te*_{i,q}: transinformation entropy (*Te* in Equation (1)) associated with the distance *d*_{i} in station *i* and water quality variable *q*,

*Te*_{min,q}: minimum transinformation entropy quantity for water quality variable *q*,

*Te*_{max,q}: maximum transinformation entropy quantity for water quality variable *q*,

*D*_{max}: maximum value of the pair-wise distances among all pairs of the potential stations (in kilometers),

*D*_{i}: distance between station *i* and the closest station (in kilometers),

*D*_{opt,q}: optimized distance associated with *Te*_{min,q} for water quality variable *q* (in kilometers).

### Social choice theory

Social choice theory considers the involvement of stakeholders with different preferences over the available options (solutions obtained from NSGA-II) in the process of decision-making. The common social choice methods are as follows (Sheikhmohammady & Madani 2008).

#### Condorcet choice

This method elects an option that is preferred more times to the other options by the stakeholders with pairwise comparisons (McLean 1990; Shalikaran *et al.* 2011).
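A minimal sketch of a Condorcet-style selection is given below. It assumes the strict reading in which the chosen option must win the pairwise comparison against every other option; the stakeholder rankings and the solution labels A, B, and C are hypothetical.

```python
from itertools import combinations

def condorcet_winner(rankings):
    """Return the option that beats every other option pairwise, or None."""
    options = next(iter(rankings.values()))
    wins = {option: 0 for option in options}
    for a, b in combinations(options, 2):
        # Count stakeholders who rank a ahead of b (lower index = preferred).
        a_votes = sum(order.index(a) < order.index(b) for order in rankings.values())
        b_votes = len(rankings) - a_votes
        if a_votes > b_votes:
            wins[a] += 1
        elif b_votes > a_votes:
            wins[b] += 1
    for option in options:
        if wins[option] == len(options) - 1:
            return option
    return None

# Hypothetical rankings, most preferred first.
rankings = {"MOE": ["A", "B", "C"], "DOE": ["B", "A", "C"], "RWA": ["B", "C", "A"]}
cyclic = {"MOE": ["A", "B", "C"], "DOE": ["B", "C", "A"], "RWA": ["C", "A", "B"]}
winner = condorcet_winner(rankings)
no_winner = condorcet_winner(cyclic)
```

The second set of rankings forms a cycle (A beats B, B beats C, C beats A), which is why a strict Condorcet winner does not always exist.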

#### Borda count method

A score is assigned to each option based on the stakeholders' preferences. For each stakeholder, the solutions are ranked in order of preference: the most preferable option receives the highest score, *n* (the number of non-dominated solutions), and the score decreases by one at each rank down to 1 for the least preferable option. A total score is calculated by summing the scores across the stakeholders. The option with the highest total score is selected as the compromise option for the stakeholders (McLean 1990; Shalikaran *et al.* 2011).
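The scoring described above can be sketched as follows, using hypothetical rankings of three solutions labeled A, B, and C. The function name `borda_scores` is introduced only for this example.

```python
def borda_scores(rankings):
    """Sum Borda scores: n points for a first place, down to 1 for last place."""
    scores = {}
    for order in rankings.values():
        n = len(order)
        for position, option in enumerate(order):
            scores[option] = scores.get(option, 0) + (n - position)
    return scores

# Hypothetical rankings, most preferred first.
rankings = {"MOE": ["A", "B", "C"], "DOE": ["B", "A", "C"], "RWA": ["B", "C", "A"]}
scores = borda_scores(rankings)
best = max(scores, key=scores.get)
```

For these rankings, A scores 3 + 2 + 1 = 6, B scores 2 + 3 + 3 = 8, and C scores 1 + 1 + 2 = 4, so B is selected.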

#### The plurality rule

The plurality rule selects the option that receives the most votes by the stakeholders in the first level of preference (Sheikhmohammady & Madani 2008; Shalikaran *et al.* 2011).
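A short sketch of the plurality rule follows, again with hypothetical rankings. Ties are returned as a list, which is a choice made for this example rather than something specified in the text.

```python
from collections import Counter

def plurality_winners(rankings):
    """Return the option(s) ranked first by the largest number of stakeholders."""
    first_choices = Counter(order[0] for order in rankings.values())
    top = max(first_choices.values())
    return sorted(option for option, votes in first_choices.items() if votes == top)

# Hypothetical rankings, most preferred first.
rankings = {"MOE": ["A", "B", "C"], "DOE": ["B", "A", "C"], "RWA": ["B", "C", "A"]}
winners = plurality_winners(rankings)
```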

#### Majority voting rule

The majority voting rule (MVR) is a decision rule that selects options that have the majority votes (more than half) at the highest possible level of preference. In this way, only a majority is important, not the number of votes (Bassett & Persky 1999; Shalikaran *et al.* 2011).
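One possible reading of this rule is sketched below: at each preference level, an option counts a vote from every stakeholder who ranks it at that level or higher, and the search stops at the first level where at least one option has more than half of the votes. This cumulative interpretation, the function name, and the rankings are assumptions made for illustration.

```python
def majority_voting(rankings):
    """Return (level, options) for the first level at which a majority exists."""
    voters = len(rankings)
    depth = len(next(iter(rankings.values())))
    for level in range(1, depth + 1):
        counts = {}
        for order in rankings.values():
            for option in order[:level]:  # options ranked at this level or higher
                counts[option] = counts.get(option, 0) + 1
        winners = sorted(o for o, votes in counts.items() if votes > voters / 2)
        if winners:
            return level, winners
    return None

# Hypothetical rankings with no majority among first choices.
rankings = {"MOE": ["A", "B", "C"], "DOE": ["B", "C", "A"], "RWA": ["C", "B", "A"]}
result = majority_voting(rankings)
```

In this example no option has a majority at the first level. At the second level, B has three votes and C has two, so both are returned, reflecting that only the existence of a majority matters and not its size.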

More details about the social choice methods and their applications are shown in the supplementary material (Supplementary Material Example S1, Tables S2 to S4, available with the online version of this paper).

## CASE STUDY

*γ*_{q} in Equations (3) and (4)) are shown in Table 2. Each coefficient prioritizes a water quality variable in the quality monitoring program. Once the CE-QUAL-W2 model is calibrated and validated using the historic dataset, it is used to simulate reservoir water quality for 40 years in the period of 1968–2008.

Variables | Aquatic life and fisheries | Drinking water | Irrigation | Power generation | Importance coefficient |
---|---|---|---|---|---|

TDS | * | * | *** | *** | 0.30=8/27 |

TP | *** | *** | 0.22=6/27 | ||

TN | * | *** | 0.15=4/27 | ||

DO | *** | * | * | * | 0.22=6/27 |

Temperature | *** | 0.11=3/27 |

Low to high importance is represented by * to ***.
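The importance coefficients in Table 2 appear to equal each variable's total number of stars divided by the total over all variables (27). The sketch below reproduces that arithmetic; this normalization is inferred from the fractions in the table and is an assumption of the example.

```python
# Total stars per variable, summed across the four water uses in Table 2.
stars = {"TDS": 8, "TP": 6, "TN": 4, "DO": 6, "Temperature": 3}

total = sum(stars.values())  # 27
gamma = {variable: count / total for variable, count in stars.items()}
rounded = {variable: round(value, 2) for variable, value in gamma.items()}
```

The rounded values match the coefficients listed in the table (0.30, 0.22, 0.15, 0.22, and 0.11), and the unrounded coefficients sum to 1.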

## RESULTS AND DISCUSSION

*T*) are plotted versus the spatial distances (*D*) among monitoring stations to obtain the *T–D* curve and determine the mutual information for each variable across seasons. The *T–D* curves obtained for TDS in each season are shown in Figure 3. As shown in this figure, the redundant or mutual information (transinformation) among monitoring stations decreases as the distance between two monitoring stations increases. The optimal distance, *D*_{opt}, is selected where the transinformation entropy reaches an equilibrium (insignificant changes as distance increases). If the spatial distances among monitoring stations are less than *D*_{opt}, there is mutual or redundant information among monitoring stations. For spatial distances greater than *D*_{opt}, the spatial coverage of the water quality monitoring network is insufficient. The distance *D*_{opt} corresponds to the minimum value of transinformation on the *T–D* curve (the dashed line in Figure 3). The optimal distance is approximately 18 km for TDS in all seasons, and between 17 and 18 km for each water quality variable across all seasons. Due to space limitations in this paper, the minimum distance (17 km) is defined as the optimal distance for all water quality variables and used in Equations (2)–(7) (Supplementary Material Figures S2 to S5, available with the online version of this paper).
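One way to automate the choice of the optimal distance from a *T–D* curve is sketched below: it returns the smallest distance beyond which successive transinformation values change by less than a tolerance. The function name, the tolerance, and the sample points are hypothetical and are not values from this study.

```python
def optimal_distance(distances, te_values, tol=0.01):
    """Smallest distance after which Te changes by less than tol between points."""
    pairs = sorted(zip(distances, te_values))
    for k in range(len(pairs) - 1):
        tail = pairs[k:]
        changes = [abs(tail[i + 1][1] - tail[i][1]) for i in range(len(tail) - 1)]
        if all(change < tol for change in changes):
            return pairs[k][0]
    return pairs[-1][0]  # the curve never flattens within the sampled range

# Hypothetical T-D points: distance in km, transinformation in arbitrary units.
d_km = [5, 10, 15, 20, 25]
te = [0.9, 0.5, 0.3, 0.295, 0.293]
d_opt = optimal_distance(d_km, te)
```

In practice the tolerance would have to be chosen relative to the scale and noise of the transinformation estimates.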

Considering the *T–D* curves and the objectives of the stakeholders, the multi-objective optimization model (described above) is parameterized and solved by NSGA-II. The number of binary decision variables is 22 (equal to the number of potential stations). The NSGA-II multi-objective optimization algorithm is tested for various population sizes and numbers of generations. Based on these tests, a population size of 330 individuals (15 × 22 = 330) is selected, and the NSGA-II algorithm is stopped after 200 generations.

After the non-dominated solutions are obtained, the social choice methods are applied using the preference ordering of the solutions from each of the three stakeholders. The compromise solutions based on the different social choice methods are presented in Table 3. As shown in this table, the number of optimized monitoring stations is the same (three out of 22 potential stations) across all seasons using the four social choice methods (described above).

Method | Season | Number of optimized stations | Optimal locations |
---|---|---|---|
Condorcet, Borda, Plurality rule, and MVR | Spring | 3 | (64,27) (65,45) (59,45) |
Condorcet, Borda, Plurality rule, and MVR | Summer | 3 | (57,33) (62,39) (50,39) |
Condorcet, Borda, Plurality rule, and MVR | Fall | 3 | (47,21) (52,27) (50,39) |
Condorcet, Borda, Plurality rule, and MVR | Winter | 3 | (64,27) (50,39) (64,51) |

## SUMMARY AND CONCLUSION

In this study, a new methodology based on the CE-QUAL-W2 model, the NSGA-II optimization method, transinformation entropy theory, and social choice theory is proposed for optimization of water quality monitoring stations of dam reservoirs. The transinformation entropy quantities and the *T–D* curves are obtained to determine the mutual information among monitoring stations. Considering the utilities of the existing stakeholders, a multi-objective NSGA-II optimization model is applied to find non-dominated solutions. Then, the best solution is determined based on social choice methods to reach a common option that is agreed among all stakeholders. The variations of water quality variables in different seasons are investigated. The potential monitoring stations are located at different depths along the length of the reservoir. The proposed approach is applied for optimization of water quality monitoring stations for the Karkheh reservoir in Iran. The optimal distance among stations is considered to be 17 km. The number of optimized monitoring locations is the same (three out of 22 potential stations) across all seasons using different social choice methods; however, these locations vary across seasons. The results show that NSGA-II is able to converge to a set of non-dominated solutions and that social choice methods are effective ways of finding a compromise solution.

In future work, the temporal frequency of sampling at the monitoring locations can be studied, and a spatial variogram can be used to determine the nugget (minimum value of mutual information and its corresponding distance) instead of transinformation entropy. The effect of a more accurate water quality model such as WASP should also be examined.