 Review
 Open access
 Published:
Copula–entropy theory for multivariate stochastic modeling in water engineering
Geoscience Letters volume 5, Article number: 6 (2018)
Abstract
The copula–entropy theory combines the entropy theory and the copula theory. The entropy theory has been extensively applied to derive the most probable univariate distribution subject to specified constraints by applying the principle of maximum entropy. With the flexibility to model nonlinear dependence structure, parametric copulas (e.g., Archimedean, extreme value, metaelliptical, etc.) have been applied to multivariate modeling in water engineering. This study evaluates the copula–entropy theory using a sample dataset with known population information and a flood dataset from the experimental watershed at the Walnut Gulch, Arizona. The study finds the following: (1) both univariate and joint distributions can be derived using the entropy theory. (2) The parametric copula fits the true copula better using empirical marginals than using fitted parametric/entropybased marginals. This suggests that marginals and copula may be identified separately in which the copula is investigated with empirical marginals. (3) For a given set of constraints, the most entropic canonical copula (MECC) is unique and independent of the marginals. This allows the universal solution for the proposed analysis. (4) The MECC successfully models the joint distribution of bivariate random variables. (5) Using the “AND” case return period analysis as an example, the derived MECC captures the change of return period resulting from different marginals.
Introduction
A multitude of processes in water engineering involve more than one random variable. For example, floods are characterized by peak, duration, volume, and interarrival time, which are all random in nature. Droughts are described by their severity, duration, interarrival time, and areal extent, which are also random. Extreme precipitation events are represented by their intensity, amount, duration, and interarrival time, which are all random. Interbasin water transfer involves transfer of excess water from one basin (say, donor) to a water deficient basin (say, recipient). The transfer involves the volume of water, availability of water in both donor and recipient basins, duration of transfer, rate of transfer, and time interval between water transfers which are all random variables. Water quality entails pollutant load, duration for which the load is higher than the protection limits, and peak pollutant concentration, which are all random variables. Likewise, erosion in a basin may be characterized by sediment yield, number of erosion events, duration of events, intensity of events, and time interval between two consecutive events. These are all random variables. Flooding in a coastal watershed may be caused by the simultaneous occurrence of high precipitation and high tides where both precipitation and tide are random variables. Examples of processes involving more than one random variable abound in hydrologic, hydraulic, environmental, and water resources engineering. There usually exists some degree of dependence among the random variables or at least among some of the variables. Often we are concerned with multivariate stochastic modeling and risk analysis of the systems and processes that involve the derivation of probability distributions of the random variables considering the dependence structure among them. Nowadays, these stochastic processes can be modeled with the copula–entropy theory that has proven to be more flexible and accurate than the traditional approaches. The objective of this paper therefore is to reflect on some recent advances made in the application of the copula–entropy theory and future challenges.
Methods
Copula–entropy theory
The copula–entropy theory (CET) is an amalgam of the copula theory and the entropy theory. These two theories are now discussed.
Entropy theory
The entropy theory comprises (1) formulation of entropy, (2) principle of maximum entropy (POME), and (3) theorem of concentration (TOC). Entropy can be defined in the real domain or frequency domain. In the real domain, the most famous form of entropy is the Shannon entropy (Shannon 1948), although Tsallis entropy (Tsallis 1988) and Renyi entropy (Renyi 1951) have been receiving much attention in recent years. Another popular formulation of entropy is the crossentropy or relative entropy due to Kullback and Leibler (1951) which is a generalization of the Shannon entropy. For a continuous random variable X with a probability density function (PDF), f(x), and cumulative probability distribution function (CDF), F(x), the Shannon entropy, H(X) or H[f(x)], can be defined as
where X ∊ [0,∞] but can also vary from − ∞ to + ∞ or from a finite lower limit to a finite upper limit. The Shannon entropy can also be defined in an analogous manner for a discrete variable.
The principle of maximum entropy (POME), propounded by Jaynes (1957), states that of all the distributions that satisfy the given constraints, the distribution yielding the maximum entropy is the leastbiased distribution and should hence be preferred. If there are no constraints then POME says that the resulting distribution would be a uniform distribution, which is consistent with the Laplacian principle of insufficient reason.
The theorem of concentration states that POME yields the best constrained probability distribution and is the preferred method for inferring this distribution, and this distribution best represents our state of knowledge about the behavior of the system. This is a consequence of Shannon’s inequality and the relation between entropy and Chi square statistic.
Copula theory
The foundation of the copula theory is the Sklar theorem (Sklar 1959). The theorem states that the joint (multivariate) probability distribution of two or more random variables is a function of the probability distributions of individual variables (also referred to as marginal distributions which are onedimensional). In other words, the multivariate distribution is coupled to its marginal distributions. It is implied that these random variables are not independent of each other. The copula theory does not specify the way to derive the marginal distributions and does not lead to a unique copula. There are different ways to construct copulas and different ways to select the best copula.
Methodology for application of copula–entropy theory
The copula–entropy theory can be applied in different ways: (1) the marginal distributions are derived using the entropy theory and the joint distribution using the copula theory (e.g., Hao and Singh 2012; Zhang and Singh 2012). Since there can be more than one joint distribution fitted to the multivariate random variables, the best distribution is then selected from either visual goodnessoffit plot (e.g. Q–Q plot) or formal goodnessoffit test statistics (Genest et al. 2009). (2) With the marginal distributions derived using the entropy theory, the best copula is selected as the copula function yielding the maximum entropy. (3) Both marginal and joint distributions are derived using the entropy theory (e.g., Chu 2011; Chen et al. 2013; Aghakouchak 2014). The methodology for application of the copula–entropy theory will depend on the way it is applied. Each of the three ways is now outlined. First, the methodology for application of the entropy theory is outlined, since entropy is needed in all three ways.
Methodology for application of entropy theory
Fundamental to applying the entropy theory is the specification of constraints the derived probability distribution must satisfy. There can be any number of constraints which can be defined in different ways but the easiest way is to define them in terms of moments. Let g(x) be any function of random variable X. Then, the ith constraint, C_{ i }, can be expressed as
where E is the expectation operator. If \( g_{0} \left( x \right) = 1, \) then Eq. (1) will lead to the total probability as
The next step is to maximize entropy given by Eq. (1), subject to Eqs. (2) and (3). Entropy maximizing can be done using the method of Lagrange multipliers where the Lagrange function L can be written as
where \( \lambda_{i} ,i = 0,1, \ldots ,m, \) are the unknown Lagrange multipliers. Applying the Lagrange–Euler calculus of variation, Eq. (4) leads to the maximum entropy distribution:
Now the unknown Lagrange multipliers are determined from the known constraints. The multipliers can be determined in two ways: regular entropy method and parameter space expansion method (Singh 1998; Singh and Rajagopal 1986). Substituting Eq. (5) in Eq. (3), we get
where Z is called the partition function, and
Equation (6) shows that λ_{0} is a function of \( \lambda_{ 1} , \, \lambda_{ 2} , \ldots , \, \lambda_{m} , \) i.e., \( \lambda_{0} = \lambda_{0} \left( {\lambda_{ 1} , \, \lambda_{ 2} , \ldots , \, \lambda_{m} } \right) \), and the function is convex. Differentiating λ_{0} with respect to \( \lambda_{ 1} , \, \lambda_{ 2} , \ldots , \, \lambda_{m} \) individually, we get the relations between Lagrange multipliers. Substituting Eq. (6) into Eq. (7), we obtain
Equation (8) shows that C_{ i }, \( i = 1,{ 2}, \ldots ,m, \) are functions of \( \lambda_{ 1} , \, \lambda_{ 2} , \ldots , \, \lambda_{m} \).
Differentiating Eq. (6) and using Eqs. (2) and (5), the result is as follows:
For obtaining parameters, the derivatives in Eq. (9) are equated to the derivatives obtained from \( \lambda_{0} = \lambda_{0} \left( {\lambda_{ 1} , \, \lambda_{ 2} , \ldots , \, \lambda_{m} } \right) \). Similarly, it can be shown that
and
The maximum entropy, H_{max}, of the derived POMEbased PDF can be expressed as
Equation (12) shows that maximum entropy is a function of Lagrange multipliers and constraints, such that H_{max} is a concave function. Equation (12) also shows that Lagrange multipliers, \( \lambda_{ 1} , \, \lambda_{ 2} , \ldots ,\lambda_{m} , \) are partial derivatives of H_{max} with respect to constraints C_{ i }, \( i = 1,{ 2}, \ldots ,m, \) respectively.
If q_{ i } and p_{ i }, \( i = 1,{ 2}, \ldots ,n, \) are the frequencies computed from POMEbased and given fitted parametric distributions, respectively, for n class intervals, then we have
where χ^{2} is Chi square distributed with s degrees of freedom as
With the Chi square distribution as the limiting distribution, it is shown that 2NΔH is Chi square distributed. Hence, the Chi square statistic may be applied to assess if the fitted parametric distribution is close to the POMEbased distribution (i.e., the reference distribution of random variable).
Methodology for application of copula theory
Definition and main properties for copula
As stated by Sklar (1959), copula couples the multivariate distribution to its marginal distributions which are uniformly distributed on [0,1]. In other words, copula is a mapping function as \( \left[ {0, 1} \right]^{d} \to [0, 1] \). For ddimensional continuous random variables, there is a unique copula function (C) to represent the joint distribution function (H) as
As shown in Eq. (15), \( u_{i} \) is the CDF of random variable X_{ i }. Representing the joint distribution, the copula function has the following properties:

1.
\( 0\; \le \;C(u_{1} , \ldots ,u_{d} )\; \le \;1 \);

2.
if any \( u_{i} = 0 \), then \( C(u_{1} , \ldots ,u_{d} ) = 0 \);

3.
if all \( u_{j} = 1, \, j = 1, \ldots ,d{\text{ and }}j \ne i \); then \( C(1, \ldots ,u_{i} , \ldots ,1) = u_{i} \);

4.
C is bounded by the Fréchet–Hoeffding bounds as
$$ {W \le C \le M;} \, \quad W = \hbox{max} \,\left( {1  d + \sum\limits_{i = 1}^{d} {u_{i} } ,0} \right), \, M = \hbox{min} \,(u_{1} , \ldots ,u_{d} ) $$(16)In Eq. (16), W represents the perfectly negative dependence, while M represents the perfect positive dependence. For independent random variables, the corresponding copula function is simply given as \( \varPi = u_{1} u_{2} \cdot \cdot \cdot u_{d} = F_{1} (x_{1} )F_{2} (x_{2} ) \cdots F_{d} (x_{d} ) \); and

5.
C is dincreasing, that is, the \( C\left( {u_{ 1} , \ldots , u_{d} } \right) \) volume for any given ddimensional interval is nonnegative.
Copula families and parameter estimation
The major copula families are Archimedean copulas, metaelliptical copulas, extreme value copulas, vine copulas, and entropic copulas. The Archimedean copula (2dimensional) is symmetric and easy to construct through the generating function as
where ϕ is the generating function which is nonincreasing. Based on the choice of Archimedean copulas, different copulas within the family may cover different ranges of dependence (Nelsen 2006). For example, the Gumbel–Hougaard copula may only model the positive dependence, while Frank copula may model the entire range of dependence structure. Given its easy construction, the Archimedean copulas have been extensively applied in bivariate hydrological frequency analysis (e.g., Sraj et al. 2015; Salvadori and Michele 2015; Requena et al. 2016a, b).
Metaelliptical copulas (Fang et al. 2002), as the name suggests, is derived from the elliptical joint distribution. The popularly applied metaelliptical copulas are metaGaussian and metaStudent t copulas. Unlike the Archimedean copulas, the metaelliptical copulas can model the entire range of dependence structure and can be easily applied to highdimensional multivariate modeling. Comparing the two popularly applied metaelliptical copulas, there exists the symmetric tail dependence for metaStudent t copula, while no tail dependence exists for metaGaussian copula (e.g. Genest et al. 2007; Song and Singh 2010).
The extreme value copula is derived in accordance with the extreme value theory which may be applied to model the rare events. As stated by Gudendorf and Segers (2009) and Joe (2014), the following relation exists:
In Eq. (18), C denotes the extreme value copula, and C_{ F } denotes that the copula fulfills the limiting relation.
In other words, the extreme value copula must be maxstable. For the bivariate case, the extreme value copula may be written as
In Eq. (19), A denotes the Pickands dependence function (Pickands 1981; Falk and Reiss 2005) that is convex as \( {A:}\;[0,1] \to [1/2, 1 ] {\text{ and max}}\,(t, \, 1  t) \le A(t) \le 1{\text{ for }}t \in [0, 1 ] \).
The Gumbel–Hougaard copula (Archimedean copula family) is the only Archimedean copula that belongs to the extreme value family. Hence, the Gumbel–Hougaard copula has been popularly applied in bivariate flood frequency analysis, storm analysis, drought analysis, etc.
Vine copula is constructed, based on the probability density decomposition. The vine copula is applied for highdimensional analysis (i.e., d ≥ 3). It is usually categorized into Canonical (C)Vine copula, DVine copula, and Regular RVine copula (Aas et al. 2007). Using 3dimensional analysis as an example, we can write the joint probability density function as
In Eq. (20), c denotes the copula density function. As seen in Eq. (20), the vine copula is very flexible, since the bivariate copula is applied at all the levels. The vine copula has also been applied in highdimensional hydrological frequency analysis (e.g., Pham et al. 2016; Arya and Zhang 2017; Verneiuwe et al. 2015)
The parameters of the parametric copula functions constructed above may be estimated with one of the following three approaches:

(i)
FullMaximum Likelihood Estimation (FullMLE): In this method, the parameters of the marginal distributions and copula functions are estimated simultaneously.

(ii)
TwoStage Maximum Likelihood Estimation (TwoStage MLE): In this method, one first estimates the parameters of marginal distributions and then the parameters of the copula function are estimated using MLE with the marginals computed from the previously fitted marginal distributions.

(iii)
SemiParametric (or Pseudo) Maximum Likelihood Estimation (PseudoMLE): In this method, the parameters of the copula function are estimated from the empirical marginals (i.e., empirical CDF computed from the plotting position formula or kernel density function).
Of the three estimation methods for parametric copula functions, the PseudoMLE is considered least impacted by the possible misidentification of marginal distributions. The advantage of PseudoMLE is the separate parameter estimation of marginal distributions and the copula function.
The most entropic canonical copula may be derived using the entropy theory, similar to the application of entropy theory to the univariate random variables. The Shannon entropy of the copula function for two variables is written as
and the joint density function is given through the copula function as
Substituting Eq. (22) into Eq. (21), one may conclude, with some simple algebra, that the negative copula entropy [i.e., Eq. (21)] denotes the mutual information of random variables X and Y through the Kullback–Leibler crossentropy as
According to the information theory, the mutual information [i.e., \( I({X{;}Y}) \)] is a measure of the total correlation between random variables, that is, the mutual dependence between random variables X and Y. From the copula theory [e.g., Eq. (22) for bivariate random variables], the copula density [i.e., \( c(u,v) \)] also denotes the mutual dependence between variables X and Y. Thus, the information maintained in the copula function is the mutual information (i.e., total correlation) between X and Y which results in the copula entropy being negative. In other words, a higher absolute value of the copula entropy represents higher mutual dependence (or total correlation) among the random variables.
Similar to the POMEbased univariate distribution, the common constraints are the constraints of total probability of marginals (i.e., for uniform distributed variable on [0,1]), and a measure of dependence (also called association):
Applying \( f(x) = \iint {f(x,y)\,{\text{d}}y} \), we can evaluate Eq. (25a) as
In Eq. (25b), \( f(u) = 1 \) since u ~ uniform (0,1). Similarly,
In Eq. (27), Spearman’s rho is commonly applied as the constraint to measure the dependence with \( a_{j} (u,v) = uv \Rightarrow E(uv) = \frac{{\rho_{s} + 3}}{12} \). One can also apply other dependence measures discussed in Nelsen (2006) and Chu (2011).
Using the constraints [Eqs. (24)–(27)], the Lagrange function for the most entropic canonical copula (MECC) can be written as
In Eq. (28), \( \lambda_{0} , \ldots ,\lambda_{n} ,\gamma_{1} , \ldots ,\gamma_{n} ,\lambda_{n + 1} , \ldots ,\lambda_{n + k} \) are the Lagrange multipliers. More specifically for MECC, \( \lambda_{r} = \gamma_{r} , \, r = 1, \ldots ,n \). The Lagrange multipliers \( \lambda_{n + 1} , \ldots ,\lambda_{n + k} \) are pertaining to the constraints in relation to the rankbased dependence measures.
Differentiating Eq. (28) with respect to \( c(u,v) \), we have
Based on the principle of maximum entropy, maximizing Eq. (21) is equivalent to minimizing the objective function
In Eq. (30), \( \varLambda = \left[ {\lambda_{1} , \ldots ,\lambda_{n} ,\gamma_{1} , \ldots ,\gamma_{n} ,\lambda_{n + 1} , \ldots ,\lambda_{n + k} } \right] \).
The most entropic canonical copula (MECC) may be generalized to most entropic copula (MEC) with respect to a given parametric copula (Chu 2011). In the case of MEC, Eqs. (29)–(30) can be rewritten as
In Eq. (31), b is a generic constant, \( \tilde{c}(u,v) \) is the given reference copula. It is seen that the MECC is obtained by setting b = 0. In what follows, we will focus on the application of MECC for bivariate cases through examples.
Copula–entropy for multivariate modeling
Following the discussion of Shannon entropy and copula theory in the previous sections, we will outline the copula–entropy theory for stochastic modeling in this section. In general, we can apply the copula–entropy theory in three ways:

(i)
The marginal distributions are derived using the entropy theory, while the joint distribution (i.e., copula function) is modeled through the parametric copula function with its parameter estimated using the FullMLE, TwoStage MLE, or PseudoMLE. In this approach, the goodnessoffit of the copula function may be assessed either graphically through the K–K plot or statistically with the formal goodnessoffit test statistics (Genest et al. 2009).

(ii)
The difference of this second approach from (i) above is that the parametric copula function is selected such that it yields the maximum entropy among all copula candidates.

(iii)
The approach (iii) takes full advantage of the entropy theory. Both marginal and joint distributions are derived using the entropy theory. The Lagrange multipliers are estimated by maximizing entropy or minimizing the corresponding objective function which is the dual problem of maximizing entropy. The Lagrange multipliers of the MECC (joint distribution) may be optimized from the fitted POMEbased marginal distributions or from the empirical marginal distribution. The approach (iii) is further adopted for the applications.
Application to multivariate data of known population
Here, we will first show the application of copula–entropy theory to the bivariate sample dataset with the known true population. In this sample study, the sample dataset (N = 1000) is generated from the known Gumbel–Hougaard copula (θ = 4.5) with the true marginal distributions:
Study of univariate variates
In Singh (1998), it was shown that \( E[X], \;{\text{and}} \;E[\ln (X)] \) should be applied as constraints to derive the POMEbased gamma distribution; while \( E\left[ {\ln \left( x \right)} \right] \;{\text{and}}\; E\left[ {\left( {\ln x} \right)^{2} } \right] \) are the constraints to derive the POMEbased lognormal distribution. Following Singh (1998), we have the following:
Gamma distribution
The POMEbased gamma distribution may be written as
The relation of Lagrange multipliers to the parameters of gamma distribution (Singh 1998) is given as
Lognormal distribution
The POMEbased lognormal distribution may be written as
In Eq. (33d), y = ln (x) and s ^{2}_{ y } represents the sample variance of y.
Using the bivariate data sampled from the true population, Table 1 lists the Lagrange parameters that are estimated for the univariate variables based on both sample moments and population moments.
Besides applying the constraints directly related to the parametric distribution that may be fitted to the observed dataset, one may also directly apply the first three or four monocentral moments [i.e., \( E(X), \, E(X^{2} ), \, E(X^{3} ), \, E(X^{4} ) \)], given that the moments about the origin govern the shape and mode of the univariate probability density functions (Zellner and Highfield 1988; Cobb et al. 1983). The POMEbased distribution so derived is given as
The objective function is written as
To avoid the possible integration problem, the univariate variable is commonly scaled to [0,1] or [− 1,1] (Hao and Singh 2012; Zhang and Singh 2014). In this study, the univariate variables are scaled to [0,1] to assess its appropriateness. The scaled variable x_{ s } is given as
In Eq. (36), d is a small number such that the scaled variable will not reach either the lower limit or the upper limit, i.e., avoiding P(X ≤ max (x)) = 1. Here, we chose d = 0.01. Equations (34)–(35) may then be reorganized with the use of the scaled variable as
Now applying the first four moments about origin to the scaled variable X and the first three moments about origin to the scaled variable Y, Table 2 lists the parameters estimated by optimizing the objective function using the first four moments about origin [i.e., Eq. (38)].
The POMEbased distributions for the original variables are expressed as
Furthermore, Fig. 1 plots the relative frequency and the frequency computed from the POMEbased distributions. As shown in Fig. 1, the POMEbased univariate distributions derived (using the constraints pertaining to certain population, and first four moments about the origin) visually fit the observed data very well. Using the true population from the reference distribution, Table 3 lists the Chi square test for the fitted parametric and POMEbased distributions constructed. Results in Table 3 clearly indicate the POMEderived distributions may be applied to model the univariate variables. Thus, it is safe to conclude that one may directly use the moments about origin as the constraints to model the univariate random variables.
Study of dependence
As previously discussed, one may apply three different approaches to study the dependence using the copula–entropy theory. Hereafter, each approach is evaluated. Within the objective of the study, the Gumbel–Hougaard, Clayton, Frank and metat copulas (Nelsen 2006) were applied as parametric copulas. The MECC copula was derived with the constraints of \( E\left( U \right),E\left( {U^{2} } \right),E\left( V \right),E\left( {V^{2} } \right) \) and \( E\left( {UV} \right) \). According to the discussion in “Univariate analysis of peak discharge and flood volume” section for univariate analysis, we will simply apply the POMEbased distribution derived using the moments about the origin with the use of scaled variables.
POMEbased marginals with parametric copulas
In this approach, the copula parameters were estimated with the use of POMEbased marginals and by maximizing the loglikelihood function (it may be also called TwoStage MLE). Table 4 lists the estimated parameters as well as the corresponding loglikelihood. In this approach, the copula yielding the largest loglikelihood was selected for further analysis. As seen in Table 4, the Gumbel–Hougaard copula was the best candidate. It was in agreement with the sample data actually generated from the Gumbel–Hougaard copula discussed earlier in the section.
POMEbased marginals with parametric copulas selected based on the entropy
In this approach, the parameters of copulas again were estimated by TwoStage MLE. The difference is that the copula–entropy was computed for the fitted parametric copula. Here, the copula–entropy was estimated using
The computed entropy is also listed in Table 4. From the computed entropy using Eq. (40), it is seen that the Gumbel–Hougaard copula yielded the highest mutual information (the absolute value of the copula entropy) among all the copula candidates.
Parametric copulas estimated using PseudoMLE
In this approach, the parameters of the copula were directly estimated using the empirical distribution (e.g., empirical distribution using the Weibull plotting position formula) which is listed in Table 4. It is seen that with the PseudoMLE, the Gumbel–Hougaard copula again yielded the largest MLE and the highest mutual information.
Most entropic canonical copula with POMEbased marginals (or empirical marginals)
In this approach, copulas were derived from the entropy theory with the constraints of \( E(U) = E(V) = \frac{1}{2}, \)\( \, E(U^{2} ) = E(V^{2} ) = \frac{1}{3}; \, \quad \rho_{\text{spearman}} = 0.9287 \) (sample Spearman’s rho) with the parameters listed in Table 5. In Eqs. (24)–(31), it is seen that the dependence structure (i.e., Spearman’s rho in this sample study) was the controlling factor to optimize the objective function of MECC. Thus, it did not matter how the marginals were handled, the MECC would not change for given Spearman’s rho which is shown in Table 5. To further evaluate the impact of Spearman’s rho correlation coefficient, we changed Spearman’s rho to population Spearman’s rho as the constraint (i.e., ρ = 0.9298 from the true Gumbel–Hougaard copula). As seen in Table 5, there was a significant difference in the Lagrange multipliers estimated for MECC.
To assess whether the MECC so derived fulfilled the fundamental properties of copula function:
Figure 2 compares the marginal variables computed using Eq. (41) from the MECC with both empirical and POMEbased univariate marginals. As seen in Fig. 2, C(u,1) and C(1,v) were in good agreement with their empirical and POMEbased univariate marginals. This also implied the appropriateness of POMEbased univariate marginals derived using the first four and three noncentral moments for random variables X and Y, respectively.
With the fundamental properties fulfilled with the use of MECC derived, one may want to further evaluate the goodnessoffit of the derived MECC. Here, the Kendall distribution plots were generated and compared to the Kendall distribution of the underlying true Gumbel–Hougaard copula:
The Kendall distribution [\( K_{\varvec{\lambda}} (t) \)] for MECC may be approximated following the procedure discussed in Genest et al. (2009) as follows:

1.
Generate random variables [U_{1},U_{2}] with sample N from the MECC derived, where N is greater than the sample size of the observed dataset.

2.
Approximate [\( K_{\varvec{\lambda}} (t) \)] using:
Comparisons shown in Fig. 3 conclude that (i) both fitted Gumbel–Hougaard (GH) copula and MECC derived may properly represent the true GH (θ = 4.5) population through the comparison of the Kendall distribution plots; (ii) visually, there is minimal difference of GH fitted with empirical marginals (b) and the MECC derived based on sample or population Spearman’s rho (c and d); (iii) the reason for the minimal difference is due to the rankbased empirical distribution does not impose any external bias on parameter estimation for the parametric copulas; and the MECC derived here does not rely on the actual marginal values, but the population moments about origin for the uniformly distributed variables; and (iv) though the POMEbased marginals well represent the univariate random variables, they do introduce external bias to the estimation of parametric copulas (a).
Overall, from the bivariate analysis of sample data, MECC may be directly applied to model the dependence structure of the random variables. In the case of the MECC application, the impact of the marginal distributions is eliminated. In the next section, we will use the real watershed data as a case study to further illustrate the copula–entropy theory as well as risk analysis.
Case study with real watershed data
Collected from Flume 1 at Walnut Gulch Watershed in Arizona, the annual maximum flood data [i.e., peak discharge (Q) and flood volume (V)] from 1957 to 2012 were considered for the case study. Based on the findings from analysis of sample data, the case study proceeded as follows: (i) the POMEbased univariate distribution was applied to model the univariate peak discharge and flood volume; and (ii) the MECC was applied to model the dependence between peak discharge and flood volume.
Univariate analysis of peak discharge and flood volume
As discussed in “Univariate analysis of peak discharge and flood volume” section, the moments about origin for the scaled variables were considered as constraints to capture the shape and mode of the univariate flood variables. Choosing d = 0.1 in Eq. (36), Table 6 lists the sample statistics for the scaled variables. In Table 6, T and P denote the test statistic and the corresponding P value to evaluate whether kurtosis was significantly different from 3 using
In Eqs. (44a)–(44c), γ_{2} and γ ^{ex}_{2} denote the sample kurtosis and excessive kurtosis; n is the sample size; SEK is the standard error of kurtosis; and T is the test statistic with the underlying distribution of standard normal distribution.
Results in Table 6 indicate that the first three moments about origin were necessary to derive the POMEbased distribution for the scaled peak discharge and flood volume variables. The Lagrange multipliers were optimized and listed in Table 7. Figure 4 compares the POMEbased probability density to the histogram, as well as the POMEbased CDF to the empirical CDF. Comparisons confirmed the appropriateness of the POMEbased univariate distribution.
Bivariate flood frequency analysis with MECC
Let U and V represent the univariate marginals for peak discharge and flood volume, the same constraints to construct MECC for sample data [i.e., \( E\left( U \right),E\left( {U^{2} } \right),E\left( V \right),E\left( {V^{2} } \right), E(UV) \)] were applied to model the dependence of peak discharge and flood volume. The Lagrange multipliers were optimized by minimizing the objective function of Eq. (31a) with b = 0.
With the observed data, sample Spearman’s rho was computed as \( \hat{\rho }_{\text{spearman}} = 0.9419 \), we approximated E(UV) from sample Spearman’s rho as \( E(UV) = \frac{{\rho_{\text{spearman}} + 3}}{12} \approx 0.3285 \). With these constraints, the copula density function for the MECC was obtained to model bivariate flood frequency as
Using Eq. (45), Fig. 5 compares (a) the \( C\left( {u,1} \right), C(1,v) \) to the corresponding empirical and POMEbased marginals, and (b) the approximated parametric Kendall distribution for MECC to the empirical Kendall distribution. Comparisons in Fig. 5 indicated that (a) the MECC constructed successfully fulfilled the copula properties of \( C\left( {u,1} \right) = u, C\left( {1,v} \right) = v \); and (b) there was a good agreement between the empirical and parametric (i.e., MECC) Kendall distributions, which indicated the appropriateness of the MECC constructed. Applying the POMEbased univariate distribution, Fig. 6 plots the simulated random variates versus the observed random variables. Figure 6 shows the dependence structure was well preserved with the application of MECC and POMEbased marginals. To further compare the MECC with the empirical copula, Fig. 7 compares the copula and the survival copula with different contour levels. The plot on the left is for the copula function, while the plot on the right is for the survival copula. As shown in Fig. 7, there is good agreement between the contours obtained from the empirical copula (and its survival copula) and the MECC (and its survival copula) for different probability levels. This finding further assured the appropriateness of the MECC.
Risk analysis
With the assurance to apply the MECC to model the dependence of annual flood sequences (i.e., peak discharge and flood volume), one may proceed to study the associated risk measure, which may be used as an engineering design criterion. In hydrology and water engineering, risk has commonly been assessed through the return period. In what follows, the joint return period of “AND” case was applied for risk analysis. The joint return period of “AND” case is given as
To assess the return period “AND” case, the peak discharge and flood volume with univariate CDF of P = 0.8, 0.9, 0.96, and 0.98 were used. Given the limitation of the sample size (n = 56), P = 0.99 was not chosen for the study for comparison purposes. Table 8 lists the joint return period estimated from both empirical copula and MECC. Results in Table 8 indicated the following:

(i)
There was a small difference between the joint CDFs computed from empirical copula and the MECC. The absolute relative difference was in the range from 0.96% for C(0.8,0.8) to 2.17% for C(0.8,0.9). Thus, in regard to the joint CDF, the differences were insignificant.

(ii)
Though the difference with joint CDF estimated may not be significant, it resulted in larger differences in regard to the “AND” case return period. It is seen that with the increased marginal probability, the discrepancy also increased between the T_{and} estimated from empirical copula and the MECC.

(iii)
There was an interesting finding which was in agreement with T_{and} estimated from empirical copula and MECC. Using volume = 6.44 × 10^{5} m^{3} corresponding to P = 0.96 as an example, the joint return period computed from smaller peak discharge (e.g., Q = 73.2 cms corresponding to P = 0.8) was less than that computed with larger peak discharge (e.g., Q = 109.9 cms). This was true in reality, since it was more likely for (Q ≥ 109.9 cms and V ≥ 6.44 × 10^{5} m^{3}) to occur simultaneously compared to that for (Q ≥ 73.2 cms and V ≥ 6.44 × 10^{5} m^{3}). This finding was also in the agreement that higher discharge was most likely associated higher flood volume. This scenario also happened for large flood volume with relatively low peak discharge and vice versa.
Discussion and conclusions
In this study, we investigate the copula–entropy theory in bivariate analysis. Using the sample data with the known univariate populations (i.e., gamma and lognormal) and known dependence (Gumbel–Hougaard), it is concluded that the POMEbased distribution derived may model the univariate distribution well. There is minimal difference for POMEbased distribution based on the moment of the observed variable and that derived based on the scaled variable (i.e., scaling the observed variable to [0,1]). To avoid the improper integrals, the scaled variable is suggested to derive the POMEbased distribution. Comparing to the true Gumbel–Hougaard copula, the MECC derived using the constraints of E(U), E(U^{2}), E(V), E(V^{2}), and E(UV) can properly model the dependence structure of the sample data. The MECC constructed successfully fulfills the fundamental properties of the copula, i.e., C(u,1) = u; C(1,v) = v. In addition, the derived MECC can well present the true dependence structure represented with the Gumbel–Hougaard copula.
Using the real watershed data (i.e., Flume 1 at Walnut Gulch, Arizona), the case study shows the appropriateness of POMEunivariate distribution of scaled variable to model the univariate distribution for the observed variates. With the constraints E(U), E(U^{2}), E(V), and E(V^{2}) converging to the population moments of the uniform distributed variables as \( E(U^{i} ) = E(V^{i} ) = {1 \mathord{\left/ {\vphantom {1 {(i + 1)}}} \right. \kern0pt} {(i + 1)}} \); the MECC constructed only depends on the rankbased dependence measure (in this case, Spearman’s rho). The derived MECC properly models the dependence of annual peak discharge and flood volume, which is independent of the marginal distributions (nonparametric or parametric). The evaluation of the flood risk (using “AND” case return period) indicates that the MECC copula reasonably represents the change of the return period of “AND” case.
Overall, the study concludes as follows:

(i)
For the bivariate random variables investigated, the MECC may be easily and efficiently applied to model the dependence structure. Unlike other copulas, the MECC is uniquely defined for a given set of constraints. Its uniqueness allows one universal solution for the proposed frequency analysis.

(ii)
Similar to other copula families (e.g., Archimedean copulas, metaelliptical copulas, vine copulas, etc.), the MECC may be applied for multivariate analysis in hydrology and water engineering, including multivariate rainfall analysis, multivariate drought analysis, spatial analysis of drainage networks, and spatial analysis of water quality as few examples.

(iii)
The bivariate MECC copula may be easily extended to higher dimensions. For example, for the ddimensional variables \( \left[ {X_{ 1} ,X_{ 2} , \ldots ,X_{d} } \right] \) with the marginals of \( U_{i} = F_{i} (X_{i} ),i = 1,2, \ldots ,d \); the MECC may be constructed using the set of constraints, i.e., marginal \( E(U_{i}^{r} ) = {1 \mathord{\left/ {\vphantom {1 {(r + 1)}}} \right. \kern0pt} {(r + 1)}},\quad \, i = 1,2, \ldots ,d \) and pairwise \( E(U_{i} U_{j} );\quad \, i,j \in [1,d],i \ne j \) estimated from rankbased Spearman’s coefficient of correlation. The same optimization procedure applied for the bivariate case may be applied to construct the MECC for dependence structure in higher dimensions.
References
Aas K, Czado C, Frigessi A, Bakken H (2007) Paircopula constructions of multiple dependence. Insur Math Econ. https://doi.org/10.1016/j.insmatheco.2007.02.001
Aghakouchak A (2014) Entropy–copula in hydrology and climatology. J Hydrometeorol 15:2176–2189. https://doi.org/10.1175/jhmd130207.1
Arya FK, Zhang L (2017) Copulabased Markov process for forecasting and analyzing risk of water quality time series. J Hydrol Eng 22(6):04017005. https://doi.org/10.1061/(asce)he.19435584.00001494
Chen L, Singh VP, Guo S (2013) Measure of correlation between river flows using entropy–copula theory. J Hydrol Eng 18(12):1591–1608. https://doi.org/10.1061/(asce)he.19435584.0000714
Chu B (2011) Recovering copulas from limited information and an application to asset allocation. J Bank Finance 35:1824–1842. https://doi.org/10.1016/j.jbankfin.2010.12.011
Cobb L, Koppstein P, Chen NH (1983) Estimation and moment recursion relations for multimodal distributions of the exponential family. J Am Stat Assoc 78:124–130
Falk M, Reiss RD (2005) On pickands coordinates in arbitrary dimensions. J Multivar Anal 92:426–453
Fang HB, Fang KT, Kotz S (2002) The metaelliptical distributions with given marginals. J Multivar Anal 82:1–16
Genest C, Favre AC, Béliveau J, Jacques C (2007) Metaelliptical copulas and their use in frequency analysis of multivariate hydrological data. Water Resour Res 43:W09401. https://doi.org/10.1029/2006wr005275
Genest C, Remillard B, Beaudoin D (2009) Goodnessoffit tests for copulas: a review and a power study. Insur Math Econ 44(2):199–213. https://doi.org/10.1016/j.insmatheco.2007.10.005
Gudendorf G, Segers J (2009) Extremevalue copulas. arXiv:0911.1015v2
Hao Z, Singh VP (2012) Entropy–copula method for singlesite monthly streamflow simulation. Water Resour Res 48:W06604. https://doi.org/10.1029/wr011419
Jaynes ET (1957) Information theory and statistical mechanics. Phys Rev 106:620–630
Joe H (2014) Dependence modeling with copulas. CRC Press, Boca Raton
Kullback S, Leibler RA (1951) On information and sufficiency. Ann Math Stat 22:79–86
Nelsen RB (2006) An introduction to copulas, 2nd edn. Springer Science + Business Media, Inc., Berlin
Pham MT, Vernieuwe H, Baets BD, Willems P, Verhoest NEC (2016) Stochastic simulation of precipitationconsistent daily reference evapotranspiration using vine copulas. Stoch Environ Res Risk Assess 30:2197–2214. https://doi.org/10.1007/s0047701511817
Pickands J (1981) Multivariate extreme value distribution. Bull Int Stat Inst 49:859–878
Renyi A (1951) On measure of entropy and information. In: Proceedings, 4th Berkeley symposium, mathematics, statistics, and probability, Berkeley, California, pp 547–561
Requena AI, Chebana F, Mediero L (2016a) A complete procedure for multivariate indexflood model application. J Hydrol 535:559–580. https://doi.org/10.1016/j.jhydrol.2016.02.004
Requena AI, Flores I, Mediero L, Garrote L (2016b) Extension of observed flood series by combining a distributed hydrometeorological model and a copulabased model. Stoch Environ Res Risk Assess 30:1363–1378. https://doi.org/10.1007/2004770151138x
Salvadori G, Michele CD (2015) Multivariate realtime assessment of droughts via copulabased multisite hazard trajectories and fans. J Hydrol 526:101–115. https://doi.org/10.1016/j.jhydrol.2014.11.056
Shannon CE (1948) A mathematical theory of communication. Bell Syst Technol J 27:379–423
Singh VP (1998) Entropybased parameter estimation in hydrology. Springer, Dordrecht
Singh VP, Rajagopal AK (1986) A new method of parameter estimation for hydrologic frequency analysis. Hydrol Sci Technol 2(3):33–40
Sklar M (1959) Fonctions de repartition an dimensions et leurs marges. Universite Paris, Paris, p 8
Song S, Singh VP (2010) Metaelliptical copulas for drought frequency analysis of periodic hydrologic data. Stoch Environ Res Risk Assess 24(3):425–444. https://doi.org/10.1007/s0047700903311
Sraj M, Bezak N, Brilly M (2015) Bivariate flood frequency analysis using the copula function: a case study of the Litija station on the Sava River. Hydrol Process 29:225–238. https://doi.org/10.1002/hyp.10145
Tsallis C (1988) Possible generalizations of Boltzmann–Gibbs statistics. J Stat Phys 52(1/2):479–487
Verneiuwe H, Vandenberghe S, Baets BD, Verhoest NEC (2015) A continuous rainfall model based on vine copulas. Hydrol Earth Syst Sci 19:2685–2699. https://doi.org/10.5194/hess1926852015
Zellner A, Highfield RA (1988) Calculation of maximum entropy distribution and approximation of marginal posterior distributions. J Econom 37:95–209
Zhang L, Singh VP (2012) Bivariate rainfall and runoff analysis using entropy and copula theories. Entropy 14:1784–1812. https://doi.org/10.3390/e14091784
Zhang L, Singh VP (2014) Joint conditional probability distributions of runoff depth and peak discharge using entropy theory. J Hydrol Eng 19(6):1150–1159
Authors’ contributions
VPS conceptualized the paper, helped with data interpretation and crafting the manuscript. LZ did analysis, processed the data, constructed all the graphs and wrote the first draft. Both authors read and approved the final manuscript.
Acknowledgements
No applicable.
Competing interests
The authors declare that they have no competing interests.
Availability of data and materials
The data used is in public domain and is available for anyone to use.
Consent for publication
We consent for publication.
Ethics approval and consent to participate
Not applicable.
Funding
No funding source is available.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Singh, V.P., Zhang, L. Copula–entropy theory for multivariate stochastic modeling in water engineering. Geosci. Lett. 5, 6 (2018). https://doi.org/10.1186/s405620180105z
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/s405620180105z