Statistical Hydrology

Applying probability theory to hydrologic events to predict return periods, risk, and frequencies.

Introduction

Hydrologic events (floods, droughts, storms) are stochastic (random) in nature. Statistical Hydrology uses probability theory to analyze historical data and predict the likelihood of future extreme events.

Return Period ( $T$ )

Return Period (Recurrence Interval)

The average time interval between events equal to or exceeding a certain magnitude ( $x_T$ ).

Return Period vs. Probability

$$T = \frac{1}{P}$$

Exceedance Probability ( $P$ )

The probability that an event of magnitude $x_T$ or greater will occur in any given year. For example:

100-year flood: $T = 100$ , so $P = 1/100 = 0.01$ (1% chance of occurring in any single year).
50-year flood: $T = 50$ , so $P = 0.02$ (2% chance).

Risk ( $R$ )

The probability that an event with return period $T$ will occur at least once in a project life of $n$ years.

Risk Equation

$$R = 1 - (1 - P)^n = 1 - \left(1 - \frac{1}{T}\right)^n$$

Reliability

The probability that the event will not occur in $n$ years.

Reliability Equation

$$\text{Reliability} = 1 - R = (1 - P)^n$$
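
As a quick numerical check of the risk and reliability equations, the following is a minimal Python sketch. The function names `risk` and `reliability` are illustrative choices made here and do not come from any library.

```python
def risk(T: float, n: int) -> float:
    """Probability that the T-year event occurs at least once in n years."""
    if T < 1:
        raise ValueError("return period T must be at least 1 year")
    P = 1.0 / T                      # annual exceedance probability
    return 1.0 - (1.0 - P) ** n


def reliability(T: float, n: int) -> float:
    """Probability that the T-year event does not occur in n years."""
    return 1.0 - risk(T, n)


if __name__ == "__main__":
    # 100-year flood over a 50-year project life
    print(f"risk        = {risk(100, 50):.3f}")         # about 0.395
    print(f"reliability = {reliability(100, 50):.3f}")  # about 0.605
```

The example shows that a structure designed for the 100-year flood still has roughly a 40% chance of seeing that flood at least once in a 50-year life.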

Frequency Analysis

Used to relate the magnitude of extreme events to their frequency of occurrence using probability distributions.

General Frequency Equation

$$x_T = \bar{x} + K \cdot \sigma$$

Variables

$x_T$ : Value of variate with return period $T$ (e.g., peak discharge).
$\bar{x}$ : Mean of the data series.
$\sigma$ : Standard deviation of the data series.
$K$ : Frequency factor (depends on the probability distribution and $T$ ).

Gumbel's Extreme Value Distribution (Type I)

Commonly used for flood frequency analysis.

Gumbel's Frequency Factor

$$K = \frac{y_T - \bar{y}_n}{S_n}$$

Reduced Variate ( $y_{T}$ )

$$y_T = -\ln\left[-\ln\left(1 - \frac{1}{T}\right)\right]$$

Note

Where $\bar{y}_n$ and $S_n$ are reduced mean and standard deviation, which depend only on sample size $N$ .
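
The Gumbel calculation can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the function names are hypothetical, and the defaults for `y_n` and `S_n` are the limiting values for a very long record (Euler's constant, about 0.5772, and $\pi/\sqrt{6}$, about 1.2825). For a real record, the tabulated values for the actual sample size $N$ should be passed in instead.

```python
import math
import statistics

# Assumption: limiting values as N -> infinity. Use tabulated values
# of y_n and S_n for the actual sample size in practice.
LIMIT_Y_N = 0.5772156649
LIMIT_S_N = math.pi / math.sqrt(6)


def reduced_variate(T: float) -> float:
    """Gumbel reduced variate y_T for return period T (T > 1)."""
    if T <= 1:
        raise ValueError("T must be greater than 1")
    return -math.log(-math.log(1.0 - 1.0 / T))


def gumbel_frequency_factor(T: float, y_n: float = LIMIT_Y_N,
                            S_n: float = LIMIT_S_N) -> float:
    """Frequency factor K = (y_T - y_n) / S_n."""
    return (reduced_variate(T) - y_n) / S_n


def gumbel_quantile(peaks, T: float, y_n: float = LIMIT_Y_N,
                    S_n: float = LIMIT_S_N) -> float:
    """Estimate x_T = mean + K * sigma from an annual peak series."""
    mean = statistics.mean(peaks)
    sigma = statistics.stdev(peaks)   # sample standard deviation (N - 1)
    return mean + gumbel_frequency_factor(T, y_n, S_n) * sigma
```

With the limiting values, the 100-year frequency factor comes out near 3.14. For short records the tabulated $\bar{y}_n$ and $S_n$ are smaller, which gives a larger $K$ for the same $T$.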

Log-Pearson Type III Distribution

The standard method for flood frequency analysis in the United States (USGS Bulletin 17B/17C). It applies the general frequency equation to the logarithms of the discharge values ( $y = \log x$ ).

Log-Pearson III Equation

$$\log x_T = \overline{\log x} + K_z \cdot \sigma_{\log x}$$

Note

Where $K_z$ is a function of the return period $T$ and the skewness coefficient ( $C_s$ ) of the log-transformed data.
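
A rough Python sketch of the Log-Pearson III calculation is given below. Note the substitution: instead of the published frequency factor tables, it uses the Wilson-Hilferty approximation for $K_z$, which is reasonable for moderate skew but is not the Bulletin 17B/17C procedure (which also treats outliers and weighted skew). All function names are hypothetical.

```python
import math
import statistics
from statistics import NormalDist


def standard_normal_deviate(T: float) -> float:
    """Standard normal deviate z with exceedance probability 1/T."""
    return NormalDist().inv_cdf(1.0 - 1.0 / T)


def lp3_frequency_factor(T: float, Cs: float) -> float:
    """Wilson-Hilferty approximation to K_z (an assumption, not the tables)."""
    z = standard_normal_deviate(T)
    if abs(Cs) < 1e-6:
        return z                      # zero skew reduces to the normal deviate
    k = Cs / 6.0
    return (2.0 / Cs) * ((1.0 + k * z - k * k) ** 3 - 1.0)


def sample_skew(values) -> float:
    """Sample skewness coefficient with the usual small-sample adjustment."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    return n * sum((v - mean) ** 3 for v in values) / ((n - 1) * (n - 2) * s ** 3)


def lp3_quantile(flows, T: float) -> float:
    """Estimate x_T by applying the frequency equation to log10 of the flows."""
    logs = [math.log10(q) for q in flows]
    K = lp3_frequency_factor(T, sample_skew(logs))
    return 10.0 ** (statistics.mean(logs) + K * statistics.stdev(logs))
```

For example, `lp3_frequency_factor(100, 1.0)` evaluates to about 3.03, close to the tabulated value for that skew and return period.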

Log-Normal Distribution

A special case of the Log-Pearson Type III distribution where the skewness coefficient of the logarithmic data is exactly zero ( $C_s = 0$ ).

Log-Normal Equation

$$y_T = \bar{y} + K_z \cdot S_y$$

Note

Where $y = \ln x$ , $\bar{y}$ is the mean of the logarithms, $S_y$ is the standard deviation of the logarithms, and $K_z$ is the standard normal deviate corresponding to return period $T$ (derived from normal probability tables). Finally, $x_T = e^{y_T}$ .
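
The log-normal steps can be sketched in Python as below. The function name is hypothetical, and the standard normal deviate is taken from the standard library's `NormalDist` rather than from printed probability tables.

```python
import math
from statistics import NormalDist


def lognormal_quantile(y_mean: float, S_y: float, T: float) -> float:
    """Estimate x_T from the mean and standard deviation of ln(x)."""
    K_z = NormalDist().inv_cdf(1.0 - 1.0 / T)  # standard normal deviate for T
    y_T = y_mean + K_z * S_y                   # quantile in log space
    return math.exp(y_T)                       # back-transform to x_T
```

For $T = 2$ the deviate is zero, so the estimate reduces to $e^{\bar{y}}$, the median of the fitted distribution.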

Plotting Positions

To graphically plot a probability distribution from empirical data, the data points (e.g., annual peak floods) must be ranked in descending order ( $m = 1$ is the largest event). An empirical exceedance probability ( $P$ ) is then assigned to each rank using a plotting position formula.

Weibull Plotting Position

$$P = \frac{m}{N + 1}$$

Note

The Weibull formula is the most widely used, where $N$ is the total number of years of record. The corresponding return period is $T = (N+1)/m$ . Other formulas include Gringorten and Cunnane.
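
The ranking and Weibull assignment can be sketched in Python as follows. The function name and the tuple layout of the result are choices made for this illustration.

```python
def weibull_plotting_positions(peaks):
    """Return (rank m, value, P, T) for each event, largest first."""
    N = len(peaks)
    ranked = sorted(peaks, reverse=True)       # m = 1 is the largest event
    rows = []
    for m, x in enumerate(ranked, start=1):
        P = m / (N + 1)                        # empirical exceedance probability
        T = (N + 1) / m                        # corresponding return period
        rows.append((m, x, P, T))
    return rows
```

With a record of $N$ years, the largest event is assigned $T = N + 1$ years, which is the longest return period the Weibull formula can give a single observation.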

Confidence Limits

Statistical estimates have inherent uncertainty because they are based on a finite sample of historical data. Confidence limits provide a range within which the true value is expected to lie with a specified probability (e.g., 95% confidence).

Standard Error

The standard error of estimate quantifies the uncertainty in the calculated magnitude $x_T$ . The confidence interval is typically $x_T \pm z_c S_e$ , where $z_c$ is the standard normal variate for the desired confidence level, and $S_e$ is the standard error.
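
Given an estimate $x_T$ and its standard error $S_e$ , the interval can be computed as in the Python sketch below. The function name is hypothetical, $S_e$ is treated as an input because its formula depends on the fitted distribution, and $z_c$ is taken from the standard library's `NormalDist` for a two-sided interval.

```python
from statistics import NormalDist


def confidence_limits(x_T: float, S_e: float, level: float = 0.95):
    """Two-sided confidence limits x_T -/+ z_c * S_e."""
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie strictly between 0 and 1")
    z_c = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for 95%
    return x_T - z_c * S_e, x_T + z_c * S_e
```

For instance, an estimate of 1000 with a standard error of 50 gives 95% limits of roughly 902 and 1098 in the same units.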

L-Moments in Hydrology

Traditional product moments (mean, variance, skewness) are highly sensitive to outliers, particularly in the small datasets that are typical of flood records. L-moments are an advanced statistical tool used to estimate distribution parameters more robustly.

Advantages of L-Moments

L-moments are linear combinations of probability weighted moments (PWMs). Because they are linear, they do not square or cube the data values, making them far less susceptible to the influence of extreme outliers compared to traditional variance or skewness. They provide more reliable parameter estimates for distributions like the Generalized Extreme Value (GEV) distribution.
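
To make the "linear combination" idea concrete, the Python sketch below computes the first three sample L-moments from the unbiased probability weighted moment estimators $b_0$ , $b_1$ , $b_2$ . The function name is hypothetical, and the returned ratio $l_3 / l_2$ (L-skewness) is included only for illustration.

```python
def sample_l_moments(data):
    """Return (l1, l2, l3, L-skewness) from unbiased sample PWMs."""
    x = sorted(data)                  # ascending order
    n = len(x)
    if n < 3:
        raise ValueError("at least three values are needed")
    # i is the 0-based index, i.e. (rank - 1) in ascending order
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    l1 = b0                           # location (the mean)
    l2 = 2 * b1 - b0                  # scale
    l3 = 6 * b2 - 6 * b1 + b0         # third L-moment
    if l2 == 0:
        raise ValueError("L-skewness is undefined when all values are equal")
    return l1, l2, l3, l3 / l2
```

Every term is a weighted sum of the ordered observations themselves, with no squares or cubes, which is why a single very large flood shifts these statistics less than it shifts the variance or skewness.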

Probable Maximum Flood (PMF)

Probable Maximum Flood (PMF)

The most severe flood considered physically possible in a particular drainage basin, based on comprehensive hydrometeorological analysis of maximum precipitation and hydrologic factors favorable for maximum runoff.

Unlike a 100-year or 500-year flood derived from statistical frequency analysis, the PMF is a deterministic estimate of the theoretical upper bound on flooding. It is generated by routing the Probable Maximum Precipitation (PMP) through the basin's hydrologic model, assuming worst-case antecedent soil moisture conditions and peak snowmelt (if applicable).

Design Application

The PMF is used mainly for designing the spillways of high-hazard dams, where structural failure would result in unacceptable loss of human life and catastrophic downstream damage. Designing for the PMF makes overtopping extremely unlikely under any reasonably foreseeable conditions, but because the PMF is itself an estimate, the risk of hydrologic failure is reduced to a very low level rather than eliminated.

Risk and Reliability

When designing hydraulic structures, engineers must assess the probability that a design event will be exceeded over the lifetime of the structure.

Risk Equation

$$R = 1 - (1 - P)^n$$

Variables

$R$ : Risk (probability that the event will occur at least once in $n$ years)
$P$ : Probability of occurrence in any single year ( $P = 1/T$ )
$n$ : Design life of the structure (years)

Reliability

Reliability is the probability that the structure will not fail (i.e., the design event will not be exceeded) during its design life. It is simply $1 - R$ .
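
In design practice the risk equation is often used in reverse: given an acceptable risk $R$ over a design life of $n$ years, solve for the return period to design for. Rearranging $R = 1 - (1 - 1/T)^n$ gives $T = 1 / [1 - (1 - R)^{1/n}]$ . The Python sketch below evaluates this; the function name is hypothetical.

```python
def required_return_period(R: float, n: int) -> float:
    """Return period T whose n-year risk equals R."""
    if not 0.0 < R < 1.0:
        raise ValueError("R must lie strictly between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1 year")
    P = 1.0 - (1.0 - R) ** (1.0 / n)   # allowable annual exceedance probability
    return 1.0 / P


if __name__ == "__main__":
    # Accepting a 10% risk over a 50-year design life
    print(f"T = {required_return_period(0.10, 50):.0f} years")  # about 475
```

Accepting a 10% risk over 50 years therefore calls for a design event with a return period of roughly 475 years, far longer than the design life itself.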

Key Takeaways

Hydrologic events cannot be predicted with absolute certainty due to their inherent randomness.
Statistical Hydrology applies probability theory to historical data to estimate the likelihood and magnitude of future extreme events (floods, droughts).
Return Period ( $T$ ) is the statistical average time interval between occurrences of an event of a specific magnitude.
The Return Period is the reciprocal of the Annual Exceedance Probability ( $P$ ): $T = 1/P$ .
Risk ( $R$ ) is the probability that an event will occur at least once during a project's design life ( $n$ ).
Even a 100-year flood has a 1% chance of occurring in any given year, meaning it could theoretically happen in consecutive years.
Frequency Analysis fits historical data to theoretical probability distributions to extrapolate extreme events beyond the recorded timeframe.
The General Frequency Equation ( $x_T = \bar{x} + K \cdot \sigma$ ) offsets the mean $\bar{x}$ by $K$ standard deviations, where the frequency factor $K$ depends on the distribution and the return period.
Gumbel's Extreme Value Type I is traditionally used for maximum annual flood series.
The Log-Pearson Type III distribution is the standard method recommended in US federal guidelines (USGS Bulletin 17B/17C) for flood frequency analysis.
Plotting Positions like the Weibull Formula ( $P = m/(N+1)$ ) assign empirical probabilities to ranked historical data for graphical comparison against theoretical distributions.
Statistical estimates are uncertain because they rely on finite historical sample sizes.
Confidence Limits define a range within which the true magnitude of an event is expected to lie with a specified probability (e.g., 95%).
The width of the confidence interval depends on the Standard Error ( $S_e$ ), which decreases as the length of the historical data record increases.
The Probable Maximum Flood (PMF) is an estimate of the physical upper limit of flooding for a basin, derived deterministically from the PMP rather than statistically.
High-hazard dam spillways are designed to safely pass the PMF so that the risk of catastrophic overtopping is as low as practicable.
