Joint Probability Distributions

Joint probability mass/density functions, marginal and conditional distributions, covariance, and correlation.
In many engineering applications, we need to understand the relationship between two or more random variables simultaneously. For example, a structural engineer might study the joint distribution of wind speed (X) and atmospheric pressure (Y) during a hurricane, or a transportation engineer might model the number of cars (X) and trucks (Y) arriving at a toll booth.

Joint Probability Mass and Density Functions

Describing the simultaneous behavior of multiple random variables.

Joint Probability Mass Function (Discrete)

For two discrete random variables X and Y, the joint probability mass function f(x, y) gives the probability that X takes the specific value x AND Y takes the specific value y simultaneously.
f(x, y) = P(X = x, Y = y)
It must satisfy two conditions:
  • f(x, y) \ge 0 for all (x, y).
  • \sum_{x} \sum_{y} f(x, y) = 1.
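
As a quick illustration, both conditions can be checked directly on a small joint PMF table. The values below are hypothetical (a toy cars-and-trucks table in the spirit of the toll-booth example), not real data:

```python
# Hypothetical joint PMF for X = number of cars and Y = number of trucks
# arriving in a short interval (illustrative values only).
f = {
    (0, 0): 0.10, (0, 1): 0.05,
    (1, 0): 0.25, (1, 1): 0.15,
    (2, 0): 0.30, (2, 1): 0.15,
}

# Condition 1: every entry is non-negative.
assert all(p >= 0 for p in f.values())

# Condition 2: the entries sum to 1 over all (x, y) pairs.
assert abs(sum(f.values()) - 1.0) < 1e-9

# A joint probability is read directly from the table:
print(f[(1, 1)])  # P(X = 1, Y = 1) = 0.15
```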

Joint Probability Density Function (Continuous)

For two continuous random variables X and Y, the joint probability density function f(x, y) determines the probability that (X, Y) falls within a two-dimensional region R in the xy-plane. The probability is the volume under the surface f(x, y) over the region R.
P((X, Y) \in R) = \iint_R f(x, y) \, dx \, dy
It must satisfy two conditions:
  • f(x, y) \ge 0 for all (x, y).
  • \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y) \, dx \, dy = 1.
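
In the continuous case, both conditions can be verified numerically. The sketch below uses an assumed density chosen for illustration, f(x, y) = 4xy on the unit square (and 0 elsewhere), with a midpoint Riemann sum standing in for exact integration:

```python
# Assumed joint density for illustration: f(x, y) = 4xy on [0, 1] x [0, 1].
def f(x, y):
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return 4.0 * x * y
    return 0.0

n = 400            # grid resolution
h = 1.0 / n        # cell width
mids = [(i + 0.5) * h for i in range(n)]

# Total volume under the surface approximates the normalization integral.
total = sum(f(x, y) * h * h for x in mids for y in mids)
print(round(total, 6))  # 1.0: the density integrates to 1

# P(X <= 0.5, Y <= 0.5): volume over the region R = [0, 0.5] x [0, 0.5].
region = sum(f(x, y) * h * h
             for x in mids if x < 0.5
             for y in mids if y < 0.5)
print(round(region, 6))  # 0.0625, matching the exact double integral 1/16
```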

Marginal Distributions

Isolating the behavior of one variable from the joint distribution.
Sometimes we have the joint distribution of X and Y, but we only care about the distribution of X alone, regardless of Y. This is called the marginal distribution.

Marginal Probability Distributions

To find the marginal distribution of one variable, we sum (or integrate) out the other variable over its entire range.
  • Discrete case for X:
g(x) = \sum_{y} f(x, y)
  • Continuous case for X:
g(x) = \int_{-\infty}^{\infty} f(x, y) \, dy
Similarly, h(y) is the marginal distribution for Y, found by summing or integrating out x.
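
Summing out one variable is a one-pass accumulation over the joint table. A sketch on a hypothetical cars/trucks joint PMF (illustrative values, echoing the toll-booth example):

```python
# Hypothetical joint PMF (X = cars, Y = trucks); values are illustrative.
f = {
    (0, 0): 0.10, (0, 1): 0.05,
    (1, 0): 0.25, (1, 1): 0.15,
    (2, 0): 0.30, (2, 1): 0.15,
}

g, h = {}, {}
for (x, y), p in f.items():
    g[x] = g.get(x, 0.0) + p   # g(x) = sum over y of f(x, y)
    h[y] = h.get(y, 0.0) + p   # h(y) = sum over x of f(x, y)

print({x: round(p, 2) for x, p in g.items()})  # {0: 0.15, 1: 0.4, 2: 0.45}
print({y: round(p, 2) for y, p in h.items()})  # {0: 0.65, 1: 0.35}
```

Each marginal is itself a valid probability distribution: its values are non-negative and sum to 1.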

Conditional Distributions and Independence

How knowledge of one variable affects the probability distribution of another.

Conditional Probability Distribution

The probability distribution of X, given that Y has taken a specific value y. This is analogous to basic conditional probability (P(A|B) = P(A \cap B) / P(B)).
f(x|y) = \frac{f(x, y)}{h(y)} \quad \text{provided } h(y) > 0
Similarly, f(y|x) = \frac{f(x, y)}{g(x)}, provided g(x) > 0.
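
Conditioning is just a rescaling of one column of the joint table by the appropriate marginal. A sketch on a hypothetical cars/trucks joint PMF (illustrative values):

```python
# Hypothetical joint PMF (X = cars, Y = trucks); values are illustrative.
f = {
    (0, 0): 0.10, (0, 1): 0.05,
    (1, 0): 0.25, (1, 1): 0.15,
    (2, 0): 0.30, (2, 1): 0.15,
}

# Marginal of Y, needed as the denominator h(y).
h = {}
for (x, y), p in f.items():
    h[y] = h.get(y, 0.0) + p

def f_x_given_y(x, y):
    """f(x | y) = f(x, y) / h(y); only defined when h(y) > 0."""
    return f.get((x, y), 0.0) / h[y]

# Observing Y = 1 rescales the x-probabilities in that column:
print(round(f_x_given_y(2, 1), 4))  # 0.15 / 0.35 ~ 0.4286
# For each fixed y, the conditional probabilities over x sum to 1:
print(round(sum(f_x_given_y(x, 1) for x in (0, 1, 2)), 4))  # 1.0
```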

Independence of Random Variables

Two random variables X and Y are independent if and only if their joint probability distribution is the product of their marginal distributions for all possible values of (x, y).
f(x, y) = g(x) \cdot h(y)
If this holds, knowing the value of X gives no information about the value of Y (e.g., the compressive strength of concrete from Plant A vs. Plant B).
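
The factorization can be checked mechanically: recompute both marginals from the joint table, then compare g(x) · h(y) against f(x, y) at every point. In the sketch below (illustrative values), the first table is independent by construction, and a small perturbation breaks the factorization:

```python
def is_independent(f, tol=1e-12):
    """Check f(x, y) == g(x) * h(y) at every point of a joint PMF."""
    g, h = {}, {}
    for (x, y), p in f.items():
        g[x] = g.get(x, 0.0) + p
        h[y] = h.get(y, 0.0) + p
    return all(abs(p - g[x] * h[y]) <= tol for (x, y), p in f.items())

# Illustrative marginals; the joint table is their product.
gx = {0: 0.3, 1: 0.7}
hy = {0: 0.4, 1: 0.6}
f = {(x, y): gx[x] * hy[y] for x in gx for y in hy}

independent_before = is_independent(f)
print(independent_before)  # True: built as a product of marginals

f[(0, 0)] += 0.01
f[(1, 1)] -= 0.01          # still sums to 1, but no longer factorizes
independent_after = is_independent(f)
print(independent_after)   # False
```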

Covariance and Correlation

Measuring the linear relationship between two random variables.

Covariance (\sigma_{xy})

A measure of how much two random variables change together. A positive covariance indicates that when X is above its mean, Y tends to be above its mean (e.g., traffic volume and noise levels). A negative covariance indicates an inverse relationship (e.g., age of asphalt and its flexibility).
\sigma_{xy} = E[(X - \mu_X)(Y - \mu_Y)] = E[XY] - \mu_X\mu_Y
  • If X and Y are statistically independent, their covariance is zero (\sigma_{xy} = 0).
  • However, a covariance of zero does not necessarily mean they are independent (they could have a non-linear relationship).
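
The shortcut form \sigma_{xy} = E[XY] - \mu_X \mu_Y reduces the covariance to three expectations over the joint table. A sketch on a hypothetical cars/trucks joint PMF (illustrative values):

```python
# Hypothetical joint PMF (X = cars, Y = trucks); values are illustrative.
f = {
    (0, 0): 0.10, (0, 1): 0.05,
    (1, 0): 0.25, (1, 1): 0.15,
    (2, 0): 0.30, (2, 1): 0.15,
}

mu_x = sum(x * p for (x, y), p in f.items())      # E[X]  = 1.30
mu_y = sum(y * p for (x, y), p in f.items())      # E[Y]  = 0.35
e_xy = sum(x * y * p for (x, y), p in f.items())  # E[XY] = 0.45

cov = e_xy - mu_x * mu_y
print(round(cov, 4))  # -0.005: a slight inverse tendency in this table
```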

Correlation Coefficient (\rho_{xy})

A standardized measure of the linear relationship between two variables. Covariance depends on the units of X and Y, making it hard to interpret the strength of the relationship. The correlation coefficient scales covariance by the standard deviations of both variables, producing a dimensionless value between -1 and 1.
\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}
  • \rho_{xy} = 1: Perfect positive linear relationship.
  • \rho_{xy} = -1: Perfect negative linear relationship.
  • \rho_{xy} = 0: No linear relationship.
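
Beyond the covariance, \rho_{xy} only needs the two standard deviations, each computable from the same joint table. A self-contained sketch on a hypothetical cars/trucks joint PMF (illustrative values):

```python
import math

# Hypothetical joint PMF (X = cars, Y = trucks); values are illustrative.
f = {
    (0, 0): 0.10, (0, 1): 0.05,
    (1, 0): 0.25, (1, 1): 0.15,
    (2, 0): 0.30, (2, 1): 0.15,
}

def moment(fn):
    """Expectation E[fn(X, Y)] over the joint PMF."""
    return sum(fn(x, y) * p for (x, y), p in f.items())

mu_x, mu_y = moment(lambda x, y: x), moment(lambda x, y: y)
cov = moment(lambda x, y: x * y) - mu_x * mu_y
sigma_x = math.sqrt(moment(lambda x, y: x * x) - mu_x ** 2)
sigma_y = math.sqrt(moment(lambda x, y: y * y) - mu_y ** 2)

rho = cov / (sigma_x * sigma_y)
print(round(rho, 4))  # ~ -0.0147: a very weak negative linear relationship
assert -1.0 <= rho <= 1.0  # the correlation always lies in [-1, 1]
```

Note how standardizing turns a unit-dependent covariance of -0.005 into a dimensionless value that can be judged against the [-1, 1] scale.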

The Bivariate Normal Distribution

The foundational model for two correlated continuous variables.

Bivariate Normal Distribution

When two correlated continuous random variables are jointly normally distributed, their behavior is described by the bivariate normal distribution. Its PDF forms a 3-dimensional bell surface (a mound) whose orientation depends on the correlation \rho.
Key properties:
  • The marginal distributions g(x) and h(y) are both normal.
  • The conditional distributions f(x|y) and f(y|x) are both normal.
  • If the correlation \rho_{xy} = 0 for a bivariate normal distribution, then X and Y are independent. (This is a special property; for other distributions, \rho = 0 does not guarantee independence.)
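
For reference, the bivariate normal density with means \mu_x, \mu_y, standard deviations \sigma_x, \sigma_y, and correlation \rho is:
f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}} \exp\left\{ -\frac{1}{2(1-\rho^2)} \left[ \frac{(x-\mu_x)^2}{\sigma_x^2} - \frac{2\rho(x-\mu_x)(y-\mu_y)}{\sigma_x\sigma_y} + \frac{(y-\mu_y)^2}{\sigma_y^2} \right] \right\}
Setting \rho = 0 eliminates the cross term, and the exponential then factors into a function of x times a function of y, i.e. f(x, y) = g(x) \cdot h(y); this is exactly why \rho_{xy} = 0 implies independence for this particular distribution.
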

Key Takeaways

  • Joint Distributions (f(x, y)): Describe the simultaneous behavior of two random variables.
  • Marginal Distributions (g(x), h(y)): Isolate one variable by summing or integrating out the other.
  • Conditional Distributions (f(x|y)): The behavior of X given a specific value of Y.
  • Independence: If X and Y are independent, f(x, y) = g(x) \cdot h(y).
  • Covariance and Correlation: Measure the linear relationship between variables. Correlation (\rho) is standardized, always falling between -1 and 1.
  • Bivariate Normal: The standard 3D bell-shaped surface for two correlated continuous variables.