Discrete Probability Distributions

Expected value, mathematical expectation, and common discrete distributions: Binomial, Poisson, Geometric, Negative Binomial, and Hypergeometric.
When analyzing data, we often deal with variables whose outcomes are determined by chance. A random variable is a numerical description of the outcome of an experiment. A discrete random variable can take on a countable number of distinct values (e.g., the number of potholes on a 10km stretch of road, or the number of defective bricks in a pallet).

Probability Mass Functions and Mathematical Expectation

The foundational math behind discrete random variables.

Probability Mass Function (PMF), f(x) or P(X = x)

A function that assigns a probability to each possible value of a discrete random variable. It must satisfy two conditions:
  • f(x) \ge 0 for all x.
  • \sum_{x} f(x) = 1.

Cumulative Distribution Function (CDF), F(x)

The probability that the random variable X will take a value less than or equal to x.
F(x) = P(X \le x) = \sum_{t \le x} f(t)
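As a concrete illustration, the two PMF conditions and the CDF sum translate directly into a few lines of Python; the distribution below is made up for the example:

```python
# A hypothetical PMF for a discrete random variable X, stored as {value: probability}.
pmf = {1: 0.1, 2: 0.2, 3: 0.4, 4: 0.3}

# Condition 1: f(x) >= 0 for every x in the support.
assert all(p >= 0 for p in pmf.values())
# Condition 2: the probabilities sum to 1 (within floating-point tolerance).
assert abs(sum(pmf.values()) - 1.0) < 1e-9

def cdf(x):
    """F(x) = P(X <= x): sum f(t) over all support points t <= x."""
    return sum(p for t, p in pmf.items() if t <= x)
```

Here `cdf(2)` accumulates f(1) + f(2) = 0.3, and `cdf(4)` returns 1, as every CDF must at the top of the support.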

Mathematical Expectation

The expected value represents the theoretical mean of the random variable.

Expected Value (Mean), \mu or E[X]

The long-run average value of the random variable over infinitely many trials. It is the center of the probability distribution.
\mu = E[X] = \sum_{x} x \cdot f(x)

Variance, \sigma^2 or V(X)

A measure of the dispersion or spread of the probability distribution around the mean.
\sigma^2 = E[(X - \mu)^2] = \sum_{x} (x - \mu)^2 \cdot f(x)
Alternatively, it can be calculated more easily using the computational formula:
\sigma^2 = E[X^2] - (E[X])^2
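The definitional and computational variance formulas agree, which is easy to verify numerically. This sketch uses a small made-up PMF:

```python
# Hypothetical PMF: {value: probability}.
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

mean = sum(x * p for x, p in pmf.items())     # E[X] = sum of x * f(x)
e_x2 = sum(x**2 * p for x, p in pmf.items())  # E[X^2]

var_definition = sum((x - mean)**2 * p for x, p in pmf.items())  # E[(X - mu)^2]
var_computational = e_x2 - mean**2                               # E[X^2] - (E[X])^2
```

For this PMF, E[X] = 1.1 and both variance routes give 0.49; the computational form avoids a second pass over the support once E[X] and E[X^2] are known.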

Common Discrete Distributions in Engineering

Specific models used to describe common engineering scenarios.

The Binomial Distribution

Models the number of successes in a fixed number of independent trials.

Binomial Distribution

Applicable when:
  • There is a fixed number of trials (n).
  • Each trial has only two possible outcomes (success or failure).
  • The probability of success (p) remains constant for each trial.
  • The trials are mutually independent.
The probability of exactly x successes in n trials is:
P(X = x) = \binom{n}{x} p^x (1-p)^{n-x} \quad \text{for } x = 0, 1, \dots, n
  • Mean: \mu = np
  • Variance: \sigma^2 = np(1-p)
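A minimal sketch of the binomial PMF using only the standard library (`math.comb`); the 10% defect rate and sample size are illustrative numbers, not from the text:

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) = C(n, x) * p^x * (1-p)^(n-x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Illustrative scenario: 10 bricks inspected, each defective with probability 0.1.
n, p = 10, 0.1
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

mean = n * p                # mu = np = 1.0
variance = n * p * (1 - p)  # sigma^2 = np(1-p) = 0.9
```

The list `probs` sums to 1 over x = 0, ..., n, and its weighted average reproduces the mean np.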

The Poisson Distribution

Models the number of events occurring in a fixed interval of time or space.

Poisson Distribution

Used for rare events where the exact number of trials n is effectively infinite and p is very small, but the average rate of occurrence (\lambda) is known. Examples include the number of traffic accidents per month at a given intersection, or the number of flaws in a 100 m reel of fiber-optic cable.
The probability of exactly x events occurring in a given interval is:
P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!} \quad \text{for } x = 0, 1, 2, \dots
  • Mean: \mu = \lambda
  • Variance: \sigma^2 = \lambda
Note: A unique property of the Poisson distribution is that its mean equals its variance.
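A sketch of the Poisson PMF with the standard library; the rate of 2 flaws per 100 m reel is made up for the example:

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) = lam^x * e^(-lam) / x!"""
    return lam**x * exp(-lam) / factorial(x)

lam = 2.0  # illustrative: an average of 2 flaws per 100 m reel of cable
p_none = poisson_pmf(0, lam)  # probability of zero flaws, e^(-2) ~ 0.135
```

Truncating the infinite support at a suitably large x, the probabilities sum to 1 and the empirical mean and variance both come out to \lambda, matching the note above.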

The Geometric and Negative Binomial Distributions

Models the number of trials needed to achieve a specific number of successes.

Geometric Distribution

Models the number of independent trials X needed to get the first success. (e.g., How many times must we test a newly designed joint until we observe the first failure, assuming a constant failure probability p?)
P(X = x) = (1-p)^{x-1} p \quad \text{for } x = 1, 2, 3, \dots
  • Mean: \mu = \frac{1}{p}
  • Variance: \sigma^2 = \frac{1-p}{p^2}
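The geometric PMF is a one-liner; here p = 0.2 is an illustrative per-test failure probability for the joint example above:

```python
def geometric_pmf(x, p):
    """P(X = x) = (1-p)^(x-1) * p: first success occurs on trial x."""
    return (1 - p)**(x - 1) * p

p = 0.2       # illustrative probability of observing a failure on any single test
mean = 1 / p  # expected number of tests until the first failure: 1/0.2 = 5
```

Summing x * P(X = x) over a long truncation of the support recovers the mean 1/p = 5 numerically.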

Negative Binomial Distribution

A generalization of the geometric distribution. It models the number of independent trials X needed to get exactly r successes.
P(X = x) = \binom{x-1}{r-1} p^r (1-p)^{x-r} \quad \text{for } x = r, r+1, \dots
  • Mean: \mu = \frac{r}{p}
  • Variance: \sigma^2 = \frac{r(1-p)}{p^2}
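A sketch of the negative binomial PMF; setting r = 1 recovers the geometric distribution, which makes a quick sanity check (the parameter values below are illustrative):

```python
from math import comb

def neg_binomial_pmf(x, r, p):
    """P(X = x): the r-th success occurs on trial x (x >= r)."""
    return comb(x - 1, r - 1) * p**r * (1 - p)**(x - r)

# With r = 1 the binomial coefficient is 1, so this reduces to (1-p)^(x-1) * p,
# i.e., the geometric PMF.
p_check = neg_binomial_pmf(3, 1, 0.2)  # same as geometric: 0.8^2 * 0.2 = 0.128
```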

The Hypergeometric Distribution

Models sampling without replacement.

Hypergeometric Distribution

Unlike the Binomial distribution, where p is constant (sampling with replacement), the Hypergeometric distribution is used when sampling without replacement from a finite population of size N containing exactly K successes. (e.g., Selecting 5 concrete cylinders from a batch of 50, where 3 are known to be defective.)
P(X = x) = \frac{\binom{K}{x} \binom{N-K}{n-x}}{\binom{N}{n}}
  • Mean: \mu = n \left(\frac{K}{N}\right)
  • Variance: \sigma^2 = n \left(\frac{K}{N}\right)\left(1 - \frac{K}{N}\right)\left(\frac{N-n}{N-1}\right)
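The hypergeometric PMF translates directly into `math.comb`; the numbers below follow the concrete-cylinder example (N = 50, K = 3 defectives, n = 5 sampled):

```python
from math import comb

def hypergeometric_pmf(x, N, K, n):
    """P(X = x): x successes in n draws, without replacement,
    from N items of which K are successes."""
    return comb(K, x) * comb(N - K, n - x) / comb(N, n)

N, K, n = 50, 3, 5  # 50 cylinders, 3 defective, sample 5
p_no_defect = hypergeometric_pmf(0, N, K, n)
```

About 72% of such samples contain no defective cylinder. Each draw changes the proportion of defectives remaining, which is exactly why the binomial model (constant p) would be only an approximation here.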
Key Takeaways
  • Random Variables: Numerical values assigned to experimental outcomes.
  • Expected Value (E[X]): The long-run average of a discrete distribution.
  • Binomial: Used for independent trials with exactly two outcomes (success/failure) and constant probability p.
  • Poisson: Used for modeling the number of rare events occurring within a continuous interval (time, area, volume).
  • Geometric/Negative Binomial: Focus on the number of trials needed to achieve a specified number of successes.
  • Hypergeometric: Used for finite populations when sampling without replacement (probability changes trial-to-trial).