Discrete Probability Distributions
When analyzing data, we often deal with variables whose outcomes are determined by chance. A random variable is a numerical description of the outcome of an experiment. A discrete random variable can take on a countable number of distinct values (e.g., the number of potholes on a 10 km stretch of road, or the number of defective bricks in a pallet).
Probability Mass Functions and Mathematical Expectation
Probability Mass Function (PMF), $p(x)$ or $P(X = x)$
A function that assigns a probability to each possible value $x$ of a discrete random variable $X$. It must satisfy two conditions:
- $p(x) \ge 0$ for all $x$.
- $\sum_{x} p(x) = 1$.
Cumulative Distribution Function (CDF), $F(x)$
The probability that the random variable $X$ will take a value less than or equal to $x$: $F(x) = P(X \le x) = \sum_{t \le x} p(t)$.
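The two PMF conditions and the CDF definition can be checked numerically. The sketch below is illustrative only: the probability table is a made-up example (not data from this text), and the helper names `is_valid_pmf` and `cdf` are hypothetical. Exact fractions are used so that the "sums to 1" check is not affected by floating-point rounding.

```python
from fractions import Fraction

# Hypothetical PMF for the number of defective bricks in a small sample.
# The probabilities are invented for illustration.
pmf = {
    0: Fraction(1, 2),
    1: Fraction(3, 10),
    2: Fraction(3, 20),
    3: Fraction(1, 20),
}


def is_valid_pmf(pmf):
    """Check both PMF conditions: every p(x) >= 0 and the p(x) sum to 1."""
    return all(p >= 0 for p in pmf.values()) and sum(pmf.values()) == 1


def cdf(pmf, x):
    """F(x) = P(X <= x): add the PMF over every value not exceeding x."""
    return sum(p for value, p in pmf.items() if value <= x)


print(is_valid_pmf(pmf))
print(cdf(pmf, 1))
```

Because the CDF accumulates probability, it never decreases as its argument grows and it reaches 1 at the largest possible value.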
Mathematical Expectation
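Expected Value (Mean), $E[X]$ or $\mu$
The long-run average value of the random variable over infinitely many trials. It is the center of the probability distribution.
$$\mu = E[X] = \sum_{x} x \, p(x)$$
Variance, $\mathrm{Var}(X)$ or $\sigma^2$
A measure of the dispersion or spread of the probability distribution around the mean.
$$\sigma^2 = E[(X - \mu)^2] = \sum_{x} (x - \mu)^2 \, p(x)$$
Alternatively, it can be calculated more easily using the computational formula:
$$\sigma^2 = E[X^2] - \mu^2 = \sum_{x} x^2 \, p(x) - \mu^2$$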
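As a quick check that the definition of variance (the probability-weighted squared deviation from the mean) and the computational shortcut (the expected value of the square minus the squared mean) agree, here is a minimal sketch. The probability table is invented for illustration and the function names are hypothetical.

```python
from fractions import Fraction

# Hypothetical PMF; the values are illustrative only.
pmf = {
    0: Fraction(1, 2),
    1: Fraction(3, 10),
    2: Fraction(3, 20),
    3: Fraction(1, 20),
}


def mean(pmf):
    """E[X]: each value weighted by its probability."""
    return sum(x * p for x, p in pmf.items())


def variance_definition(pmf):
    """Var(X) as the expected squared deviation from the mean."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


def variance_computational(pmf):
    """Var(X) as E[X^2] minus the squared mean."""
    return sum(x**2 * p for x, p in pmf.items()) - mean(pmf) ** 2


print(mean(pmf))
print(variance_definition(pmf), variance_computational(pmf))
```

The shortcut needs only one pass over the table for each of the two sums, which is why it is usually the easier form for hand calculation.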
Common Discrete Distributions in Engineering
The Binomial Distribution
Binomial Distribution
Applicable when:
- There are a fixed number of trials ($n$).
- Each trial has only two possible outcomes (Success or Failure).
- The probability of success ($p$) remains constant for each trial.
- The trials are mutually independent.
The probability of exactly $x$ successes in $n$ trials is:
$$P(X = x) = \binom{n}{x} p^{x} (1 - p)^{n - x}, \quad x = 0, 1, \dots, n$$
- Mean: $\mu = np$
- Variance: $\sigma^2 = np(1 - p)$
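The binomial probability of x successes in n trials is the number of ways to place the successes, times p to the power x, times (1 - p) to the power (n - x). The sketch below evaluates that PMF and checks numerically that the mean and variance match np and np(1 - p). The scenario (10 bricks, each defective with probability 0.1) and the function name `binomial_pmf` are assumptions made for illustration.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for x successes in n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Hypothetical scenario: 10 bricks, each independently defective with probability 0.1.
n, p = 10, 0.1
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Mean and variance computed directly from the PMF.
mean = sum(x * q for x, q in enumerate(probs))
variance = sum(x**2 * q for x, q in enumerate(probs)) - mean**2

print(f"P(X = 0) = {probs[0]:.4f}")
print(f"mean = {mean:.4f}, n*p = {n * p:.4f}")
print(f"variance = {variance:.4f}, n*p*(1-p) = {n * p * (1 - p):.4f}")
```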
The Poisson Distribution
Poisson Distribution
Used for counting rare events when the number of trials $n$ is effectively infinite and the per-trial probability $p$ is very small, but the average rate of occurrence ($\lambda$) is known. Examples include the number of traffic accidents per month at a given intersection, or the number of flaws in a 100 m reel of fiber optic cable.
The probability of exactly $x$ events occurring in a given interval is:
$$P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \quad x = 0, 1, 2, \dots$$
- Mean: $\mu = \lambda$
- Variance: $\sigma^2 = \lambda$
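A defining feature of the Poisson distribution is that its mean and variance both equal the average rate. The sketch below evaluates the standard Poisson PMF and checks that property numerically. The rate of 2 flaws per reel is a made-up value, the name `poisson_pmf` is hypothetical, and the infinite support is truncated at 60 terms on the assumption that the remaining tail is negligible for a rate this small.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson count with average rate lam per interval."""
    return exp(-lam) * lam**x / factorial(x)


# Hypothetical rate: on average 2 flaws per reel of cable.
lam = 2.0

# The support is infinite; truncating at 60 terms is an approximation
# that leaves out only a negligible tail for this rate.
support = range(60)
probs = [poisson_pmf(x, lam) for x in support]

mean = sum(x * q for x, q in zip(support, probs))
variance = sum(x**2 * q for x, q in zip(support, probs)) - mean**2

print(f"P(X = 0) = {probs[0]:.4f}")
print(f"mean = {mean:.4f}, variance = {variance:.4f}")
```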
The Geometric and Negative Binomial Distributions
Geometric Distribution
Models the number of independent trials needed to get the first success. (e.g., How many times must we test a newly designed joint until we observe the first failure, assuming a constant failure probability $p$?)
- Mean: $\mu = \frac{1}{p}$
- Variance: $\sigma^2 = \frac{1 - p}{p^2}$
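For the "number of trials until the first success" form of the geometric distribution, the standard PMF is (1 - p) to the power (x - 1), times p, for x = 1, 2, 3, and so on: x - 1 non-events followed by one event. The sketch below checks numerically that this PMF has mean 1/p and variance (1 - p)/p squared. The failure probability of 0.2 is a hypothetical value, `geometric_pmf` is an illustrative name, and the infinite support is truncated at 500 trials on the assumption that the remaining tail is negligible.

```python
def geometric_pmf(x, p):
    """P(X = x): the first success occurs on trial x (x = 1, 2, 3, ...)."""
    if x < 1:
        return 0.0
    return (1 - p) ** (x - 1) * p


# Hypothetical per-test probability that the joint fails (the "success" event here).
p = 0.2

# Truncate the infinite support; the tail beyond 500 trials is negligible for p = 0.2.
support = range(1, 501)
probs = [geometric_pmf(x, p) for x in support]

mean = sum(x * q for x, q in zip(support, probs))
variance = sum(x**2 * q for x, q in zip(support, probs)) - mean**2

print(f"mean = {mean:.4f}, 1/p = {1 / p:.4f}")
print(f"variance = {variance:.4f}, (1-p)/p^2 = {(1 - p) / p**2:.4f}")
```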
Negative Binomial Distribution
A generalization of the geometric distribution. It models the number of independent trials needed to get exactly $r$ successes.
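One common way to write the negative binomial PMF, counting the trial x on which the r-th success occurs, is the number of ways to place r - 1 successes among the first x - 1 trials, times p to the power r, times (1 - p) to the power (x - r). The sketch below uses that form and shows that setting r = 1 recovers the geometric case. The symbol r, the parameter values, and the name `negative_binomial_pmf` are assumptions for illustration; other references count failures rather than trials, which shifts the support.

```python
from math import comb


def negative_binomial_pmf(x, r, p):
    """P(X = x): the r-th success occurs on trial x (x = r, r + 1, ...)."""
    if x < r:
        return 0.0
    # The last trial is a success; the other r - 1 successes fall among the first x - 1 trials.
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)


# Hypothetical parameters: wait for the 3rd success when each trial succeeds half the time.
r, p = 3, 0.5

# Truncate the infinite support; the tail beyond 200 trials is negligible here.
support = range(r, 200)
probs = [negative_binomial_pmf(x, r, p) for x in support]
mean = sum(x * q for x, q in zip(support, probs))

print(f"P(X = 3) = {negative_binomial_pmf(3, r, p):.4f}")
print(f"mean = {mean:.4f}, r/p = {r / p:.4f}")

# With r = 1 the formula reduces to the geometric PMF (1 - p)^(x - 1) * p.
print(negative_binomial_pmf(4, 1, 0.2), 0.8**3 * 0.2)
```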
The Hypergeometric Distribution
Hypergeometric Distribution
Unlike the Binomial distribution, where $p$ is constant (sampling with replacement), the Hypergeometric distribution is used when sampling without replacement from a finite population of size $N$ containing exactly $K$ successes. (e.g., Selecting 5 concrete cylinders from a batch of 50, where 3 are known to be defective).
- Mean: $\mu = n \frac{K}{N}$, where $n$ is the sample size
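The concrete cylinder example (5 cylinders drawn from a batch of 50, of which 3 are defective) can be worked through with the standard hypergeometric PMF: the number of ways to choose x items from the successes, times the number of ways to choose the rest of the sample from the non-successes, divided by the number of ways to choose the whole sample. The sketch below is illustrative; the symbol names and the function name `hypergeometric_pmf` are assumptions, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from math import comb


def hypergeometric_pmf(x, population, successes, sample):
    """P(X = x): x successes in a sample drawn without replacement."""
    ways = comb(successes, x) * comb(population - successes, sample - x)
    return Fraction(ways, comb(population, sample))


# Values from the cylinder example: batch of 50, 3 defective, sample of 5.
population, successes, sample = 50, 3, 5

# X can be at most the smaller of the sample size and the number of defectives.
support = range(min(sample, successes) + 1)
probs = [hypergeometric_pmf(x, population, successes, sample) for x in support]
mean = sum(x * q for x, q in zip(support, probs))

for x, q in zip(support, probs):
    print(f"P(X = {x}) = {float(q):.4f}")
print(f"mean = {mean}")
```

The computed mean equals the sample size times the defective fraction of the batch (5 × 3/50), the same value a binomial model with p = 3/50 would give; only the spread differs, because each draw changes the composition of what remains.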
Engineering Data Analysis • Topic 5
- Random Variables: Numerical values assigned to experimental outcomes.
- Expected Value ($E[X]$ or $\mu$): The long-run average of a discrete distribution.
- Binomial: Used for a fixed number of independent trials with exactly two outcomes (success/failure) and constant probability $p$.
- Poisson: Used for modeling the number of rare events occurring within a continuous interval (time, area, volume).
- Geometric/Negative Binomial: Focuses on the number of trials needed to achieve a specified number of successes.
- Hypergeometric: Used for finite populations when sampling without replacement (probability changes trial-to-trial).