Geometric and Negative Binomial Distributions

Understanding how many attempts you'll need before achieving a success is a fundamental question in probability. Whether you're testing products until you find a defective unit, running website experiments until a user converts, or modeling system failures, the geometric distribution and negative binomial distribution provide the essential mathematical framework. These discrete distributions move beyond a single trial's success probability to model the waiting time for success events, making them indispensable for reliability analysis, quality control, and strategic planning in data science.

The Foundation: The Geometric Distribution

The geometric distribution models the number of independent Bernoulli trials needed to get the first success. A Bernoulli trial is an experiment with only two outcomes: "success" (with probability $p$ ) and "failure" (with probability $q = 1 - p$ ). The trials are identical and independent.

The key question it answers is: "How many trials will it take until I succeed for the first time?" Let the random variable $X$ represent the number of trials until the first success. If you succeed on the $k$ -th trial, it means you had $k - 1$ consecutive failures followed by one success.

The probability mass function (PMF) for a geometrically distributed random variable $X \sim Geometric (p)$ is:

$P (X = k) = (1 - p)^{k - 1} p = q^{k - 1} p$

for $k = 1, 2, 3, ...$ . This formula is intuitive: it multiplies the probability of $k - 1$ failures, $(1 - p)^{k - 1}$ , by the probability of one success, $p$ , on the final trial.

Example: Suppose a quality inspector tests light bulbs, and each has a 2% probability of being defective (a "success" in this context, as we are looking for a defect). The probability that the first defective bulb is found on the 10th test is $P (X = 10) = (0.98)^{9} \times 0.02 \approx 0.0167$ .
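
The light bulb calculation can be reproduced in a few lines. This is a minimal sketch using only the Python standard library; the function name `geometric_pmf` is illustrative and not taken from any particular package.

```python
def geometric_pmf(k: int, p: float) -> float:
    """P(X = k): the first success occurs on trial k (k = 1, 2, 3, ...)."""
    if k < 1:
        return 0.0
    # k - 1 failures, each with probability 1 - p, followed by one success
    return (1 - p) ** (k - 1) * p


# Light bulb example: 2% defect rate, first defect found on the 10th test
prob = geometric_pmf(10, 0.02)
print(round(prob, 4))  # 0.0167
```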

Two important characteristics of this distribution are its mean and variance. The expected value (mean number of trials until success) is $E [X] = \frac{1}{p}$ . Intuitively, if you have a 1-in-10 chance of success, you expect to need about 10 trials. The variance, which measures the spread, is $Var (X) = \frac{1 - p}{p ^{2}} = \frac{q}{p ^{2}}$ .
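
One way to sanity-check the mean and variance formulas is to sum the PMF numerically. The sketch below truncates the infinite series at a large cutoff (the `max_k` parameter and the choice $p = 0.1$ are assumptions made for illustration), so it approximates the closed forms rather than proving them.

```python
def geometric_moments(p: float, max_k: int = 10_000) -> tuple[float, float]:
    """Approximate E[X] and Var(X) by summing the PMF over k = 1..max_k."""
    mean = 0.0
    second_moment = 0.0
    prob = p  # P(X = 1)
    for k in range(1, max_k + 1):
        mean += k * prob
        second_moment += k * k * prob
        prob *= 1 - p  # P(X = k + 1) = P(X = k) * (1 - p)
    return mean, second_moment - mean ** 2


p = 0.1
mean, variance = geometric_moments(p)
print(mean, 1 / p)                 # both close to 10
print(variance, (1 - p) / p ** 2)  # both close to 90
```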

The Memoryless Property: A Defining Feature

A unique and defining characteristic of the geometric distribution is its memoryless property. This property states that the probability of needing an additional $m$ trials is independent of how many failures you've already had. Formally:

$P (X > k + m \mid X > k) = P (X > m)$

for integers $k, m \geq 0$ .

Imagine you've already flipped a fair coin 10 times without getting heads. The memoryless property tells you that the probability of needing more than 5 additional flips is the same as the probability of needing more than 5 flips from a fresh start. The coin has no "memory" of the past failures. The geometric distribution is the only discrete distribution with this property (the exponential is its continuous counterpart), which makes it the natural model for systems whose future behavior does not depend on their past, such as components that do not age.
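
The coin example can be checked exactly. The following sketch uses `fractions.Fraction` from the standard library to avoid rounding; the helper name `survival` is an illustrative choice. It relies on the fact that $P (X > k) = (1 - p)^{k}$ , so the conditional probability reduces to $(1 - p)^{k + m} / (1 - p)^{k} = (1 - p)^{m}$ .

```python
from fractions import Fraction


def survival(k: int, p: Fraction) -> Fraction:
    """P(X > k), computed as 1 minus the probability of a success within k trials."""
    return 1 - sum((1 - p) ** (j - 1) * p for j in range(1, k + 1))


p = Fraction(1, 2)  # fair coin, "success" = heads
k, m = 10, 5        # 10 failures so far, 5 additional flips

conditional = survival(k + m, p) / survival(k, p)  # P(X > k + m | X > k)
unconditional = survival(m, p)                     # P(X > m)
print(conditional, unconditional)  # 1/32 1/32
```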

Extending to Multiple Successes: The Negative Binomial Distribution

The negative binomial distribution is a direct generalization of the geometric distribution. Instead of waiting for the first success, it models the number of trials needed to achieve the $r$ -th success, where $r$ is a fixed positive integer.

Let the random variable $Y$ represent the number of trials required to get the $r$ -th success. For $Y$ to equal $k$ , the $k$ -th trial must be a success (the $r$ -th one), and the previous $k - 1$ trials must contain exactly $r - 1$ successes and $(k - 1) - (r - 1) = k - r$ failures. The number of ways to arrange these first $r - 1$ successes among the first $k - 1$ trials is given by the binomial coefficient $\binom{k - 1}{r - 1}$ .

Therefore, the PMF for $Y \sim NegativeBinomial (r, p)$ is:

$P (Y = k) = \binom{k - 1}{r - 1} p^{r} (1 - p)^{k - r}$

for $k = r, r + 1, r + 2, ...$ . Notice that when $r = 1$ , this simplifies exactly to the geometric PMF: $\binom{k - 1}{0} p^{1} (1 - p)^{k - 1} = (1 - p)^{k - 1} p$ .

Example: Returning to the light bulb inspector with $p = 0.02$ , the probability that the third defective bulb is found on the 50th test is: $P (Y = 50) = \binom{49}{2} (0.02)^{3} (0.98)^{47} = 1176 \times (0.02)^{3} (0.98)^{47} \approx 0.00364$ .
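
The negative binomial PMF can be evaluated the same way. This is a minimal sketch using `math.comb` from the standard library (Python 3.8 or later); the function name `negative_binomial_pmf` is an illustrative choice. It evaluates the light bulb example and checks the $r = 1$ reduction to the geometric PMF.

```python
from math import comb


def negative_binomial_pmf(k: int, r: int, p: float) -> float:
    """P(Y = k): the r-th success occurs on trial k (k = r, r + 1, ...)."""
    if k < r:
        return 0.0
    # Choose positions for the first r - 1 successes among the first k - 1 trials
    return comb(k - 1, r - 1) * p ** r * (1 - p) ** (k - r)


# Third defective bulb found on the 50th test, with a 2% defect rate
print(negative_binomial_pmf(50, 3, 0.02))

# With r = 1 the formula reduces to the geometric PMF (1 - p)^(k - 1) * p
print(negative_binomial_pmf(10, 1, 0.02), 0.98 ** 9 * 0.02)
```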

The expected value and variance for the negative binomial distribution scale with $r$ . The expected value is $E [Y] = \frac{r}{p}$ . To get $r$ successes, you expect to need $r$ times the average wait for one success. The variance is $Var (Y) = \frac{r ( 1 - p )}{p ^{2}}$ .
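
A quick simulation can make these formulas concrete. The sketch below repeats Bernoulli trials until the $r$ -th success and compares sample statistics against $r / p$ and $r (1 - p) / p^{2}$ . The values $r = 3$ , $p = 0.3$ , the seed, and the sample size are arbitrary choices for illustration, and the sample statistics will only approximate the theoretical ones.

```python
import random
from statistics import fmean, pvariance


def trials_until_rth_success(r: int, p: float, rng: random.Random) -> int:
    """Run Bernoulli(p) trials until the r-th success; return the trial count."""
    trials = 0
    successes = 0
    while successes < r:
        trials += 1
        if rng.random() < p:
            successes += 1
    return trials


rng = random.Random(0)  # fixed seed so the run is repeatable
r, p = 3, 0.3           # illustrative values, not from the text
samples = [trials_until_rth_success(r, p, rng) for _ in range(20_000)]

print(fmean(samples), r / p)                     # sample mean vs. r / p
print(pvariance(samples), r * (1 - p) / p ** 2)  # sample variance vs. r(1 - p) / p^2
```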

Applications in Reliability and Quality Testing

These distributions are powerful tools for modeling and decision-making in applied settings.

Reliability Engineering: The geometric distribution can model the number of cycles or time periods until a mechanical component fails (with discretized time). The memoryless property implies that the component does not "wear out": each cycle carries the same failure risk, regardless of how many cycles the component has already survived. The negative binomial can model the number of cycles until the $r$ -th system failure.
Quality Control and Acceptance Sampling: A classic application is in sequential analysis. An inspector might decide to reject a batch if a certain number of defects ( $r$ ) are found within a given number of items tested. The negative binomial distribution directly models the "number tested until the $r$ -th defect," helping to design efficient sampling plans that balance inspection cost with risk.
Clinical Trials & Marketing: In a clinical trial, you might enroll patients until you observe $r$ patients who respond to a treatment. In marketing, you might run ads until you achieve $r$ conversions. The negative binomial distribution helps estimate the total campaign size or trial duration needed and the associated costs.
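
For planning questions like the ones above, the mean $r / p$ gives only the average requirement. A sketch of a more cautious estimate is to find the smallest number of trials $n$ for which $P (Y \leq n)$ reaches a chosen confidence level. The function name `trials_needed`, the 5% conversion rate, and the target of 20 conversions are all hypothetical, and the plain summation is only intended for modest values of $r$ and $n$ .

```python
from math import comb


def trials_needed(r: int, p: float, confidence: float) -> int:
    """Smallest n with P(Y <= n) >= confidence, where Y is the trial of the r-th success."""
    if not (r >= 1 and 0 < p <= 1 and 0 < confidence < 1):
        raise ValueError("need r >= 1, 0 < p <= 1, 0 < confidence < 1")
    cumulative = 0.0
    n = r - 1
    while cumulative < confidence:
        n += 1
        # Add P(Y = n) = C(n - 1, r - 1) * p^r * (1 - p)^(n - r)
        cumulative += comb(n - 1, r - 1) * p ** r * (1 - p) ** (n - r)
    return n


# Hypothetical campaign: 5% conversion rate, 20 conversions wanted
print(20 / 0.05)                      # average number of impressions, r / p
print(trials_needed(20, 0.05, 0.95))  # impressions needed to be 95% sure
```

The gap between the two printed numbers shows how much headroom a plan needs beyond the average case.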

Common Pitfalls

Confusing the Definition of the Random Variable: The most common error is mixing up "trials until the first success" with "failures before the first success." The geometric distribution defined here counts the trial on which success occurs ( $k \geq 1$ ). An alternative definition counts the number of failures before the first success ( $k \geq 0$ ). Always check which definition your textbook or software uses: the PMF and the mean differ between the two (the mean shifts from $1 / p$ to $(1 - p) / p$ ), although the variance is the same because the two variables differ only by a constant.
Misapplying the Memoryless Property: This property holds only for the geometric distribution (and its continuous counterpart, the exponential). A negative binomial waiting time with $r > 1$ is not memoryless: the remaining wait depends on how many successes have already occurred, because each success reduces the number still needed. Only the gap between one success and the next is geometric, and therefore memoryless.
Using the Wrong Distribution for a Fixed Number of Trials: If the number of trials $n$ is fixed in advance and you are counting the number of successes, you use the binomial distribution. The geometric and negative binomial distributions are used when the number of trials is the variable you are modeling, and the number of successes is fixed.
Ignoring the Independence Assumption: Both distributions require independent Bernoulli trials with constant probability $p$ . If the probability of success changes between trials (e.g., sampling without replacement from a small batch), these models are invalid. Always verify that the "i.i.d." (independent and identically distributed) assumption holds for your scenario.
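
The first pitfall above, the two competing definitions of the geometric random variable, is easy to see side by side. In this sketch the function names `pmf_trials` and `pmf_failures` and the value $p = 0.25$ are illustrative; the point is that the same event gets index $k$ under one convention and $k - 1$ under the other.

```python
def pmf_trials(k: int, p: float) -> float:
    """Trial-count convention: success on trial k, k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p if k >= 1 else 0.0


def pmf_failures(j: int, p: float) -> float:
    """Failure-count convention: j failures before the first success, j = 0, 1, 2, ..."""
    return (1 - p) ** j * p if j >= 0 else 0.0


p = 0.25
# The same event under both conventions: success on trial 4 = 3 failures first
print(pmf_trials(4, p), pmf_failures(3, p))

# Means differ by exactly one trial: 1/p versus (1 - p)/p
print(1 / p, (1 - p) / p)  # 4.0 3.0
```

When calling a statistics library, it is worth checking its documentation for which of these two supports it uses before plugging in $k$ .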

Summary

The geometric distribution models the number of independent trials needed to achieve the first success. Its PMF is $P (X = k) = (1 - p)^{k - 1} p$ , with mean $1/ p$ and a unique memoryless property.
The negative binomial distribution generalizes this, modeling the number of trials needed to achieve the $r$ -th success. Its PMF is $P (Y = k) = \binom{k - 1}{r - 1} p^{r} (1 - p)^{k - r}$ , with mean $r / p$ .
These distributions are fundamental for modeling waiting times in applications like reliability analysis, where you model cycles until failure, and quality testing, where you model items inspected until finding a target number of defects.
Avoid common mistakes by carefully defining your random variable, remembering the strict limits of the memoryless property, and ensuring your data meets the assumption of independent, identical trials.
