IB AA: Probability Distributions

Probability distributions are the mathematical engines that power statistical inference, risk assessment, and data-driven decision-making. Understanding them transforms raw data into meaningful predictions, allowing you to quantify uncertainty in everything from genetics to finance. In your IB AA course, mastering the binomial and normal distributions provides a crucial toolkit for modeling both countable events and measurable phenomena in the world around you.

Discrete Random Variables and Their Characteristics

A discrete random variable is one whose possible values can be listed, often because they result from counting. The number of defective items in a batch, the result of rolling a die, and the number of goals in a soccer match are all examples. The behavior of such a variable is fully described by its probability distribution function (pdf), which lists each possible value $x$ and its associated probability $P(X = x)$. Each probability lies between 0 and 1, and together they sum to 1.

From the pdf, we derive two fundamental summary measures: expected value (mean) and variance. The expected value, denoted $E(X)$ or $μ$, represents the long-run average outcome if the experiment were repeated a very large number of times. It is calculated as a weighted average: $E(X) = \sum x \cdot P(X = x)$. The variance, $\mathrm{Var}(X)$ or $σ^{2}$, quantifies the spread or variability of the distribution around the mean. It is the expected value of the squared deviation from the mean: $\mathrm{Var}(X) = E((X - μ)^{2}) = \sum (x - μ)^{2} \cdot P(X = x)$. A related measure, the standard deviation $σ$, is simply the square root of the variance and is expressed in the same units as the variable itself.
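The two summation formulas can be checked on a small case. The sketch below uses a fair six-sided die as an illustrative distribution; the function names `expected_value` and `variance` are chosen for this example and are not part of any syllabus or library.

```python
from fractions import Fraction

# Illustrative distribution: a fair six-sided die, each face with probability 1/6.
distribution = {x: Fraction(1, 6) for x in range(1, 7)}

def expected_value(dist):
    """Weighted average: sum of x * P(X = x)."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Expected squared deviation from the mean: sum of (x - mu)^2 * P(X = x)."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

mean = expected_value(distribution)
var = variance(distribution)
std_dev = float(var) ** 0.5  # standard deviation is the square root of the variance

print(mean, var, round(std_dev, 4))
```

Exact fractions are used so that the mean and variance come out as $7/2$ and $35/12$ without rounding error.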

The Binomial Distribution

The binomial distribution is a specific and vital model for a common type of discrete scenario. It applies when you have a fixed number $n$ of independent trials, each resulting in just one of two outcomes: success (with constant probability $p$ ) or failure (probability $1 - p$ ). The discrete random variable $X$ counts the total number of successes.

A binomial setting is defined by four conditions, often remembered as BINS: Binary outcomes, Independent trials, fixed Number of trials, and constant Success probability $p$. If these conditions are met, we say $X \sim B(n, p)$. The probability of getting exactly $k$ successes is given by the formula: $P(X = k) = \binom{n}{k} p^{k} (1 - p)^{n - k}$, where $\binom{n}{k} = \frac{n!}{k!(n - k)!}$ is the binomial coefficient, counting the number of ways to choose $k$ successes from $n$ trials.
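As a rough illustration of the binomial formula, the sketch below evaluates it with Python's `math.comb`. The helper name `binomial_pmf` and the coin-flip numbers ($n = 5$, $p = 0.5$) are assumptions made for this example only.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Illustrative case: exactly 2 heads in 5 flips of a fair coin.
n, p = 5, 0.5
prob_two_heads = binomial_pmf(2, n, p)

# The probabilities over k = 0, 1, ..., n should add up to 1.
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))

print(prob_two_heads, total)
```

Here $\binom{5}{2} = 10$ and each sequence of five flips has probability $1/32$, so the result is $10/32 = 0.3125$.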

For any binomial distribution, the expected value and variance have straightforward formulas: $E(X) = np$ and $\mathrm{Var}(X) = np(1 - p)$. For example, if you take a 10-question multiple-choice quiz by guessing randomly ($p = 0.25$ per question), the expected number of correct answers is $10 \times 0.25 = 2.5$, with a variance of $10 \times 0.25 \times 0.75 = 1.875$.
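One way to convince yourself that the shortcut formulas agree with the general definitions is to compute the mean and variance of the quiz example directly from its probabilities. This is a minimal check rather than a proof, and the variable names are chosen for this sketch.

```python
from math import comb

n, p = 10, 0.25  # the guessing-on-a-quiz example from the text

# Full probability distribution of X ~ B(10, 0.25).
pmf = {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

# Mean and variance from the general summation definitions.
mean_from_sum = sum(k * prob for k, prob in pmf.items())
var_from_sum = sum((k - mean_from_sum) ** 2 * prob for k, prob in pmf.items())

# Mean and variance from the binomial shortcut formulas.
mean_from_formula = n * p
var_from_formula = n * p * (1 - p)

print(mean_from_sum, mean_from_formula)
print(var_from_sum, var_from_formula)
```

Both routes should give 2.5 and 1.875, up to floating-point rounding.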

Continuous Random Variables and the Normal Distribution

In contrast to discrete variables, a continuous random variable can take any value within an interval. Examples include height, time, or mass. Because there are infinitely many possible values, the probability that a continuous variable equals any single, exact number is zero. Instead, we model probabilities using areas under a probability density function (pdf) curve. The total area under the entire curve is always 1.

The most important continuous model is the normal distribution, characterized by its symmetrical, bell-shaped curve. It is defined entirely by its mean $μ$ (the center of symmetry) and its standard deviation $σ$ (which controls the spread). We denote this as $X \sim N (μ, σ^{2})$ . The distribution follows the empirical rule: approximately 68% of data lies within $μ \pm σ$ , 95% within $μ \pm 2 σ$ , and 99.7% within $μ \pm 3 σ$ .
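The empirical rule percentages can be reproduced numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) in place of a GDC; because the rule holds for every normal distribution, the standard normal is used here.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Area within k standard deviations of the mean, for k = 1, 2, 3.
within = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, area in within.items():
    print(f"within {k} sd: {area:.4f}")
```

The areas come out close to 0.6827, 0.9545, and 0.9973, which the rule rounds to 68%, 95%, and 99.7%.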

Standardization and the Inverse Normal

Since every pair $(μ, σ)$ gives a unique normal curve, we use standardization to compare and calculate probabilities across all of them. This process converts any normal value $x$ into a z-score, which tells you how many standard deviations $x$ is from the mean. The formula is: $z = \frac{x - μ}{σ} .$ This transforms our normal distribution $N (μ, σ^{2})$ into the standard normal distribution $Z \sim N (0, 1)$ , which has a mean of 0 and standard deviation of 1. You can then use statistical tables or your GDC to find probabilities like $P (Z < z)$ .
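A short sketch of standardization follows, again with `statistics.NormalDist` standing in for tables or a GDC. The values $μ = 170$, $σ = 10$, and $x = 185$ are hypothetical numbers chosen for illustration.

```python
from statistics import NormalDist

# Hypothetical example: heights with mean 170 cm and standard deviation 10 cm.
mu, sigma = 170, 10
x = 185

# Standardize: how many standard deviations is x above the mean?
z = (x - mu) / sigma

# P(Z < z) on the standard normal ...
prob_via_z = NormalDist(0, 1).cdf(z)

# ... equals P(X < x) on the original distribution.
prob_direct = NormalDist(mu, sigma).cdf(x)

print(z, round(prob_via_z, 4), round(prob_direct, 4))
```

Both probabilities agree (about 0.9332 for $z = 1.5$), which is the point of standardizing: one standard curve serves every normal distribution.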

Often, you need to work backwards: given a probability or percentile, you must find the corresponding data value $x$ . This is an inverse normal calculation. For instance, if you know that the top 10% of test scores receive an A, you would find the z-score $z$ such that $P (Z > z) = 0.10$ . Once you have $z$ , you "un-standardize" using the rearrangement of the z-score formula: $x = μ + z σ .$ Your GDC's InvNorm function performs this entire process directly when you input the area (probability) to the left, the mean, and the standard deviation.
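The "top 10%" example can be sketched in code, with `NormalDist.inv_cdf` playing the role of the GDC's InvNorm function. The mean of 70 and standard deviation of 8 are hypothetical test-score parameters, not values from the text.

```python
from statistics import NormalDist

# Hypothetical test scores: mean 70, standard deviation 8.
mu, sigma = 70, 8

# Top 10% means P(Z > z) = 0.10, so the area to the LEFT of z is 0.90.
area_left = 1 - 0.10
z = NormalDist(0, 1).inv_cdf(area_left)

# Un-standardize to get the cutoff score.
cutoff_via_z = mu + z * sigma

# Same result in one step, as a GDC would do with mean and sd supplied.
cutoff_direct = NormalDist(mu, sigma).inv_cdf(area_left)

print(round(z, 4), round(cutoff_via_z, 2), round(cutoff_direct, 2))
```

With these assumed parameters the z-score is about 1.28 and the cutoff is a little above 80 marks.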

Common Pitfalls

Misapplying the Binomial Distribution: The most frequent error is forcing the binomial model onto a situation that violates its assumptions. For example, selecting people without replacement from a small group violates independence. Always check the BINS criteria before using $B (n, p)$ .

Correction: Ask: Are the outcomes binary? Are the trials independent? Is $n$ fixed in advance? Is $p$ constant? If any answer is "no," consider a different model like the hypergeometric.
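To see how much the choice of model can matter, the sketch below compares the two on a made-up small group: 10 people of whom 4 are left-handed, with 3 selected without replacement. The helper names and the numbers are assumptions for this illustration.

```python
from math import comb

def hypergeometric_pmf(k, population, successes, draws):
    """P(exactly k successes) when drawing without replacement."""
    return (
        comb(successes, k)
        * comb(population - successes, draws - k)
        / comb(population, draws)
    )

def binomial_pmf(k, n, p):
    """P(exactly k successes) assuming independent trials with constant p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Made-up group: 10 people, 4 left-handed, choose 3 without replacement.
exact = hypergeometric_pmf(2, population=10, successes=4, draws=3)

# Misapplied binomial model treating each pick as independent with p = 0.4.
misapplied = binomial_pmf(2, n=3, p=0.4)

print(exact, misapplied)
```

The hypergeometric answer is 0.3, while the misapplied binomial model gives 0.288. The gap shrinks as the group becomes large relative to the sample.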

Confusing Discrete and Continuous Probability Statements: For a continuous variable, $P(X = a) = 0$. Asking for the probability of an exact height like 170.0 cm therefore always gives zero, yet students often try to calculate it as if the variable were discrete.

Correction: Always ask for probabilities as intervals: $P (X < a)$ , $P (a < X < b)$ . For a continuous distribution, $P (X < a)$ and $P (X \leq a)$ are equivalent.

Incorrect Area Interpretation for Inverse Normal: When using your GDC's InvNorm function, you must correctly specify the area to the left of the desired value. Confusing "top 5%" with an area of 0.05 is a critical mistake.

Correction: If the problem states "the highest 5%," the area to the left of the cutoff value is $1 - 0.05 = 0.95$. Always sketch a bell curve and shade the region corresponding to the given probability so that you do not enter the wrong tail.
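The effect of entering the wrong area is easy to see numerically. In this sketch `NormalDist.inv_cdf` stands in for InvNorm, and the standard normal is used so that the output is a z-score.

```python
from statistics import NormalDist

standard_normal = NormalDist(0, 1)

# Mistake: entering 0.05 gives the cutoff for the LOWEST 5%.
z_wrong = standard_normal.inv_cdf(0.05)

# Correct: the highest 5% starts where the area to the left is 0.95.
z_right = standard_normal.inv_cdf(1 - 0.05)

print(round(z_wrong, 4), round(z_right, 4))
```

The two cutoffs are mirror images, roughly $-1.645$ and $+1.645$, so the wrong area places the cutoff on the opposite side of the mean.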

Forgetting to "Un-Standardize": After finding a z-score from a table or calculation, students sometimes forget the final step of converting it back to a value in the original context using $x = μ + z σ$ .

Correction: Treat the z-score as an intermediate step. Always end your answer by stating the value in the terms of the original problem (e.g., "the minimum mass is 12.4 grams").

Summary

Discrete random variables take countable values, described by a probability distribution function (pdf). Their behavior is summarized by the expected value (mean, $μ$ ) and variance ( $σ^{2}$ , a measure of spread).
The binomial distribution $B(n, p)$ models the count of successes in $n$ independent trials with constant success probability $p$. Its probabilities are calculated with the binomial formula, with $E(X) = np$ and $\mathrm{Var}(X) = np(1 - p)$.
Continuous random variables (like height or time) are modeled by probability density functions, where probabilities correspond to areas under the curve.
The normal distribution $N (μ, σ^{2})$ is the fundamental continuous model, defined by its mean (center) and standard deviation (spread). Probabilities are found using the empirical rule or technology.
Standardization via the z-score formula $z = (x - μ) / σ$ converts any normal distribution to the standard normal $N (0, 1)$ , enabling universal probability lookup.
Inverse normal calculations use a known probability (area) to find a corresponding data value $x$ , often requiring you to find a z-score first and then use $x = μ + z σ$ .
