Probability Distributions and Expected Value

Understanding probability distributions is fundamental to modeling the uncertainty inherent in the real world. Whether predicting exam scores, manufacturing defects, or financial returns, these mathematical models allow you to quantify randomness and make informed predictions. Mastering discrete and continuous distributions, along with their parameters like expected value, is a core skill for IB Mathematics, enabling you to solve complex problems with confidence.

What is a Probability Distribution?

A probability distribution is a mathematical function that provides the probabilities of occurrence of all possible outcomes for a random variable. It is the complete description of how probability is spread across these outcomes. The key distinction lies in the type of random variable being modeled.

A discrete random variable can only take on a finite or countably infinite number of values. Its distribution is defined by a probability mass function (PMF), which gives the probability $P (X = x)$ for each specific value $x$ . The sum of all probabilities must equal 1: $\sum P (X = x) = 1$ . The associated cumulative distribution function (CDF), $F (x) = P (X \leq x)$ , gives the probability that the variable is less than or equal to a certain value.

In contrast, a continuous random variable can take on any value within an interval or collection of intervals. Its probability is described by a probability density function (PDF), denoted $f(x)$. Crucially, for a continuous variable, the probability of it taking any single exact value is zero; probability is only meaningful over an interval. The probability that $X$ lies between $a$ and $b$ is found by calculating the area under the PDF curve: $P(a < X < b) = \int_{a}^{b} f(x) \, dx$. The total area under the PDF must equal 1.
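To see the "area under the curve" idea in action, here is a minimal sketch using a made-up density, $f(x) = 2x$ on $[0, 1]$. The density, the helper names `pdf` and `area`, and the choice of the midpoint rule are all assumptions for illustration; on an exam you would integrate by hand or use your GDC.

```python
def pdf(x):
    """Hypothetical density f(x) = 2x on [0, 1], zero elsewhere (illustration only)."""
    return 2 * x if 0 <= x <= 1 else 0.0

def area(f, a, b, steps=10_000):
    """Approximate the integral of f from a to b with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

total_area = area(pdf, 0, 1)          # should be very close to 1
prob_interval = area(pdf, 0.25, 0.5)  # P(0.25 < X < 0.5), about 0.1875
prob_point = area(pdf, 0.4, 0.4)      # interval of zero width, so the area is 0

print(total_area, prob_interval, prob_point)
```

The last line illustrates why $P(X = a) = 0$ for a continuous variable: an interval of zero width has no area.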

Expected Value, Variance, and Standard Deviation

The expected value (or mean) of a random variable, denoted $E (X)$ or $μ$ , is the long-run average value of the variable after many repetitions of the experiment. It represents the distribution's center of mass.

For a discrete variable: $E(X) = μ = \sum x \cdot P(X = x)$. For a continuous variable: $E(X) = μ = \int_{-\infty}^{\infty} x \cdot f(x) \, dx$.

While the expected value tells you the central tendency, the variance ($\text{Var}(X)$ or $σ^{2}$) and standard deviation ($σ$) quantify the spread or dispersion of the distribution around the mean. The variance is the average squared deviation from the expected value, and the standard deviation is its square root, expressed in the same units as $X$.

For a discrete variable: $\text{Var}(X) = σ^{2} = \sum (x - μ)^{2} \cdot P(X = x) = E(X^{2}) - [E(X)]^{2}$. The standard deviation is $σ = \sqrt{\text{Var}(X)}$.

For a continuous variable: $\text{Var}(X) = σ^{2} = \int_{-\infty}^{\infty} (x - μ)^{2} \cdot f(x) \, dx$.
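As a quick check of the discrete formulas, the sketch below computes the mean and variance of a fair six-sided die. The die example and the helper name `discrete_moments` are assumptions chosen for illustration; exact fractions are used so the results are not blurred by rounding.

```python
from fractions import Fraction
from math import sqrt

def discrete_moments(pmf):
    """Return (mean, variance) for a PMF given as {value: probability}."""
    total = sum(pmf.values())
    if abs(total - 1) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    mean = sum(x * p for x, p in pmf.items())              # E(X)
    second_moment = sum(x**2 * p for x, p in pmf.items())  # E(X^2)
    variance = second_moment - mean**2                     # E(X^2) - [E(X)]^2
    return mean, variance

# Fair die: each face 1..6 has probability 1/6 (illustrative example).
die = {face: Fraction(1, 6) for face in range(1, 7)}
mean, variance = discrete_moments(die)
std_dev = sqrt(variance)

print(mean, variance, std_dev)  # mean 7/2, variance 35/12
```

Note that the mean of 3.5 is not a face of the die, which is a reminder that the expected value is a long-run average rather than a possible outcome.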

These two parameters, the mean and the standard deviation, are the primary tools for describing and comparing different probability distributions.

The Binomial Distribution

The binomial distribution is a quintessential discrete distribution used to model situations with a fixed number of independent trials, each having two possible outcomes: success (with constant probability $p$ ) or failure (with probability $q = 1 - p$ ). It answers the question: "What is the probability of getting exactly $k$ successes in $n$ trials?"

A random variable $X \sim B (n, p)$ follows the binomial distribution if it meets these binomial conditions: a fixed number $n$ of identical trials, independence between trials, only two outcomes per trial, and a constant probability $p$ of success.

The probability mass function is given by: $P(X = k) = \binom{n}{k} p^{k} (1 - p)^{n - k}$ for $k = 0, 1, 2, \ldots, n$, where $\binom{n}{k} = \frac{n!}{k! \, (n - k)!}$ is the binomial coefficient.

Its mean and variance follow directly from $n$ and $p$: $E(X) = np$ and $\text{Var}(X) = np(1 - p)$. For example, if you take a 10-question multiple-choice quiz by guessing (each with 4 choices), the number of correct answers is modeled by $X \sim B(10, 0.25)$. The expected number correct is $E(X) = 10 \times 0.25 = 2.5$, and the variance is $\text{Var}(X) = 10 \times 0.25 \times 0.75 = 1.875$.

On your GDC, you use the binomialPdf(n, p, k) function to find $P (X = k)$ and the binomialCdf(n, p, k) function to find $P (X \leq k)$ .
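If you want to check GDC output away from the calculator, the PMF formula can be coded directly. The following is a minimal sketch in Python; the function names `binomial_pmf` and `binomial_cdf` are hypothetical stand-ins that mirror the argument order of the GDC functions above, not the calculator's own implementation.

```python
from math import comb

def binomial_pmf(n, p, k):
    """P(X = k) for X ~ B(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binomial_cdf(n, p, k):
    """P(X <= k), found by summing the PMF from 0 up to k."""
    return sum(binomial_pmf(n, p, i) for i in range(k + 1))

# Guessing on a 10-question quiz with 4 choices each: X ~ B(10, 0.25).
n, p = 10, 0.25
p_exactly_2 = binomial_pmf(n, p, 2)       # about 0.2816
p_at_most_2 = binomial_cdf(n, p, 2)       # about 0.5256
p_at_least_3 = 1 - binomial_cdf(n, p, 2)  # complement of "at most 2"

# The mean computed from the definition should agree with n * p.
mean_from_pmf = sum(k * binomial_pmf(n, p, k) for k in range(n + 1))

print(p_exactly_2, p_at_most_2, p_at_least_3, mean_from_pmf)
```

The complement step is the usual way to handle "at least" questions: $P(X \geq 3) = 1 - P(X \leq 2)$.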

The Normal Distribution

The normal distribution is the most important continuous distribution, describing many natural phenomena like heights, test scores, and measurement errors. It is symmetric, bell-shaped, and completely defined by its mean ( $μ$ ) and standard deviation ( $σ$ ), denoted $X \sim N (μ, σ^{2})$ .

The PDF of the normal distribution is the familiar bell curve: $f(x) = \frac{1}{σ \sqrt{2π}} e^{-\frac{1}{2} \left( \frac{x - μ}{σ} \right)^{2}}$. While you won't integrate this manually, understanding its shape is key. The empirical rule (68-95-99.7 rule) states that approximately 68% of data lies within $\pm 1σ$ of $μ$, 95% within $\pm 2σ$, and 99.7% within $\pm 3σ$.
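The empirical rule can be verified numerically. This short sketch uses `NormalDist` from Python's standard `statistics` module (available from Python 3.8); the variable names are illustrative assumptions.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# P(-k < Z < k) for k = 1, 2, 3 standard deviations from the mean.
within = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

for k, prob in within.items():
    print(f"within {k} sd: {prob:.4f}")  # roughly 0.6827, 0.9545, 0.9973
```

Because standardizing maps every normal distribution onto $Z$, the same three proportions hold for any $μ$ and $σ$.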

To find probabilities, you standardize a value to a z-score: $z = \frac{x - μ}{σ}$. This transforms any normal distribution $N(μ, σ^{2})$ into the standard normal distribution $Z \sim N(0, 1)$. You then use the normal CDF function on your GDC. For $X \sim N(100, 15^{2})$, to find $P(X > 130)$, first calculate $z = (130 - 100)/15 = 2$. Then, use normalCdf(2, ∞) or its equivalent to find the area to the right, which is approximately 0.0228. Your GDC can also perform inverse normal calculations: given a probability, it finds the corresponding $x$ or $z$ value.
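The worked example above can be reproduced with Python's standard `statistics.NormalDist` as a way of checking calculator results. This is a sketch under that assumption; the variable names are illustrative, and the 0.90 used for the inverse calculation is an arbitrary example probability.

```python
from statistics import NormalDist

X = NormalDist(mu=100, sigma=15)  # X ~ N(100, 15^2)

# Standardize x = 130, then find the right-tail area two ways.
z = (130 - X.mean) / X.stdev             # z = 2.0
p_right = 1 - X.cdf(130)                 # P(X > 130) using X directly
p_right_z = 1 - NormalDist().cdf(z)      # P(Z > 2) using the standard normal

# Inverse normal: the value x with P(X <= x) = 0.90.
x_cut = X.inv_cdf(0.90)

print(z, p_right, p_right_z, x_cut)
```

Both tail calculations give about 0.0228, which shows that standardizing changes the scale but not the probability.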

Common Pitfalls

A major error is confusing discrete and continuous contexts. Remember, for a continuous distribution, $P(X = a) = 0$, so you must always calculate probability over an interval. The probability that a train arrives at exactly the scheduled instant is zero; the useful question is the probability that it arrives within, say, a 1-minute window.

Another common mistake is misapplying the binomial distribution without verifying all four conditions. Dependence between trials or a changing probability $p$ invalidates the model. For instance, drawing cards without replacement is not binomial because the probability changes with each draw.

Students often forget to use the correct GDC function, confusing Pdf and Cdf. Use Pdf for the probability of a single, exact outcome in discrete distributions. Use Cdf for the probability of a range of outcomes ( $P (X \leq k)$ ). For the normal distribution, you almost exclusively use the normalCdf function to find areas (probabilities).

Finally, ensure you interpret expected value correctly. It is a long-term average, not necessarily a value that will occur in a single trial. An expected value of 2.5 correct answers on a quiz doesn't mean you can get a half-correct answer on one attempt; it means over many quizzes, your average would be 2.5.

Summary

A probability distribution describes all possible outcomes of a random variable and their associated probabilities, with discrete variables using a PMF and continuous variables using a PDF.
The expected value ( $E (X)$ ) is the distribution's long-run average, while variance and standard deviation quantify the spread of data around this mean.
The binomial distribution $B(n, p)$ models the count of successes in a fixed number of independent trials and has simple formulas for its mean and variance: $E(X) = np$, $\text{Var}(X) = np(1 - p)$.
The normal distribution $N (μ, σ^{2})$ is a symmetric, continuous distribution defined by its mean and standard deviation; probabilities are found by calculating areas under the curve, typically using z-scores and GDC functions.
Success with these topics hinges on correctly identifying the type of random variable, selecting the appropriate distribution model, and using your GDC's Pdf, Cdf, and inverse functions accurately.
