IB AA: Continuous Random Variables
Continuous random variables form the mathematical backbone for modeling uncountably infinite outcomes, such as time, distance, or temperature. Mastering them is crucial for IB Analysis & Approaches, as they bridge calculus with real-world probability and are a frequent exam topic requiring both conceptual understanding and computational skill.
Probability Density Functions: The Foundation
A probability density function (PDF), denoted $f(x)$, describes the relative likelihood of a continuous random variable $X$ taking on a value near $x$. Unlike discrete probability, where $P(X = x)$ has meaning, for continuous variables the probability at any single point is zero: $P(X = x) = 0$. Instead, probability is defined over intervals as the area under the PDF curve. Every valid PDF must satisfy two key properties. First, it must be non-negative for all $x$: $f(x) \ge 0$. Second, the total area under the curve must equal 1, representing certainty:
$$\int_{-\infty}^{\infty} f(x)\,dx = 1.$$
The process of ensuring the second property holds is called normalization. You will often be given a function proportional to a PDF, such as $f(x) = kx$ for $0 \le x \le 2$, and must find the constant $k$ that makes it a valid PDF. You do this by setting the integral equal to 1: $\int_0^2 kx\,dx = 2k = 1$. Solving gives $k = \tfrac{1}{2}$, so $f(x) = \tfrac{x}{2}$. Think of the PDF as a smooth histogram; its height shows density, not probability, and the total area of all bars sums to 1.
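The normalization step can be sanity-checked numerically: integrate the unnormalized function and take the reciprocal. The sketch below is a minimal illustration in Python; the helper `integrate` and the step count are assumptions for this example, not part of the notes.

```python
# Find the normalization constant k for g(x) = x on [0, 2],
# so that f(x) = k * g(x) integrates to 1.

def integrate(f, a, b, n=100_000):
    """Midpoint Riemann sum of f over [a, b] with n subintervals."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

g = lambda x: x              # function proportional to the PDF
total = integrate(g, 0, 2)   # integral of x over [0, 2] is 2
k = 1 / total                # normalization constant, k = 1/2
print(k)                     # ≈ 0.5
```

The midpoint rule is exact for linear functions, so the numerical answer matches the hand calculation $k = \tfrac{1}{2}$ here.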
Cumulative Distribution Functions: From Density to Probability
The cumulative distribution function (CDF), denoted $F(x)$, gives the probability that $X$ is less than or equal to a specific value: $F(x) = P(X \le x)$. It is computed directly from the PDF by integration:
$$F(x) = \int_{-\infty}^{x} f(t)\,dt.$$
The CDF is a non-decreasing function that ranges from 0 to 1, with $F(x) \to 0$ as $x \to -\infty$ and $F(x) \to 1$ as $x \to \infty$.
To find probabilities for an interval $[a, b]$, you use the CDF: $P(a \le X \le b) = F(b) - F(a)$. The fundamental theorem of calculus shows the inverse relationship: the PDF is the derivative of the CDF, $f(x) = F'(x)$, wherever the derivative exists. For a worked example, consider the PDF $f(x) = \tfrac{x}{2}$ for $0 \le x \le 2$. Its CDF is $F(x) = \int_0^x \tfrac{t}{2}\,dt = \tfrac{x^2}{4}$ for $0 \le x \le 2$. To find $P(1 \le X \le 2)$, compute $F(2) - F(1) = 1 - \tfrac{1}{4} = \tfrac{3}{4}$.
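The CDF-difference rule for interval probabilities translates directly into code. A minimal sketch, hard-coding the example CDF $F(x) = x^2/4$ on $[0, 2]$ (function names are illustrative):

```python
# CDF of the example PDF f(x) = x/2 on [0, 2], clamped outside the support.
def F(x):
    if x <= 0:
        return 0.0
    if x >= 2:
        return 1.0
    return x * x / 4

def interval_prob(a, b):
    """P(a <= X <= b) as a difference of CDF values."""
    return F(b) - F(a)

print(interval_prob(1, 2))   # F(2) - F(1) = 1 - 0.25 = 0.75
```

Clamping $F$ to 0 below the support and 1 above it mirrors the piecewise definition of a CDF and keeps interval probabilities correct for any endpoints.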
Expected Value and Variance: Measuring Center and Spread
The expected value (mean) of a continuous random variable, denoted $E(X)$ or $\mu$, is the long-run average outcome, weighted by probability density. It is calculated by integrating $x$ times the PDF over all possible values:
$$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx.$$
Variance, denoted $\operatorname{Var}(X)$ or $\sigma^2$, measures the spread or dispersion around the mean, defined as the expected value of the squared deviation: $\operatorname{Var}(X) = E\big[(X - \mu)^2\big]$. A more computational formula is $\operatorname{Var}(X) = E(X^2) - [E(X)]^2$, where $E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx$. The standard deviation is simply the square root of the variance: $\sigma = \sqrt{\operatorname{Var}(X)}$.
Let's compute these for the PDF $f(x) = \tfrac{x}{2}$ on $[0, 2]$. First, the mean:
$$E(X) = \int_0^2 x \cdot \tfrac{x}{2}\,dx = \left[\tfrac{x^3}{6}\right]_0^2 = \tfrac{4}{3}.$$
Next, find $E(X^2)$:
$$E(X^2) = \int_0^2 x^2 \cdot \tfrac{x}{2}\,dx = \left[\tfrac{x^4}{8}\right]_0^2 = 2.$$
Thus, the variance is $\operatorname{Var}(X) = 2 - \left(\tfrac{4}{3}\right)^2 = 2 - \tfrac{16}{9} = \tfrac{2}{9}$.
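These integrals are easy to verify numerically. The sketch below approximates $E(X)$ and $E(X^2)$ for $f(x) = x/2$ on $[0, 2]$ with a midpoint Riemann sum; the helper name and step count are assumptions for illustration.

```python
def integrate(f, a, b, n=100_000):
    """Midpoint Riemann sum of f over [a, b] with n subintervals."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

pdf = lambda x: x / 2                               # f(x) = x/2 on [0, 2]
mean = integrate(lambda x: x * pdf(x), 0, 2)        # E(X)  -> 4/3
ex2  = integrate(lambda x: x**2 * pdf(x), 0, 2)     # E(X^2) -> 2
var  = ex2 - mean**2                                # Var(X) -> 2/9
print(mean, var)
```

Agreement with the exact values $\tfrac{4}{3}$ and $\tfrac{2}{9}$ is a useful habit: a numerical check catches sign and limit errors in the hand integration.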
Median and Mode: Other Measures of Central Tendency
For continuous variables, the median is the value $m$ such that half the probability lies below it and half above, satisfying $\int_{-\infty}^{m} f(x)\,dx = \tfrac{1}{2}$. In terms of the CDF, you solve $F(m) = \tfrac{1}{2}$. Using our example CDF $F(x) = \tfrac{x^2}{4}$, set $\tfrac{m^2}{4} = \tfrac{1}{2}$, so $m^2 = 2$ and $m = \sqrt{2}$. The mode is the value at which the PDF achieves its maximum. It represents the most likely outcome in a density sense. For $f(x) = \tfrac{x}{2}$ on $[0, 2]$, the function increases linearly, so the maximum is at the right endpoint: the mode is $x = 2$. In symmetric distributions like the normal, the mean, median, and mode coincide.
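When $F(m) = \tfrac{1}{2}$ has no neat algebraic solution, bisection works because a CDF is non-decreasing. A minimal sketch, assuming the example CDF $F(x) = x^2/4$ on $[0, 2]$:

```python
def F(x):
    return x * x / 4   # CDF of f(x) = x/2 on [0, 2]

def median_by_bisection(cdf, lo, hi, tol=1e-10):
    """Solve cdf(m) = 0.5 on [lo, hi] by bisection (cdf non-decreasing)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < 0.5:
            lo = mid          # median lies in the upper half
        else:
            hi = mid          # median lies in the lower half
    return (lo + hi) / 2

m = median_by_bisection(F, 0, 2)
print(m)   # ≈ 1.41421, i.e. sqrt(2)
```

Each iteration halves the search interval, so the loop converges quickly to the exact answer $\sqrt{2}$ from the worked example.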
Common Continuous Distributions and Real-World Modeling
The uniform distribution is the simplest continuous model, where all intervals of equal length have the same probability. If $X$ is uniform on $[a, b]$, its PDF is constant: $f(x) = \tfrac{1}{b-a}$ for $a \le x \le b$. Its CDF is a straight line: $F(x) = \tfrac{x-a}{b-a}$ for $a \le x \le b$. The mean is the midpoint $\tfrac{a+b}{2}$, and the variance is $\tfrac{(b-a)^2}{12}$. Uniform distributions model scenarios like random number generation or waiting for a bus that arrives at fixed intervals.
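The uniform mean and variance formulas can be checked against a simulation using Python's standard `random.uniform`. The interval $[3, 9]$, sample size, and seed below are illustrative choices, not from the notes.

```python
import random

random.seed(42)   # fixed seed so the simulation is reproducible
a, b = 3.0, 9.0
samples = [random.uniform(a, b) for _ in range(200_000)]

sample_mean = sum(samples) / len(samples)
sample_var = sum((x - sample_mean) ** 2 for x in samples) / len(samples)

print(sample_mean, (a + b) / 2)        # sample mean vs theoretical 6.0
print(sample_var, (b - a) ** 2 / 12)   # sample variance vs theoretical 3.0
```

With 200,000 draws the sample statistics land very close to the theoretical values $\tfrac{a+b}{2} = 6$ and $\tfrac{(b-a)^2}{12} = 3$, illustrating the long-run-average interpretation of $E(X)$.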
Other essential continuous distributions include the exponential distribution for modeling waiting times or decay processes, and the normal distribution for bell-shaped data like heights or test scores. Connecting these to real-world modeling involves identifying the key characteristics of a phenomenon—such as whether it is memoryless (exponential) or symmetric (normal)—and selecting the appropriate PDF. For instance, the time between customer arrivals at a store might follow an exponential distribution, while errors in a physical measurement often follow a normal distribution. The power of continuous random variables lies in using integration over these PDFs to make precise probabilistic predictions about complex, measurable events.
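As one concrete modeling calculation: if inter-arrival times $T$ are exponential with rate $\lambda$, then $P(T \le t) = 1 - e^{-\lambda t}$. The sketch below evaluates this for an assumed rate of 2 customers per minute (the rate and time are illustrative).

```python
import math

def exponential_cdf(t, lam):
    """P(T <= t) for an exponential waiting time with rate lam, t >= 0."""
    return 1 - math.exp(-lam * t)

lam = 2.0   # assumed: 2 arrivals per minute on average
print(exponential_cdf(1.0, lam))   # P(next arrival within 1 minute) = 1 - e^{-2}
```

The answer, $1 - e^{-2} \approx 0.865$, shows how a closed-form CDF replaces the integration step once the modeling distribution has been identified.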
Common Pitfalls
- Treating PDF value as a probability: A common error is interpreting $f(x)$ as $P(X = x)$. Remember, for continuous variables, $P(X = x) = 0$. Probability is always an area under the PDF curve, so you must integrate over an interval.
- Misapplying integration limits: When computing CDFs or expected values, ensure your limits match the support of the PDF. For a PDF defined only on $[0, 2]$, the integral for $F(x)$ should run from 0 to $x$, not from $-\infty$. Similarly, $E(X)$ integrates from 0 to 2.
- Forgetting to normalize: If given a function like $kx$ on an interval, you must first find $k$ by setting $\int_0^2 kx\,dx = 1$ to obtain the valid PDF $f(x) = \tfrac{x}{2}$. Skipping this step leads to incorrect probabilities and expectations.
- Incorrectly finding the median: The median $m$ satisfies $F(m) = \tfrac{1}{2}$, not $f(m) = \tfrac{1}{2}$. Confusing the PDF and CDF here will yield the wrong value. Always use the CDF equation to solve for the median.
Summary
- A probability density function (PDF) must be non-negative and integrate to 1 over all space; it defines probabilities via area under the curve.
- The cumulative distribution function (CDF) is found by integrating the PDF and gives $F(x) = P(X \le x)$; probabilities for intervals are differences in CDF values.
- Expected value and variance are calculated through integration: $E(X) = \int x f(x)\,dx$ and $\operatorname{Var}(X) = E(X^2) - [E(X)]^2$.
- The median is the solution to $F(m) = \tfrac{1}{2}$, and the mode is the value maximizing $f(x)$.
- The uniform distribution has a constant PDF and is a foundational model; other distributions like exponential and normal extend modeling to various real-world scenarios.
- Always remember that for continuous random variables, probability is area, not height, and proper integration techniques are essential for accurate computation.