AP Statistics: Binomial Distribution
The binomial distribution is the workhorse of probability for modeling scenarios where you count successes in a fixed series of trials, from quality control in manufacturing to clinical trial outcomes. Mastering it is crucial for AP Statistics because it provides the foundation for inference about proportions and connects directly to the normal model.
What Defines a Binomial Experiment?
A binomial setting arises when you perform a fixed number of independent trials and count how many times a well-defined event, called a "success," occurs. Before using any binomial tools, you must verify four conditions, often remembered with the acronym BINS.
- B – Binary Outcomes: Each trial must result in one of only two possible outcomes. These are universally labeled "success" and "failure," regardless of their real-world connotation. A success might be a defective part (in quality control) or a patient recovering (in medicine).
- I – Independent Trials: The outcome of one trial cannot influence the outcome of any other trial. This is often ensured by random sampling from a very large population or by proper experimental design, like using a random number generator.
- N – Fixed Number of Trials: The number of trials, denoted by $n$, must be fixed in advance. You know you will flip the coin 10 times or sample 50 components.
- S – Constant Success Probability: The probability of success, denoted by $p$, must remain constant for every single trial.
If these conditions are met, the random variable $X$ = the number of successes in $n$ trials has a binomial distribution. We write $X \sim B(n, p)$. For example, if you randomly select 15 students from a large school where 30% are left-handed, the count of left-handed students in your sample, $X$, is approximately binomial with $n = 15$ and $p = 0.3$. The independence condition is approximately met because the sample size is small relative to the population.
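To see why the count is only approximately binomial, one can compare the exact without-replacement probabilities (hypergeometric) with the binomial ones. The sketch below assumes a hypothetical school of 2,000 students, 600 of whom (30%) are left-handed. The population size is not given in the text and is chosen only for illustration, and the function names are illustrative as well.

```python
from math import comb

def binom_pmf(n, p, k):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def hypergeom_pmf(pop, successes, n, k):
    """P(X = k) when n items are drawn WITHOUT replacement from a
    population of size pop that contains `successes` successes."""
    return comb(successes, k) * comb(pop - successes, n - k) / comb(pop, n)

# Hypothetical numbers (assumptions, not from the text): 2,000 students, 600 left-handed.
POPULATION = 2000
LEFT_HANDED = 600
SAMPLE_SIZE = 15

exact = [hypergeom_pmf(POPULATION, LEFT_HANDED, SAMPLE_SIZE, k)
         for k in range(SAMPLE_SIZE + 1)]
approx = [binom_pmf(SAMPLE_SIZE, 0.3, k) for k in range(SAMPLE_SIZE + 1)]

# Largest disagreement between the two models over all possible counts.
max_gap = max(abs(e - a) for e, a in zip(exact, approx))
print(f"largest difference in P(X = k): {max_gap:.5f}")
```

Under these assumptions the two models agree closely, which is what "approximately binomial" means here. Shrinking the assumed population toward the sample size makes the gap grow.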
Calculating Binomial Probabilities
The probability of getting exactly $k$ successes in $n$ trials is given by the binomial probability formula:
$$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$$
Let's break down this formula with an example. Suppose a multiple-choice quiz has 5 questions, each with 4 choices. You guess randomly on every question. What is the probability you get exactly 3 correct?
Here, a "success" is a correct answer. We have $n = 5$ trials (questions), probability of success $p = 0.25$ per question, and we want $P(X = 3)$.
- The Binomial Coefficient: $\binom{n}{k}$ (read as "n choose k") calculates the number of ways to arrange $k$ successes among $n$ trials. It's computed as $\binom{n}{k} = \frac{n!}{k!(n-k)!}$. Here, $\binom{5}{3} = \frac{5!}{3!\,2!} = 10$. There are 10 different sequences of correct/incorrect answers that yield 3 correct responses.
- $p^k$: This is the probability of the $k$ successes occurring: $(0.25)^3 = 0.015625$.
- $(1-p)^{n-k}$: This is the probability of the remaining $n - k$ failures: $(0.75)^2 = 0.5625$.
Multiplying the last two factors together gives the probability for one specific sequence. We then multiply by the number of possible sequences:
$$P(X = 3) = 10 \times 0.015625 \times 0.5625 \approx 0.0879$$
So, there's about an 8.8% chance of guessing exactly 3 answers correctly. You will use your calculator's binompdf(n, p, k) function for these exact calculations on the AP exam, but understanding the formula is essential for interpreting your results.
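The same calculation can be sketched in Python to check the arithmetic. The function name `binompdf` below is only a stand-in that mirrors the calculator command; it is an illustrative helper, not a built-in Python function.

```python
from math import comb

def binompdf(n, p, k):
    """P(X = k) for X ~ B(n, p), computed from the binomial formula."""
    ways = comb(n, k)                  # number of sequences with k successes
    one_sequence = p**k * (1 - p)**(n - k)  # probability of one such sequence
    return ways * one_sequence

# Guessing on a 5-question quiz with 4 choices per question.
p_three = binompdf(5, 0.25, 3)
print(comb(5, 3))          # 10
print(f"{p_three:.4f}")    # 0.0879
```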
Mean, Standard Deviation, and Shape
Like any distribution, a binomial distribution has a center and spread. These are derived directly from $n$ and $p$.
- Mean (Expected Value): $\mu_X = np$. This is intuitive: if you have 100 trials with a 20% success rate, you expect $100 \times 0.20 = 20$ successes on average.
- Standard Deviation: $\sigma_X = \sqrt{np(1-p)}$. This measures the typical variation in the count of successes from one set of trials to another. For our guessing example, $\mu_X = 5(0.25) = 1.25$ and $\sigma_X = \sqrt{5(0.25)(0.75)} \approx 0.968$.
The shape of a binomial distribution depends on $n$ and $p$. It is symmetric when $p = 0.5$, skewed right when $p < 0.5$, and skewed left when $p > 0.5$. As $n$ increases, the distribution becomes more symmetric and bell-shaped, which leads us to a powerful approximation.
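As a rough check, the sketch below computes the mean and standard deviation for the guessing example two ways: from the shortcut formulas and directly from the probability distribution. It also prints a crude text histogram so the right skew for $p < 0.5$ is visible. All function names are illustrative.

```python
from math import comb, sqrt

def binom_pmf(n, p, k):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_mean_sd(n, p):
    """Mean and standard deviation from the shortcut formulas."""
    return n * p, sqrt(n * p * (1 - p))

n, p = 5, 0.25  # the guessing example
mean, sd = binom_mean_sd(n, p)

# Cross-check using the definitions of expected value and variance.
pmf = [binom_pmf(n, p, k) for k in range(n + 1)]
mean_from_pmf = sum(k * prob for k, prob in enumerate(pmf))
var_from_pmf = sum((k - mean_from_pmf) ** 2 * prob for k, prob in enumerate(pmf))

print(f"mean = {mean:.3f}, sd = {sd:.3f}")

# Text histogram: one '#' per 2% of probability (rounded).
for k, prob in enumerate(pmf):
    print(f"{k}: {'#' * round(prob * 50)} {prob:.3f}")
```

Changing `p` to 0.5 or 0.75 in this sketch shows the symmetric and left-skewed cases.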
The Normal Approximation to the Binomial
For large sample sizes, calculating exact binomial probabilities for ranges (e.g., $P(X \le k)$ for a large $k$) can be tedious. Fortunately, when $n$ is sufficiently large, the binomial distribution can be approximated by a normal distribution with the same mean and standard deviation: $X \approx N\left(np, \sqrt{np(1-p)}\right)$.
The standard rule of thumb for when this approximation is appropriate is $np \ge 10$ and $n(1-p) \ge 10$. This ensures the distribution is not too skewed.
Critical Step: Continuity Correction. Because we are approximating a discrete distribution (binomial) with a continuous one (normal), we must apply a continuity correction. We adjust the discrete value by 0.5 to find the corresponding area under the normal curve.
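- For $P(X \le k)$, use the normal area to the left of $k + 0.5$.
- For $P(X \ge k)$, use the normal area to the right of $k - 0.5$.
Example: Suppose a factory produces chips where 10% are defective. In a batch of 200 chips ($n = 200$, $p = 0.10$), what is the approximate probability of finding 15 or fewer defective chips? First, check conditions: $np = 20$ and $n(1-p) = 180$, both $\ge 10$. The approximating normal distribution is $N(20, \sqrt{200(0.10)(0.90)}) \approx N(20, 4.24)$. We want $P(X \le 15)$. With continuity correction, we find $P(X \le 15.5)$ under the normal curve. Calculate the z-score: $z = \frac{15.5 - 20}{4.24} \approx -1.06$. Using the standard normal table, $P(Z \le -1.06) \approx 0.1446$. There is about a 14.5% chance of having 15 or fewer defective chips.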
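The chip example can be sketched in Python to compare the normal approximation with the exact binomial sum. This is a minimal sketch using only the standard library (`statistics.NormalDist`); the helper names are illustrative, and the large-counts check simply encodes the rule of thumb stated above.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomcdf(n, p, k):
    """Exact P(X <= k) for X ~ B(n, p), summing the binomial formula."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_approx_cdf(n, p, k):
    """Approximate P(X <= k) with a normal curve and continuity correction."""
    if n * p < 10 or n * (1 - p) < 10:
        raise ValueError("rule of thumb np >= 10 and n(1-p) >= 10 is not met")
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    # Continuity correction: area to the left of k + 0.5.
    return NormalDist(mu, sigma).cdf(k + 0.5)

approx = normal_approx_cdf(200, 0.10, 15)
exact = binomcdf(200, 0.10, 15)
print(f"normal approximation: {approx:.4f}")
print(f"exact binomial:       {exact:.4f}")
```

The approximation printed here can differ slightly in the last digits from the table-based answer because the code does not round the z-score to two decimal places.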
Common Pitfalls
- Forgetting to Check Independence: This condition is commonly violated and easy to overlook. If you sample 20 people without replacement from a small class of 25, the trials are not independent, so the binomial distribution does not apply; the exact model is the hypergeometric distribution. When sampling without replacement, treat the count as binomial only when the population is at least 10 times the sample size (the 10% condition).
- Misidentifying n and p: Ensure $p$ is the probability of success for a single, well-defined trial. For example, if 70% of voters support a candidate and you poll 100 voters, $n = 100$ and $p = 0.7$ (support is a success). Do not confuse $p$ with the probability you are solving for.
- Misusing the Normal Approximation: Applying the normal model when $np$ or $n(1-p)$ is less than 10 leads to inaccurate results. Also, omitting the continuity correction will introduce a systematic error, especially with smaller $n$ or probabilities near boundaries.
- Confusing binompdf and binomcdf: On your calculator, binompdf(n, p, k) computes $P(X = k)$ (exactly $k$). binomcdf(n, p, k) computes $P(X \le k)$ (cumulative, $k$ or fewer). Using the wrong command will yield an answer to a different question.
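The difference between the two commands can be illustrated with Python stand-ins. The functions below only mimic the calculator commands by name; they are illustrative helpers, not the calculator's own implementation.

```python
from math import comb

def binompdf(n, p, k):
    """P(X = k): exactly k successes."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binomcdf(n, p, k):
    """P(X <= k): k or fewer successes (sum of binompdf from 0 to k)."""
    return sum(binompdf(n, p, i) for i in range(k + 1))

# Guessing example: n = 5 questions, p = 0.25.
exactly_3 = binompdf(5, 0.25, 3)        # P(X = 3)
at_most_3 = binomcdf(5, 0.25, 3)        # P(X <= 3)
at_least_3 = 1 - binomcdf(5, 0.25, 2)   # P(X >= 3), via the complement

print(f"P(X = 3)  = {exactly_3:.4f}")
print(f"P(X <= 3) = {at_most_3:.4f}")
print(f"P(X >= 3) = {at_least_3:.4f}")
```

Note that "at least 3" uses the complement of "2 or fewer", a step that is easy to get wrong by one.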
Summary
- The binomial distribution models the count of successes in a fixed number of independent trials, each with constant success probability $p$.
- Always verify the BINS conditions—Binary, Independent, Fixed Number, Constant Success—before using the binomial model.
- Calculate exact probabilities using the formula $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$ or your calculator's binompdf/binomcdf functions.
- The distribution has mean $\mu_X = np$ and standard deviation $\sigma_X = \sqrt{np(1-p)}$.
- For large samples where $np \ge 10$ and $n(1-p) \ge 10$, you can approximate binomial probabilities with a normal distribution $N\left(np, \sqrt{np(1-p)}\right)$, remembering to apply a continuity correction of 0.5 when finding probabilities for a range of counts.