UK A-Level: Statistical Sampling

Collecting reliable data is the cornerstone of all statistical analysis. The methods you choose to gather this data directly determine the validity of your conclusions, making sampling one of the most critical concepts in your A-Level studies. This article breaks down the key techniques, from simple random selection to more structured approaches, and equips you with the skills to critically evaluate their use in real-world scenarios.

Understanding the Population, Census, and Sample

Before diving into methods, we must define our terms. The population is the entire group of individuals or items you are interested in studying. A census is an attempt to collect data from every member of the population. While this seems ideal, it is often impractical due to cost, time, or the sheer size of the population. Furthermore, a census can be logistically impossible if the measurement process destroys the item, like testing the lifetime of every lightbulb produced.

Consequently, we usually study a sample, which is a subset of the population. The goal is for this sample to be representative: a miniature version of the population that accurately reflects its characteristics. The process of selecting this sample is called sampling. The fundamental trade-off is clear: a census covers every member, so in principle it involves no sampling error, but it is often infeasible; a sample is practical but introduces the risk that your subset may not perfectly mirror the whole group.

Core Probability Sampling Methods

Probability sampling methods are those where every member of the population has a known, non-zero chance of being selected. This allows for the calculation of sampling error and is the gold standard for statistical inference.

Simple Random Sampling (SRS) is the most basic form. Here, every possible sample of a given size has an equal chance of being selected. In practice, you number the population from 1 to $N$ and use a random number generator to select $n$ unique numbers. For example, to select 30 students from a school of 600, you would generate random numbers between 1 and 600, ignoring any repeats, until you have 30 distinct numbers. Its strength is its simplicity and lack of inherent bias, but it needs a complete list of the population, can be cumbersome for large populations, and may, by chance, under-represent certain subgroups.
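The school example can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method: the function name simple_random_sample and the seed argument are assumptions made for this sketch, and the seed is only there so the output can be reproduced.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Return sample_size distinct ID numbers drawn from 1..population_size."""
    rng = random.Random(seed)  # seeded only so the sketch is repeatable
    # random.sample draws without replacement, so no ID can appear twice
    ids = rng.sample(range(1, population_size + 1), sample_size)
    return sorted(ids)

# 30 students from a school of 600, as in the example above
chosen = simple_random_sample(600, 30, seed=1)
print(chosen)
```

Drawing without replacement is the programmatic equivalent of ignoring repeated numbers when using a random number table or calculator.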

Systematic Sampling offers a simpler alternative. After creating a numbered list of the population, you choose a random starting point and then select every $k$-th item. The sampling interval is calculated as $k = N / n$, rounded down if necessary. If you have a population of 500 and need a sample of 25, $k = 500 / 25 = 20$. You pick a random number between 1 and 20 as your start, then select every 20th person thereafter. It is efficient, but you must ensure the list has no hidden periodic pattern that aligns with $k$, which would create a biased sample.
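A short Python sketch of the same procedure is given below, together with a quick check of the periodic-pattern hazard. The function name systematic_sample and the alternating roster are hypothetical, invented purely for illustration.

```python
import random

def systematic_sample(population_size, sample_size, seed=None):
    """Pick a random start in 1..k, then every k-th ID after it."""
    rng = random.Random(seed)
    k = population_size // sample_size   # sampling interval, rounded down
    start = rng.randint(1, k)            # randint includes both endpoints
    return [start + i * k for i in range(sample_size)]

# Population of 500, sample of 25, so the interval is 20
picked = systematic_sample(500, 25, seed=2)
print(picked)

# Hazard check on an invented list that alternates M, F, M, F, ...
roster = ["M" if i % 2 == 1 else "F" for i in range(1, 501)]
sexes = {roster[i - 1] for i in picked}
print(sexes)  # one value only: with an even interval every pick has the same parity
```

Because the interval of 20 is even, every selected position is odd or every selected position is even, so the alternating list yields a sample made up entirely of one sex. This is the kind of alignment between list order and interval that needs to be ruled out before using the method.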

Stratified Sampling is used when the population contains distinct, important subgroups, or strata (e.g., year groups, income brackets). You first divide the population into these strata. Then, you perform a simple random sample within each stratum. The proportion of the sample taken from each stratum can be proportional (matching the population proportions) or disproportional (to ensure adequate representation of a small group). With proportional allocation, the number taken from a stratum is the stratum size divided by $N$, multiplied by the sample size $n$. This method guarantees representation from all key subgroups and, when members of the same stratum tend to be similar to one another, gives more precise estimates than a simple random sample of the same size.
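Proportional allocation followed by random selection within each stratum might look like the sketch below. The year-group sizes, pupil labels, and function names are all made up for this example; they do not come from any real school or library.

```python
import random

def proportional_allocation(strata_sizes, sample_size):
    """Share sample_size between strata in proportion to their sizes."""
    total = sum(strata_sizes.values())
    # Rounding can leave the shares one or two away from sample_size in
    # awkward cases; the figures used below divide exactly.
    return {name: round(sample_size * size / total)
            for name, size in strata_sizes.items()}

def stratified_sample(strata_members, sample_size, seed=None):
    """Take a simple random sample inside every stratum."""
    rng = random.Random(seed)
    sizes = {name: len(members) for name, members in strata_members.items()}
    allocation = proportional_allocation(sizes, sample_size)
    return {name: rng.sample(members, allocation[name])
            for name, members in strata_members.items()}

# Hypothetical school of 600 pupils, split by year group
year_sizes = {"Year 7": 150, "Year 8": 120, "Year 9": 130,
              "Year 10": 110, "Year 11": 90}
school = {year: [f"{year} pupil {i}" for i in range(1, size + 1)]
          for year, size in year_sizes.items()}

allocation = proportional_allocation(year_sizes, 60)
sample = stratified_sample(school, 60, seed=3)
print(allocation)
```

With a sample of 60 from 600, each year group contributes one tenth of its pupils: 15, 12, 13, 11 and 9. In an exam you would show the same calculation by hand, for example $150 / 600 \times 60 = 15$ for Year 7.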

Non-Probability Sampling Methods

These methods do not involve random selection from the entire population, meaning you cannot reliably calculate the probability of an individual being chosen. They are often used for convenience or when a sampling frame is unavailable.

Opportunity Sampling (or convenience sampling) involves selecting individuals who are easiest to access at the time. Surveying people in a shopping centre or using the first 20 students to enter a classroom are examples. While quick and cheap, this method is highly prone to bias, as the sample will not represent people who are not at that location at that time. Your results may only reflect the views of "shoppers on a Tuesday morning," not the wider population.

Bias and Its Effects on Data

Bias is a systematic error that causes your sample data to misrepresent the population in a consistent direction. It is not removed by taking a larger sample; a large biased sample is still wrong. Understanding common biases is key to evaluating any study.

Sampling Bias: Occurs when the sampling method systematically excludes or under-represents part of the population. Opportunity sampling is inherently biased.
Non-Response Bias: Arises when individuals chosen for the sample do not respond, and these non-respondents differ in a meaningful way from respondents.
Questionnaire Bias: Caused by poorly worded, leading, or ambiguous questions that influence the answers given.

Bias distorts results and undermines validity—the extent to which your findings reflect the true situation in the population. An unbiased, representative sample is essential for valid generalisation, where conclusions from the sample can be extended to the population.

Evaluating Sampling Method Appropriateness

Your A-Level exam will frequently present a scenario and ask you to justify or criticise a chosen sampling method. Your evaluation should follow a structured approach.

Identify the Population and Goal: What is being studied, and what is the research question?
Assess Practicality: Is a census possible? Is a full sampling frame (a list of all population members) available?
Judge Representativeness: Does the method ensure all relevant subgroups are included? For instance, if studying school-wide opinion, stratified sampling by year group is more appropriate than an opportunity sample from a single GCSE class.
Consider Bias: Does the method introduce known biases (e.g., location bias in opportunity sampling, pattern bias in systematic sampling)?
Weigh Cost vs. Accuracy: Simpler methods like opportunity sampling are cheap but prone to bias. More robust methods like stratified sampling are more complex and need a sampling frame, but yield more reliable, generalisable results.

Example Scenario: "A researcher wants to estimate the average weekly spend on leisure activities by residents in a large town. They stand outside a cinema on a Friday evening and ask people to complete a questionnaire."

Critique: This is opportunity sampling. It is biased because it systematically excludes people who do not go to the cinema, people who are at home or elsewhere on Friday night, and those who are unwilling to stop. The sample will likely over-represent cinema-goers and their spending habits, limiting generalisation to the whole town.
Better Method: Use stratified random sampling of residential addresses (strata could be electoral wards or postcode areas) to ensure a geographic and likely socio-economic spread, or systematic sampling from the electoral register.

Common Pitfalls

Confusing 'Random' with 'Haphazard': Saying you "randomly asked people in the street" typically describes opportunity sampling, not simple random sampling. True random selection requires a defined list and a random number mechanism.
Overlooking Hidden Bias in Systematic Sampling: Failing to check that the list order (e.g., every 10th house on a street where houses are paired, or data listed as male/female) does not create a periodic pattern that aligns with the sampling interval.
Misunderstanding Stratification: Thinking that stratified sampling just means "dividing into groups." The crucial step is performing random sampling within each group. Simply taking convenient groups (like one whole class from each year) is not stratified sampling; it is closer to cluster sampling, or to opportunity sampling if the classes are chosen for convenience, and is less robust.
Believing Larger Samples Fix Bias: A larger sample size reduces sampling error (the natural variation between samples) but does nothing to correct a fundamentally biased sampling method. A survey of 10,000 people outside a luxury car showroom still won't represent national car ownership.
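The last pitfall can be illustrated with a small simulation based on the cinema scenario. Every number here is invented for the sketch: the town of 10,000 residents, the split into cinema-goers and others, and the spending ranges are assumptions, not real data.

```python
import random
from statistics import mean

rng = random.Random(4)  # seeded only so the sketch is repeatable

# Invented town: 2,000 regular cinema-goers and 8,000 other residents,
# with weekly leisure spend (in pounds) drawn from assumed ranges
cinema_goers = [rng.uniform(40, 60) for _ in range(2000)]
others = [rng.uniform(10, 30) for _ in range(8000)]
town = cinema_goers + others
town_mean = mean(town)

def biased_estimate(sample_size):
    """Opportunity sample outside the cinema: only cinema-goers are reachable."""
    return mean(rng.sample(cinema_goers, sample_size))

def random_estimate(sample_size):
    """Simple random sample drawn from the whole town."""
    return mean(rng.sample(town, sample_size))

small_biased = biased_estimate(50)
large_biased = biased_estimate(1500)
small_random = random_estimate(50)

print(f"Town mean:              {town_mean:.2f}")
print(f"Biased sample, n=50:    {small_biased:.2f}")
print(f"Biased sample, n=1500:  {large_biased:.2f}")
print(f"Random sample, n=50:    {small_random:.2f}")
```

Under these assumptions the town mean cannot exceed 36, while every biased estimate is an average of values between 40 and 60. Increasing the biased sample from 50 to 1,500 only makes the estimate settle more tightly around the wrong figure, whereas the small random sample is at least aimed at the right target.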

Summary

A sample is a practical alternative to a census, with the paramount goal of being representative of the population.
Probability methods like Simple Random, Systematic, and Stratified Sampling allow for statistical inference. Stratified sampling is particularly powerful for ensuring representation of key subgroups.
Non-probability methods like Opportunity Sampling are prone to bias—a systematic error that cannot be fixed by increasing sample size and compromises the validity and generalisation of findings.
Always evaluate a sampling method by considering the population, practicality, risk of bias, and the trade-off between cost and accuracy.
In exams, justify your choice by directly linking the features of the method to the specific context of the scenario provided.
