SAMPLING - Theory & Formulas
Cambridge AS & A Level Mathematics
Part 1: Introduction to Sampling
Key Definitions
Population: Complete set of ALL items of interest
Sample: Part of the population (size = n)
Representative Sample: Accurately reflects population characteristics
Biased Sample: Does NOT properly represent population
Random Sample: ALL possible samples of size n have equal probability of selection
Why Use Samples?
| Reason | Example |
|---|---|
| Cost-Effective | Test 50 products vs 10,000 |
| ⏰ Time-Saving | Survey 100 people vs millions |
| Destructive Testing | Crash testing helmets |
| Impossible to Survey All | All fish in the ocean |
Random Sampling Methods
Using Random Number Tables:
- Number population: 000 to 499 (for 500 items)
- Pick starting point in table
- Read digits matching your numbering
- Ignore numbers outside range
- Ignore repeats
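The table procedure above can be sketched in Python. This is a minimal illustration, not a prescribed method: the function name `sample_from_digit_stream` and the row of digits are made up for the example, and the digits stand in for a row read from a printed random number table.

```python
import random


def sample_from_digit_stream(digits, population_size, sample_size):
    """Pick distinct item labels by reading digits in fixed-width blocks."""
    # 500 items are labelled 000..499, so each block is 3 digits wide.
    width = len(str(population_size - 1))
    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        label = int(digits[start:start + width])
        if label >= population_size:  # ignore numbers outside the range
            continue
        if label in chosen:           # ignore repeats
            continue
        chosen.append(label)
        if len(chosen) == sample_size:
            break
    return chosen


# A made-up row of "table" digits, for illustration only.
row = "049731882049120555499500007"
picked = sample_from_digit_stream(row, population_size=500, sample_size=4)
print(picked)

# The standard library can draw a sample without repeats directly.
print(random.sample(range(500), k=4))
```

Reading the row in blocks of three gives 049, 731, 882, 049, 120, and so on; 731 and 882 are out of range and the second 049 is a repeat, so both kinds of value are skipped.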
Using Excel:
=RAND() → Random number between 0 and 1
=INT(250*RAND())+1 → Random integer 1 to 250
⚠️ Types of Bias
| Type | Example |
|---|---|
| Location Bias | Survey only at gym about exercise |
| Time Bias | Survey Monday afternoon only |
| Leading Questions | "Don't you agree that...?" |
| Small Sample | Ask only 10 people |
Part 2: Distribution of Sample Means
Sample Mean (X̄)
Definition: Average of all observations in sample
X̄ = (x₁ + x₂ + ... + xₙ) / n
⚡ Different samples → Different sample means!
FUNDAMENTAL FORMULAS
1. Expected Value of Sample Mean
E(X̄) = μ
On average, the sample mean equals the population mean (X̄ is an unbiased estimator of μ)!
2. Variance of Sample Mean
Var(X̄) = σ² / n
Variance decreases as sample size increases!
3. Standard Deviation of Sample Mean
SD(X̄) = σ / √n
Also called: Standard Error (SE)
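A quick simulation can make these three formulas concrete. The sketch below assumes a fair six-sided die as the population (so μ = 3.5 and σ² = 35/12); the die, the sample size of 10, and the 20,000 repetitions are illustrative choices, not part of the notes.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

n = 10            # sample size
repeats = 20_000  # number of samples drawn

# Assumed population: a fair six-sided die.
mu = 3.5
sigma2 = 35 / 12

# Draw many samples of size n and record each sample mean.
sample_means = []
for _ in range(repeats):
    sample = [random.randint(1, 6) for _ in range(n)]
    sample_means.append(statistics.fmean(sample))

mean_of_means = statistics.fmean(sample_means)
var_of_means = statistics.pvariance(sample_means)

print("mean of sample means:", mean_of_means, "theory:", mu)
print("variance of sample means:", var_of_means, "theory:", sigma2 / n)
print("standard error (theory):", (sigma2 / n) ** 0.5)
```

The simulated values should land close to μ and σ²/n, with small differences from run-to-run randomness.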
⭐ THE CENTRAL LIMIT THEOREM (CLT)
One of the most important theorems in statistics!
X̄ ~ N(μ, σ²/n) approximately
When n is large (usually n ≥ 30)
What it means:
- Sample means follow an approximately NORMAL distribution
- Mean = μ (population mean)
- Variance = σ²/n
- Works EVEN IF the original population is NOT normal (provided its variance is finite)!
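The claim that the CLT works for a non-normal population can be checked by simulation. This sketch assumes a strongly skewed population, an exponential distribution with rate 1 (so μ = 1 and σ = 1), and counts how often X̄ falls within 1.96 standard errors of μ. If X̄ is approximately normal, that fraction should be near 0.95. The population, n = 40, and the repetition count are illustrative assumptions.

```python
import math
import random

random.seed(2)  # fixed seed so the run is repeatable

n = 40            # sample size (above the usual n >= 30 guideline)
repeats = 20_000  # number of samples drawn

# Assumed skewed population: exponential with rate 1.
mu, sigma = 1.0, 1.0
se = sigma / math.sqrt(n)

inside = 0
for _ in range(repeats):
    xbar = sum(random.expovariate(1.0) for _ in range(n)) / n
    if abs(xbar - mu) < 1.96 * se:
        inside += 1

coverage = inside / repeats
print("fraction of sample means within 1.96 SE of mu:", coverage)
```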
How Large Should n Be?
| Original Population | Minimum n (rule of thumb) |
|---|---|
| Normal Distribution | Any n (X̄ is exactly normal) |
| Approximately Symmetric | n ≥ 20 |
| Skewed Distribution | n ≥ 30 |
| Any Distribution (Safe) | n ≥ 50 |
Complete Formula Summary
| Concept | Formula |
|---|---|
| Population Mean | μ = E(X) |
| Population Variance | σ² = Var(X) |
| Sample Mean | X̄ = Σxᵢ / n |
| Expected Value | E(X̄) = μ |
| Variance | Var(X̄) = σ²/n |
| Standard Error | SE = σ/√n |
| Distribution (CLT) | X̄ ~ N(μ, σ²/n) |
| Z-Score | Z = (X̄ - μ)/(σ/√n) |
Working with Sample Totals
If T = sample total of n observations:
T = n × X̄
E(T) = nμ
Var(T) = nσ²
T ~ N(nμ, nσ²) when n is large
⚡ Continuity Correction
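For DISCRETE distributions (Binomial, Poisson) approximated by a normal distribution:
Continuity Correction for X̄ = ± 1/(2n)
NOT ± 1/2 (the total T moves in steps of 1, so X̄ = T/n moves in steps of 1/n)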
| Probability | Correction |
|---|---|
| P(X̄ < a) | P(X̄ < a - 1/(2n)) |
| P(X̄ ≤ a) | P(X̄ < a + 1/(2n)) |
| P(X̄ > a) | P(X̄ > a + 1/(2n)) |
| P(X̄ ≥ a) | P(X̄ > a - 1/(2n)) |
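The table above can be written as a small lookup. This is a sketch only: the helper name `corrected_bound` is hypothetical, and it assumes the underlying variable takes integer values, so that the sample mean moves in steps of 1/n.

```python
def corrected_bound(a, n, relation):
    """Return the continuity-corrected boundary for a statement about X-bar.

    relation is one of "<", "<=", ">", ">=" and describes P(X-bar relation a).
    """
    half_step = 1 / (2 * n)
    shifts = {
        "<": -half_step,   # P(X-bar < a)  -> P(X-bar < a - 1/(2n))
        "<=": +half_step,  # P(X-bar <= a) -> P(X-bar < a + 1/(2n))
        ">": +half_step,   # P(X-bar > a)  -> P(X-bar > a + 1/(2n))
        ">=": -half_step,  # P(X-bar >= a) -> P(X-bar > a - 1/(2n))
    }
    return a + shifts[relation]


print(corrected_bound(16, 50, "<="))
print(corrected_bound(16, 50, ">="))
```

With n = 50 the half step is 1/100, so a boundary of 16 moves to roughly 16.01 or 15.99 depending on the direction of the inequality.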
Problem Solving Steps
Step 1: Identify μ, σ² (or σ), n
Step 2: Check if CLT applies (n ≥ 30 or population normal)
Step 3: Write distribution: X̄ ~ N(μ, σ²/n)
Step 4: Calculate SE: σ/√n
Step 5: If the population is discrete, apply the continuity correction ±1/(2n) to the boundary value
Step 6: Find Z-score: Z = (X̄ - μ)/(σ/√n)
Step 7: Use normal tables to find probability
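For a continuous population, the steps can be sketched as one short function. The helper name `prob_sample_mean_below` and the numbers in the usage line (μ = 100, σ = 15, n = 36, boundary 104) are illustrative assumptions, not values from these notes. Python's `statistics.NormalDist` plays the role of the normal tables.

```python
import math
from statistics import NormalDist


def prob_sample_mean_below(a, mu, sigma, n):
    """Approximate P(X-bar < a) using X-bar ~ N(mu, sigma^2 / n)."""
    se = sigma / math.sqrt(n)      # standard error of the sample mean
    z = (a - mu) / se              # standardise the boundary value
    return NormalDist().cdf(z)     # look up the standard normal probability


# Illustrative values: SE = 15/6 = 2.5 and Z = (104 - 100)/2.5 = 1.6.
p = prob_sample_mean_below(a=104, mu=100, sigma=15, n=36)
print(round(p, 4))
```

For an upper-tail probability such as P(X̄ > a), subtract the result from 1.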
Example 1: Pears in Bags
Problem: Pears: μ = 45 g, σ² = 52 g², n = 6. Find P(Total > 300 g)
Solution: Total > 300 g means X̄ > 50 g (300/6). Since n = 6 is small, this assumes pear masses are normally distributed.
X̄ ~ N(45, 52/6) = N(45, 8.67)
SE = √8.67 = 2.94
Z = (50 - 45)/2.94 = 1.70
P(Z > 1.70) = 1 - 0.9554 = 0.0446
Answer: 4.46%
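The pears calculation can be checked in two equivalent ways: through the sample mean X̄ ~ N(μ, σ²/n) or through the total T ~ N(nμ, nσ²). The sketch below uses `statistics.NormalDist` and assumes normally distributed pear masses. The result may differ slightly in the last digit from the hand calculation, which rounds Z to 1.70.

```python
import math
from statistics import NormalDist

mu, sigma2, n = 45, 52, 6  # values from the problem statement

# Route 1: distribution of the sample mean, X-bar ~ N(45, 52/6).
mean_dist = NormalDist(mu, math.sqrt(sigma2 / n))
p_via_mean = 1 - mean_dist.cdf(300 / n)

# Route 2: distribution of the total, T ~ N(6*45, 6*52) = N(270, 312).
total_dist = NormalDist(n * mu, math.sqrt(n * sigma2))
p_via_total = 1 - total_dist.cdf(300)

print(round(p_via_mean, 4), round(p_via_total, 4))
```

Both routes give the same probability because T is just n times X̄.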
Example 2: Water for Exercise
Problem: μ = 500 ml, σ = 50 ml, n = 25. 13 L is available. Find the probability that this is enough.
Solution: Need X̄ < 520 ml (13000/25)
X̄ ~ N(500, 100) (σ²/n = 2500/25)
SE = 10
Z = (520 - 500)/10 = 2.0
P(Z < 2.0) = 0.9772
Answer: 97.72% probability that 13 L is enough
Example 3: Binomial (with Continuity Correction)
Problem: X ~ B(60, 0.25), sample of n = 50 observations. Find P(X̄ ≤ 16)
Solution:
μ = 60 × 0.25 = 15
σ² = 60 × 0.25 × 0.75 = 11.25
X̄ ~ N(15, 11.25/50) = N(15, 0.225)
Correction: +1/(2×50) = +0.01
P(X̄ ≤ 16) = P(X̄ < 16.01)
Z = (16.01 - 15)/√0.225 = 2.13
P(Z < 2.13) = 0.983
Answer: 98.3%
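As a sanity check on this example, the normal approximation can be compared with an exact calculation. The sum of 50 independent B(60, 0.25) values is B(3000, 0.25), so P(X̄ ≤ 16) equals P(T ≤ 800). The sketch below computes that exactly with integer arithmetic (using p = 1/4, so each term is C(3000, k) × 3^(3000−k) over 4^3000) and sets it beside the corrected normal approximation. Variable names are illustrative.

```python
import math
from statistics import NormalDist

trials, n = 60, 50           # X ~ B(60, 0.25), sample of 50 observations
total_trials = trials * n    # the total T ~ B(3000, 0.25)
threshold = 16 * n           # X-bar <= 16 is the same event as T <= 800

# Exact probability with integers, valid because p = 1/4 and q = 3/4.
numerator = sum(
    math.comb(total_trials, k) * 3 ** (total_trials - k)
    for k in range(threshold + 1)
)
exact = numerator / 4 ** total_trials

# Normal approximation with the continuity correction +1/(2n).
mu = trials * 0.25
var_of_mean = trials * 0.25 * 0.75 / n
approx = NormalDist(mu, math.sqrt(var_of_mean)).cdf(16 + 1 / (2 * n))

print("exact:", round(exact, 4))
print("normal approximation:", round(approx, 4))
```

The two values should agree to roughly two or three decimal places, which is the level of accuracy the approximation is meant to give here.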
Quick Reference Card
| If you know... | You can find... |
|---|---|
| μ, σ², n | E(X̄) = μ, Var(X̄) = σ²/n |
| Population normal | X̄ is normal for ANY n |
| n ≥ 30 | X̄ ~ N(μ, σ²/n) by CLT |
| Discrete distribution | Use continuity correction ±1/(2n) |
| Sample total T | T ~ N(nμ, nσ²) |
✅ Key Takeaways
- Random sampling: Every possible sample of size n has an equal chance of being selected
- E(X̄) = μ: Sample mean targets population mean
- Var(X̄) = σ²/n: Bigger sample = smaller variance
- CLT: X̄ is approximately normal when n is large
- Works for ANY population distribution with finite variance!
- Continuity correction: ±1/(2n) for discrete
