2025

Introduction to Confidence Intervals

  • Confidence intervals (CIs) provide a range of values, computed from sample data, that is constructed to capture the population parameter of interest at a stated confidence level.

  • They estimate the precision of a sample statistic (e.g., the sample mean) as an estimator of a population parameter (e.g., the population mean).

  • If we were to take many samples and build a confidence interval from each sample, a certain percentage (e.g., 95%) of these intervals would contain the true population parameter.
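The repeated-sampling idea above can be checked with a small simulation. This is a minimal sketch, not a definitive implementation: the population values (`true_mean`, `true_sd`), the sample size, and the number of simulated samples are all illustrative assumptions, and the population is assumed to be normal.

```r
set.seed(42)  # fixed seed so the sketch is reproducible

true_mean <- 50    # assumed population mean (illustrative)
true_sd   <- 10    # assumed population standard deviation (illustrative)
n         <- 40    # observations per sample (illustrative)
n_sims    <- 5000  # number of repeated samples (illustrative)

z <- qnorm(0.975)  # critical value for a two-sided 95% interval

# For each simulated sample, build an interval and record
# whether it contains the true mean
covered <- replicate(n_sims, {
  x     <- rnorm(n, mean = true_mean, sd = true_sd)
  se    <- sd(x) / sqrt(n)
  lower <- mean(x) - z * se
  upper <- mean(x) + z * se
  lower <= true_mean && true_mean <= upper
})

# Proportion of intervals that contain the true mean
coverage <- mean(covered)
coverage
```

The proportion should land near 0.95. It can sit slightly below that, because the sketch pairs a z critical value with a standard deviation estimated from each sample.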

An easy way to think of a confidence interval (two-tailed)

  • A confidence interval provides a range of plausible values for a population parameter (like the true mean), based on the sampling distribution of a sample statistic (e.g., the sample mean).

  • For a two-tailed test, the matching CI is centered on the observed value of the sample statistic (the point estimate), and its width depends on how much variability we estimate in that statistic based on our sample.

  • In the context of a two-tailed test, the critical values, scaled by the standard error of the statistic, determine how far above and below the observed statistic we go to build the interval. They define the margin of error, not fixed endpoints.
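To make the role of the critical values concrete, here is a minimal sketch that computes normal-based critical values and the resulting intervals for three confidence levels. The sample summaries (`x_bar`, `s`, `n`) are illustrative assumptions, and the normal approximation is assumed to be adequate.

```r
x_bar <- 100          # illustrative sample mean
s     <- 15           # illustrative sample standard deviation
n     <- 30           # illustrative sample size
se    <- s / sqrt(n)  # estimated standard error of the mean

conf_levels <- c(0.90, 0.95, 0.99)
alpha       <- 1 - conf_levels

# Two-tailed: alpha is split equally between the two tails
z_crit <- qnorm(1 - alpha / 2)
margin <- z_crit * se  # margin of error at each confidence level

data.frame(
  conf_level = conf_levels,
  z_crit     = round(z_crit, 3),
  lower      = round(x_bar - margin, 2),
  upper      = round(x_bar + margin, 2)
)
```

Every interval is centered on `x_bar`; only the margin of error changes, growing as the confidence level increases.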

Visualising a confidence interval

Confidence intervals are random variables

An easy way to think of a confidence interval (one-tailed)

  • In a one-tailed test, the appropriate confidence interval is a one-sided interval that reflects the direction of the alternative hypothesis.

  • The interval has a single finite bound, placed one margin of error below or above the observed statistic, and is unbounded on the other side; which side is bounded depends on whether we’re testing for an increase or a decrease.

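  • The entire alpha level (e.g., 5%) is placed in one tail of the sampling distribution, so the critical value is smaller than in the two-tailed case (about 1.645 instead of 1.96 for a normal distribution) and the single bound sits closer to the observed statistic.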
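A minimal sketch of the one-sided construction, set next to the two-sided interval for comparison. The sample summaries are illustrative assumptions, the critical values come from the normal distribution, and the variable names are hypothetical.

```r
x_bar <- 100          # illustrative sample mean
s     <- 15           # illustrative sample standard deviation
n     <- 30           # illustrative sample size
alpha <- 0.05
se    <- s / sqrt(n)  # estimated standard error of the mean

z_one <- qnorm(1 - alpha)      # all of alpha in one tail
z_two <- qnorm(1 - alpha / 2)  # alpha split across two tails

# Testing for an increase: finite lower bound, no upper bound
ci_greater <- c(x_bar - z_one * se, Inf)

# Testing for a decrease: finite upper bound, no lower bound
ci_less <- c(-Inf, x_bar + z_one * se)

# Two-sided interval for comparison
ci_two_sided <- c(x_bar - z_two * se, x_bar + z_two * se)

ci_greater
ci_less
ci_two_sided
```

Because `z_one` is smaller than `z_two`, the finite bound of each one-sided interval lies closer to `x_bar` than the corresponding two-sided bound.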

Comparing One-Sided vs Two-Sided Confidence Intervals

CI interpretation

  • A 95% confidence interval means that if we repeated the sampling process many times, about 95% of the intervals we construct would contain the true population parameter.

  • The confidence level (e.g., 95%) refers to the method, not any one specific interval.

  • Once you’ve calculated a confidence interval from your sample, the true parameter is either inside it or not; that specific interval no longer carries a probability.

  • It’s a common mistake to say, “there’s a 95% chance that the true value is in this interval.”

  • In frequentist statistics, parameters are fixed — only the data (and intervals) vary across repeated samples.

How confidence intervals are formally calculated

  • \[\bar{x} \pm z \times \frac{s}{\sqrt{n}}\]
  • \(\bar{x}\) is the sample mean
  • \(z\) is the critical value from the standard normal distribution, \(\mathscr{N}(0, 1)\), corresponding to the desired confidence level (e.g., \(z \approx 1.96\) for a two-sided 95% interval)
  • \(s\) is the sample standard deviation
  • \(n\) is the sample size

Example: Calculating a 95% CI for a Mean

  • Let’s calculate a 95% confidence interval for the mean of a sample with the following characteristics:

  • Sample mean (\(\bar{x}\)): 100

  • Sample standard deviation (\(s\)): 15

  • Sample size (\(n\)): 30

Example: Calculating a 95% CI for a Mean

x_bar <- 100  # sample mean
s <- 15  # sample standard deviation
n <- 30  # sample size
z <- qnorm(0.975)  # z-score for 95% confidence

# Calculate the margin of error
margin_of_error <- z * (s / sqrt(n))

# Calculate the confidence interval
lower_bound <- x_bar - margin_of_error
upper_bound <- x_bar + margin_of_error

c(lower_bound, upper_bound)
## [1]  94.63242 105.36758
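
The calculation above uses a z critical value together with a standard deviation estimated from the sample. With a sample of this size, a common alternative is to take the critical value from a t distribution with \(n - 1\) degrees of freedom. The sketch below shows that variant for comparison only; it is an assumption about what one might do next, not part of the original example, and the names `t_crit` and `margin_t` are hypothetical.

```r
x_bar <- 100  # sample mean
s     <- 15   # sample standard deviation
n     <- 30   # sample size

# t critical value for 95% confidence with n - 1 degrees of freedom
t_crit <- qt(0.975, df = n - 1)

# Margin of error based on the t critical value
margin_t <- t_crit * (s / sqrt(n))

c(x_bar - margin_t, x_bar + margin_t)
```

The t critical value is larger than the z critical value, so this interval is somewhat wider than the z-based one.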