2024

NHST Summary Recipe

  1. Specify the null and alternative hypotheses (\(H_0\) and \(H_1\)) in terms of a population parameter \(\theta\).

  2. Specify the type I error rate – denoted by the symbol \(\alpha\) – you are willing to tolerate.

  3. Specify the sample statistic \(\widehat{\theta}\) that you will use to estimate the population parameter \(\theta\) in step 1 and state how it is distributed under the assumption that \(H_0\) is true.

  4. Obtain a random sample and use it to compute the sample statistic from step 3. Call this value \(\widehat{\theta}_{\text{obs}}\).

  5. If \(\widehat{\theta}_{\text{obs}}\) or a more extreme outcome is very unlikely to occur under the assumption that \(H_0\) is true, then reject \(H_0\). Otherwise, do not reject \(H_0\).
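The five steps above can be sketched in code for the Normal test case. This is a minimal illustration, not a definitive implementation: the function name `z_test_p_value`, the two-sided alternative, and all numeric values are assumptions made for the example, and the population standard deviation is assumed to be known.

```python
from statistics import NormalDist

def z_test_p_value(sample_mean, mu_0, sigma, n):
    """Two-sided p-value for H0: mu = mu_0, assuming sigma is known."""
    # Step 3: under H0 the sample mean is Normal with mean mu_0
    # and standard deviation sigma / sqrt(n).
    standard_error = sigma / n ** 0.5
    z_obs = (sample_mean - mu_0) / standard_error
    # Step 5: probability of an outcome at least as extreme as z_obs under H0.
    p_value = 2 * (1 - NormalDist().cdf(abs(z_obs)))
    return z_obs, p_value

# Steps 1 and 2: hypothetical H0: mu = 50 and a tolerated type I error rate.
alpha = 0.05
# Step 4: a hypothetical observed sample mean of 54 from n = 25 observations.
z_obs, p_value = z_test_p_value(sample_mean=54, mu_0=50, sigma=10, n=25)
reject_h0 = p_value < alpha
print(f"z_obs = {z_obs:.2f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```

With these illustrative numbers the observed mean sits two standard errors above the hypothesised mean, so the p-value falls just under 0.05.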

What are \(\theta\) and \(\widehat{\theta}\)?

  • \(\theta\) is a population parameter that you are interested in estimating. E.g., in the case of a Normal test, \(\theta\) is the population mean \(\mu\).

  • \(\widehat{\theta}\) is a sample statistic that you use to estimate \(\theta\). E.g., in the case of a Normal test, \(\widehat{\theta}\) is the sample mean \(\overline{x}\).

  • The reason we bother to write the steps in terms of \(\theta\) and \(\widehat{\theta}\) is that the steps are general and can be applied to any hypothesis test.

How did we compute p-values before computers?

  • Statistical tables were the primary tool for finding p-values before computers.

  • Tables provided pre-calculated critical values and p-values for various distributions at common significance levels.

  • The relevant test statistic was calculated by hand from the sample data and then looked up in the appropriate table to find the corresponding p-value.

How many tables were there?

  • Consider just the Normal distribution defined by a mean \(\mu_X\) and standard deviation \(\sigma_X\).

  • There are infinitely many possible values for \(\mu_X\) and \(\sigma_X\), so tabulating every normal distribution would require infinitely many tables.

  • In practice only the standard normal distribution was tabulated.
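As a rough illustration of what such a table contains, the sketch below rebuilds a few entries of a standard normal table using Python's standard library. The chosen \(z\) values and the four-decimal rounding are assumptions made for the example; printed tables varied in layout and precision.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# A handful of z values, as might appear down the side of a printed table.
z_values = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]

# Each entry is the cumulative probability P(Z <= z).
table = {z: round(standard_normal.cdf(z), 4) for z in z_values}

for z, cumulative in table.items():
    print(f"z = {z:.1f}   P(Z <= z) = {cumulative:.4f}")
```

An upper-tail p-value is then one minus the tabulated entry.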

The Standard Normal (i.e., \(Z\)) Distribution

  • The standard normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1.

  • Any normal distribution can be transformed into a standard normal distribution by subtracting the mean and dividing by the standard deviation.

The \(Z\) transformation

\[ \begin{align} X &\sim \mathcal{N}(\mu_X, \sigma_X) \\ Z &= \frac{X - \mu_X}{\sigma_X} \\ Z &\sim \mathcal{N}(0, 1) \end{align} \]

  • Here both \(X\) and \(Z\) are random variables.
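The transformation can be checked by simulation. The sketch below draws values from a normal distribution, applies the \(Z\) transformation, and confirms that the result has a mean near 0 and a standard deviation near 1. The parameter values, sample count, and seed are arbitrary choices made for the example.

```python
from statistics import NormalDist, fmean, pstdev

# Illustrative parameters for X (assumptions, not prescribed by the text).
mu_x, sigma_x = 100, 15

# Draw many realisations of X, then apply Z = (X - mu_X) / sigma_X to each.
x_samples = NormalDist(mu_x, sigma_x).samples(100_000, seed=1)
z_samples = [(x - mu_x) / sigma_x for x in x_samples]

z_mean = fmean(z_samples)
z_sd = pstdev(z_samples)
print(f"mean of Z = {z_mean:.3f}, standard deviation of Z = {z_sd:.3f}")
```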

The \(z\)-score

\[ \begin{align} z_{obs} &= \frac{x_{obs} - \mu_X}{\sigma_X} \\ z_{obs} &\sim \mathcal{N}(0, 1) \end{align} \]

  • Here, \(z_{obs}\) is a single value that represents the number of standard deviations \(x_{obs}\) – also a single value – is from the mean \(\mu_X\).
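A minimal sketch of the same calculation for one observation follows. The helper name `z_score` and the numeric values are hypothetical; the cross-check uses `NormalDist.zscore` from the standard library, which assumes Python 3.9 or later.

```python
from statistics import NormalDist

def z_score(x_obs, mu, sigma):
    """Number of standard deviations that x_obs lies from mu."""
    return (x_obs - mu) / sigma

# Illustrative values (assumptions made for this example).
mu_x, sigma_x, x_obs = 100, 15, 130

z_obs = z_score(x_obs, mu_x, sigma_x)

# Cross-check against the standard library's own z-score method.
z_check = NormalDist(mu_x, sigma_x).zscore(x_obs)
print(z_obs, z_check)
```

Here an observation of 130 lies two standard deviations above the mean of 100.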

A quick note on nomenclature

  • We use capital letters to denote random variables and lower case letters to denote specific observations from those random variables.

  • We may also use subscripts (e.g., \(z_{obs}\)) to further clarify whether we are referencing a random variable or a specific observation.

  • Even if we do not follow this convention perfectly, it will usually be clear from context which is which.

Question

Given a normal distribution with a mean (\(\mu_X\)) of 100 and a standard deviation (\(\sigma_X\)) of 15, you calculate a sample mean (\(\overline{x}\)) of 108 from a random sample of size \(n=10\). What is the \(z\)-score for this sample mean?

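Click here for the answer \[ \begin{align} z_{obs} &= \frac{\overline{x} - \mu_{\overline{X}}}{\sigma_{\overline{X}}} \\ z_{obs} &= \frac{\overline{x} - \mu_{X}}{\sigma_{X} / \sqrt{n}} \\ z_{obs} &= \frac{108 - 100}{15/\sqrt{10}}\\ z_{obs} &\approx 1.69 \end{align} \]

  • The sample mean \(\overline{X}\) has mean \(\mu_{\overline{X}} = \mu_X\) and standard deviation \(\sigma_{\overline{X}} = \sigma_X / \sqrt{n}\) (the standard error).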

Question

What is true about the \(Z\)-transformed distributions of \(X\) and \(\overline{X}\)?

  • The \(Z\)-transformed distribution of \(X\) is normal with a mean of 0 and a standard deviation of 1 but the \(Z\)-transformed distribution of \(\overline{X}\) is not.

  • The \(Z\)-transformed distribution of \(\overline{X}\) is normal with a mean of 0 and a standard deviation of 1 but the \(Z\)-transformed distribution of \(X\) is not.

  • They are both normal with a mean of 0 and a standard deviation of 1.

  • Neither is normal with a mean of 0 and a standard deviation of 1.

Click here for the answer
  • They are both normal with a mean of 0 and a standard deviation of 1.
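The claim for \(\overline{X}\) can be checked by simulation. The sketch below repeatedly draws samples of size \(n = 10\) from a normal distribution with mean 100 and standard deviation 15, standardises each sample mean using \(\mu_X\) and \(\sigma_X / \sqrt{n}\), and inspects the result. The number of repetitions and the seed are arbitrary assumptions made for the example.

```python
import random
from statistics import fmean, pstdev

rng = random.Random(0)

mu_x, sigma_x, n = 100, 15, 10
# Standard deviation of the sample mean (the standard error).
standard_error = sigma_x / n ** 0.5

def draw_sample_mean():
    """Mean of one random sample of size n from N(mu_x, sigma_x)."""
    return fmean([rng.gauss(mu_x, sigma_x) for _ in range(n)])

# Z-transform each sample mean using the mean and standard deviation
# of the sampling distribution of the mean.
z_of_means = [
    (draw_sample_mean() - mu_x) / standard_error
    for _ in range(20_000)
]

z_mean = fmean(z_of_means)
z_sd = pstdev(z_of_means)
print(f"mean = {z_mean:.3f}, standard deviation = {z_sd:.3f}")
```

The standardised sample means should have a mean close to 0 and a standard deviation close to 1, matching the standard normal distribution.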