The z-Test

2024

NHST Summary Recipe

Specify the null and alternative hypotheses (\(H_0\) and \(H_1\)) in terms of a population parameter \(\theta\).
Specify the type I error rate – denoted by the symbol \(\alpha\) – you are willing to tolerate.
Specify the sample statistic \(\widehat{\theta}\) that you will use to estimate the population parameter \(\theta\) in step 1 and state how it is distributed under the assumption that \(H_0\) is true.
Obtain a random sample and use it to compute the sample statistic from step 3. Call this value \(\widehat{\theta}_{\text{obs}}\).
If \(\widehat{\theta}_{\text{obs}}\) or a more extreme outcome is very unlikely to occur under the assumption that \(H_0\) is true, then reject \(H_0\). Otherwise, do not reject \(H_0\).
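As a rough illustration, the five steps can be sketched in Python for the case where \(\theta\) is a population mean and the population standard deviation is known. The values of `mu_0`, `sigma`, `n`, and `alpha`, as well as the simulated sample, are hypothetical and chosen only for illustration.

```python
import math
import random
from statistics import NormalDist, fmean

# Step 1: H0: mu = mu_0 versus H1: mu != mu_0 (hypothetical values).
mu_0 = 100.0
sigma = 15.0  # population standard deviation, assumed known

# Step 2: the type I error rate we are willing to tolerate.
alpha = 0.05

# Step 3: the sample mean estimates mu; under H0 it is distributed
# Normal(mu_0, sigma / sqrt(n)).
n = 25
null_dist = NormalDist(mu=mu_0, sigma=sigma / math.sqrt(n))

# Step 4: obtain a random sample (simulated here) and compute the statistic.
rng = random.Random(0)
sample = [rng.gauss(mu_0, sigma) for _ in range(n)]
theta_obs = fmean(sample)

# Step 5: probability of theta_obs or a more extreme outcome under H0
# (two-sided, because H1 is mu != mu_0).
distance = abs(theta_obs - mu_0)
p_value = 2 * (1 - null_dist.cdf(mu_0 + distance))

decision = "reject H0" if p_value < alpha else "do not reject H0"
print(f"theta_obs = {theta_obs:.2f}, p = {p_value:.3f}: {decision}")
```

Here "very unlikely" in step 5 is read as "less likely than \(\alpha\)", which is one common convention rather than the only possible one.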

What are \(\theta\) and \(\widehat{\theta}\)?

\(\theta\) is a population parameter that you are interested in estimating. E.g., in the case of a Normal test, \(\theta\) is the population mean \(\mu\).
\(\widehat{\theta}\) is a sample statistic that you use to estimate \(\theta\). E.g., in the case of a Normal test, \(\widehat{\theta}\) is the sample mean \(\overline{x}\).
The reason we bother to write the steps in terms of \(\theta\) and \(\widehat{\theta}\) is that the steps are general and can be applied to any hypothesis test.

How did we compute P-values before computers?

Statistical tables were the primary tool for finding p-values before computers.
Tables provided pre-calculated critical values and p-values for various distributions at common significance levels.
The relevant test statistic was calculated by hand from the sample data and then looked up in the appropriate table to get the corresponding p-value.

How many tables were there?

Consider just the Normal distribution defined by a mean \(\mu_X\) and standard deviation \(\sigma_X\).
There are infinitely many possible values for \(\mu_X\) and \(\sigma_X\), so tabulating every normal distribution would require infinitely many tables.
In practice only the standard normal distribution was tabulated.
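To give a feel for what such a table contains, the sketch below regenerates a few entries of a standard normal table using Python's standard library. The grid of \(z\) values is an arbitrary choice for illustration; printed tables were much finer.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

# Each entry is P(Z <= z), the cumulative probability a printed table lists.
table = {z / 10: std_normal.cdf(z / 10) for z in range(0, 31, 5)}
for z, prob in table.items():
    print(f"z = {z:.1f}   P(Z <= z) = {prob:.4f}")

# An upper-tail probability is one minus the tabulated value.
upper_tail = 1 - table[2.0]
print(f"P(Z > 2.0) = {upper_tail:.4f}")
```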

The Standard Normal (i.e., \(Z\)) Distribution

The standard normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1.
Any normal distribution can be transformed into a standard normal distribution by subtracting the mean and dividing by the standard deviation.

The \(Z\) transformation

\[ \begin{align} X &\sim \mathcal{N}(\mu_X, \sigma_X) \\ Z &= \frac{X - \mu_X}{\sigma_X} \\ Z &\sim \mathcal{N}(0, 1) \end{align} \]

Here both \(X\) and \(Z\) are random variables.
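A small simulation can make the transformation tangible. The sketch below assumes hypothetical parameters \(\mu_X = 100\) and \(\sigma_X = 15\), draws many values of \(X\), standardizes them, and checks that a probability computed on the \(X\) scale matches the one computed on the \(Z\) scale.

```python
import random
from statistics import NormalDist, fmean, pstdev

mu_x, sigma_x = 100.0, 15.0  # hypothetical parameters of X

rng = random.Random(1)
x_draws = [rng.gauss(mu_x, sigma_x) for _ in range(100_000)]
z_draws = [(x - mu_x) / sigma_x for x in x_draws]

print(f"mean of Z draws: {fmean(z_draws):.3f}")   # close to 0
print(f"sd of Z draws:   {pstdev(z_draws):.3f}")  # close to 1

# The same probability, computed on the X scale and on the Z scale.
x = 120.0
p_on_x_scale = NormalDist(mu_x, sigma_x).cdf(x)
p_on_z_scale = NormalDist().cdf((x - mu_x) / sigma_x)
print(p_on_x_scale, p_on_z_scale)
```

The last two numbers agree, which is why a single standard normal table was enough.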

The \(z\)-score

\[ \begin{align} z_{obs} &= \frac{x_{obs} - \mu_X}{\sigma_X} \end{align} \]

Here, \(z_{obs}\) is a single value: the number of standard deviations that \(x_{obs}\), also a single value, lies from the mean \(\mu_X\). It is one observation of the random variable \(Z \sim \mathcal{N}(0, 1)\).
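For example, a \(z\)-score for a single observation might be computed as follows. The function name `z_score` and the numbers are hypothetical, picked so the arithmetic is easy to follow.

```python
from statistics import NormalDist


def z_score(x_obs: float, mu_x: float, sigma_x: float) -> float:
    """Number of standard deviations x_obs lies from mu_x."""
    return (x_obs - mu_x) / sigma_x


z_obs = z_score(x_obs=130.0, mu_x=100.0, sigma_x=15.0)
print(z_obs)  # 2.0

# Probability of a value at least this large under the standard normal.
p_upper = 1 - NormalDist().cdf(z_obs)
print(f"{p_upper:.4f}")
```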

A quick note on nomenclature

We use capital letters to denote random variables and lower case letters to denote specific observations from those random variables.
We may also use subscripts (e.g., \(z_{obs}\)) to further clarify whether we are referencing a random variable or a specific observation.
Even if we neglect to follow this convention perfectly, it will usually be clear from context which is which.

Question

Given a normal distribution with a mean (\(\mu_X\)) of 100 and a standard deviation (\(\sigma_X\)) of 15, you calculate a sample mean (\(\overline{x}\)) of 108 from a random sample of size \(n=10\). What is the \(z\)-score for this sample mean?

Answer

\[ \begin{align} z_{obs} &= \frac{\overline{x} - \mu_{\overline{X}}}{\sigma_{\overline{X}}} \\ &= \frac{\overline{x} - \mu_{X}}{\sigma_{X} / \sqrt{n}} \\ &= \frac{108 - 100}{15/\sqrt{10}} \\ &\approx 1.69 \end{align} \]
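The arithmetic can be checked numerically. This sketch uses the values from the question and the fact that the standard deviation of the sample mean (the standard error) is \(\sigma_X / \sqrt{n}\).

```python
import math

mu_x, sigma_x = 100.0, 15.0
x_bar, n = 108.0, 10

standard_error = sigma_x / math.sqrt(n)  # standard deviation of the sample mean
z_obs = (x_bar - mu_x) / standard_error
print(f"{z_obs:.2f}")  # 1.69
```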

Question

What is true about the \(Z\)-transformed distributions of \(X\) and \(\overline{X}\)?

The \(Z\)-transformed distribution of \(X\) is normal with a mean of 0 and a standard deviation of 1 but the \(Z\)-transformed distribution of \(\overline{X}\) is not.
The \(Z\)-transformed distribution of \(\overline{X}\) is normal with a mean of 0 and a standard deviation of 1 but the \(Z\)-transformed distribution of \(X\) is not.
They are both normal with a mean of 0 and a standard deviation of 1.
Neither is normal with a mean of 0 and a standard deviation of 1.

Answer

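They are both normal with a mean of 0 and a standard deviation of 1, provided \(X\) is normal and each is standardized with its own mean and standard deviation (\(\mu_X\) and \(\sigma_X\) for \(X\); \(\mu_{\overline{X}}\) and \(\sigma_{\overline{X}}\) for \(\overline{X}\)).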
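A simulation offers a quick sanity check. The sketch below assumes hypothetical values \(\mu_X = 100\), \(\sigma_X = 15\), and \(n = 10\), then standardizes both single observations of \(X\) and sample means \(\overline{X}\). It only checks the mean and standard deviation of the results, not normality itself.

```python
import math
import random
from statistics import fmean, pstdev

mu_x, sigma_x, n = 100.0, 15.0, 10  # hypothetical values
rng = random.Random(2)

z_from_x = []
z_from_xbar = []
for _ in range(20_000):
    sample = [rng.gauss(mu_x, sigma_x) for _ in range(n)]
    # Standardize one observation with the mean and sd of X.
    z_from_x.append((sample[0] - mu_x) / sigma_x)
    # Standardize the sample mean with the mean and sd of X-bar.
    z_from_xbar.append((fmean(sample) - mu_x) / (sigma_x / math.sqrt(n)))

for label, z in (("X", z_from_x), ("X-bar", z_from_xbar)):
    print(f"{label}: mean = {fmean(z):.3f}, sd = {pstdev(z):.3f}")
```

Both lines should show a mean near 0 and a standard deviation near 1.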