The Normal test requires that we know \(\sigma_X\).
In practice, we must almost always estimate \(\sigma_X\) from the data.
The sample standard deviation \(s_X\) is a good estimator of \(\sigma_X\):
\[
\begin{align}
s^2_X &= \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \overline{X})^2 \\
s_X &= \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (X_i - \overline{X})^2} \\
\widehat{\sigma}_X &= s_X
\end{align}
\]
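As a quick check, R's var and sd functions use the \(n - 1\) denominator shown above; a small sketch (the sample values here are made up purely for illustration):

# made-up sample, purely for illustration
x <- c(12.1, 9.8, 11.4, 10.7, 13.2)
n <- length(x)

# sample variance and sd computed by hand with the n-1 denominator
s2_by_hand <- sum((x - mean(x))^2) / (n - 1)
s_by_hand <- sqrt(s2_by_hand)

# these match R's built-in estimators
c(s2_by_hand, var(x))
c(s_by_hand, sd(x))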
\[
\begin{align}
X &\sim \mathcal{N}(\mu_X, \sigma_X) \\
Z &= \frac{X - \mu_X}{\sigma_X} \\
Z &\sim \mathcal{N}(0, 1) \\[2ex]
X &\sim \mathcal{N}(\mu_X, \sigma_X) \\
t &= \frac{X - \mu_X}{\widehat{\sigma}_X} \\
t &\sim t(df) \\
df &= n - 1
\end{align}
\]
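A small simulation sketch of the difference (the sample size, mean, sd, and number of replications are all arbitrary choices): replacing the true \(\sigma_X\) with the estimate \(\widehat{\sigma}_X\) adds variability to the standardized statistic, which is why it follows a t distribution rather than the standard Normal:

# compare z (known sigma) with t (estimated sigma); all settings are arbitrary
set.seed(1)
n <- 10
mu <- 10
sigma <- 2
n_reps <- 10000

z_stats <- rep(NA, n_reps)
t_stats <- rep(NA, n_reps)
for (i in 1:n_reps) {
  x <- rnorm(n, mean=mu, sd=sigma)
  z_stats[i] <- (mean(x) - mu) / (sigma / sqrt(n))
  t_stats[i] <- (mean(x) - mu) / (sd(x) / sqrt(n))
}

# the t statistics exceed +/- 1.96 more often than the z statistics
mean(abs(z_stats) > 1.96)
mean(abs(t_stats) > 1.96)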
We mostly use capital letters for random variables and lower case letters for observed values sampled from a random variable.
E.g. \(X\) is a random variable, \(x\) is an observed value sampled from \(X\).
However, lower case \(t\) is used for both the random variable and the observed value.
I will try to use \(t_{obs}\) to denote the observed value of the test statistic, but you should always use context to work out which is meant.
The t distribution has heavier tails than the Normal.
Increasing \(df\) makes the t look more like the Normal.
Heavier tails \(\rightarrow\) larger p-values \(\rightarrow\) less evidence against \(H_0\).
Use the pt function in R to compute p-values.
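For example, a sketch comparing the p-value from the t distribution with what the Normal would give for the same observed statistic (the statistic value 2.1 and df = 9 are arbitrary, for illustration only):

# same observed statistic: the heavier t tails give a larger p-value
t_stat <- 2.1   # arbitrary illustrative value
pt(t_stat, df=9, lower.tail=F)    # one-tailed p-value from t(9)
pnorm(t_stat, lower.tail=F)       # one-tailed p-value from the standard Normal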
Use the qt function in R to compute critical values.
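And a corresponding sketch for critical values, where the heavier tails push the t critical value further out than the Normal one (df = 9 again chosen arbitrarily):

# one-tailed critical values at alpha = 0.05
qt(0.05, df=9, lower.tail=F)      # t(9) critical value, about 1.83
qnorm(0.05, lower.tail=F)         # standard Normal critical value, about 1.64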
Suppose we are interested in testing whether a neuron increases its firing rate in response to a peripheral stimulus.
Let \(X \sim \mathcal{N}(\mu_X, \sigma_X)\) be the random variable that generates the firing rate of the neuron.
\[
\begin{align}
H_0: &\ \mu_X = 10 \\
H_1: &\ \mu_X > 10 \\[2ex]
\alpha &= 0.05 \\[2ex]
t_{obs} &= \frac{\overline{x}_{obs} - \mu_{\overline{X}_{H_0}}}{\widehat{\sigma}_{\overline{X}}} \\
t_{obs} &= \frac{\overline{x}_{obs} - \mu_{X_{H_0}}}{\widehat{\sigma}_X / \sqrt{n}} \\
&\sim t(df=n-1)
\end{align}
\]
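The two forms of \(t_{obs}\) above are equivalent, because the estimated standard error of the sample mean is just the estimated standard deviation of \(X\) scaled by \(\sqrt{n}\):

\[ \widehat{\sigma}_{\overline{X}} = \frac{\widehat{\sigma}_X}{\sqrt{n}} = \frac{s_X}{\sqrt{n}} \]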
## [1] 13.448164 11.719628 11.801543 11.221365  9.888318 14.573826 11.995701
## [8]  7.066766 12.402712 10.054417
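The values above are the observed firing rates for this example. To make the chunk below runnable on its own, they can be entered directly, along with the mu_x and n variables that the chunk uses:

# the observed firing rates printed above
x_obs <- c(13.448164, 11.719628, 11.801543, 11.221365, 9.888318,
           14.573826, 11.995701, 7.066766, 12.402712, 10.054417)

n <- length(x_obs)   # 10 observations
mu_x <- 10           # firing rate under H0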
# observed t-statistic
t_obs <- (mean(x_obs) - mu_x) / (sd(x_obs) / sqrt(n))

# p-value and critical value for the one-tailed test at alpha = 0.05
p_value <- pt(t_obs, df=n-1, lower.tail=F)
t_crit <- qt(0.05, df=n-1, lower.tail=F)

# compare the p-value to alpha and state the decision
if (p_value < 0.05) {
  decision <- "Reject H0"
} else {
  decision <- "Fail to reject H0"
}
decision
## [1] "Reject H0"
# suppose you observe the following firing rates
x_obs <- rnorm(10, mean=10+1, sd=2)

res <- t.test(x=x_obs,
              y=NULL,
              alternative="two.sided",
              mu=10,
              paired=F,     ## doesn't matter with one-sample
              var.equal=F,  ## doesn't matter with one-sample
              conf.level=0.95)
The output of the t.test function:

## 
##  One Sample t-test
## 
## data:  x_obs
## t = 2.1587, df = 9, p-value = 0.0592
## alternative hypothesis: true mean is not equal to 10
## 95 percent confidence interval:
##   9.932058 12.902430
## sample estimates:
## mean of x 
##  11.41724
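The object returned by t.test also exposes the individual pieces, which is handy for reporting; a small sketch using the standard components of R's htest objects:

res$statistic   # observed t-statistic
res$parameter   # degrees of freedom
res$p.value     # p-value
res$conf.int    # confidence interval for the true mean
res$estimate    # sample mean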
Suppose that the firing rate of a neuron is measured during peripheral stimulation 100 times and a t-test is conducted to assess whether or not the true firing rate is greater than 10. Let \(X\) be the random variable that generates firing rates for this neuron. We do not know the true variance of \(X\). The raw data is stored in a variable named x_obs. Which of the following correctly computes the observed t-statistic?
1. t_obs <- (mean(x_obs) - 10) / (sd(x_obs) / sqrt(100))
2. t_obs <- (mean(x_obs) - 10) / (sd(x_obs) / sqrt(99))
3. t_obs <- (x_obs - 10) / sd(x_obs)
The first option is correct: the standard error divides \(s_X\) by \(\sqrt{n} = \sqrt{100}\), while \(n - 1 = 99\) appears only in the degrees of freedom.

t_obs <- (mean(x_obs) - 10) / (sd(x_obs) / sqrt(100))
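The output below comes from a one-sided version of the earlier test; the chunk that produced it isn't shown, but a call along these lines would generate output of this form (the data-generating step here is an assumption, mirroring the earlier example):

# one-sided one-sample t-test (data step assumed, as in the earlier example)
x_obs <- rnorm(10, mean=10+1, sd=2)
t.test(x=x_obs, alternative="greater", mu=10, conf.level=0.95)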
## 
##  One Sample t-test
## 
## data:  x_obs
## t = 1.9052, df = 9, p-value = 0.04457
## alternative hypothesis: true mean is greater than 10
## 95 percent confidence interval:
##  10.04347      Inf
## sample estimates:
## mean of x 
##  11.14925
Which of the following is false?
\[
\begin{align}
&1.\ t_{obs} = \frac{\overline{x}_{obs} - \mu_{X_{H_0}}}{\widehat{\sigma}_X / \sqrt{n}} \\
&2.\ t_{obs} = \frac{\overline{x}_{obs} - \mu_{\overline{X}_{H_0}}}{\widehat{\sigma}_X / \sqrt{n}} \\
&3.\ t_{obs} = \frac{\overline{x}_{obs} - \mu_{\overline{X}_{H_0}}}{\widehat{\sigma}_{\overline{X}} / \sqrt{n}} \\
&4.\ t_{obs} = \frac{\overline{x}_{obs} - \mu_{\overline{X}_{H_0}}}{\widehat{\sigma}_{\overline{X}}}
\end{align}
\]