Class Activity, March 1
Part I: Wald test with Normal data
The purpose of this class activity is to check, with simulated data, whether the empirical type I error for a Wald test matches the theoretical type I error.
Suppose we observe data \(X_1,...,X_n \overset{iid}{\sim} N(\mu, \sigma^2)\). We wish to test \(H_0: \mu = \mu_0\) vs. \(H_A: \mu > \mu_0\), and we reject when \[Z_n = \frac{\sqrt{n}(\overline{X}_n - \mu_0)}{\sigma} > z_{\alpha},\] where \(z_{\alpha}\) is the upper \(\alpha\) quantile of the standard normal distribution.
Suppose that \(\mu_0 = 0\), \(n = 100\), \(\sigma = 1\), and \(\alpha = 0.05\).
1. Run the following code to empirically evaluate the type I error, and confirm that it is approximately 0.05. (The true type I error is exactly 0.05 here, but your estimate will differ slightly from 0.05 because the simulation uses a finite number of replications.)
```r
n <- 100
mu0 <- 0
sigma <- 1
alpha <- 0.05
nreps <- 5000

test_stats <- rep(0, nreps)
for(i in 1:nreps){
  # simulate one data set under H0 and compute the Wald statistic Z_n
  x <- rnorm(n, mu0, sigma)
  test_stats[i] <- (mean(x) - mu0)/(sigma/sqrt(n))
}

# proportion of replications that reject H0;
# qnorm(0.05, lower.tail=F) is the cutoff z_0.05 (approximately 1.645)
mean(test_stats > qnorm(0.05, lower.tail=F))
```
2. Repeat question 1 for \(n = 5, 10, 25, 50\). Plot the empirical type I error against \(n\). (One way to organize this is sketched after this list.)
3. Our test statistic so far assumes we know \(\sigma\). If we don’t know \(\sigma\), then we reject \(H_0\) when \[\frac{\sqrt{n}(\overline{X}_n - \mu_0)}{s} > z_{\alpha},\] where \(s = \sqrt{\frac{1}{n - 1} \sum \limits_{i=1}^n (X_i - \overline{X}_n)^2}\) is the sample standard deviation. Show through simulations that when we use \(s\) instead of \(\sigma\), the type I error increases as \(n\) decreases. (The sketch after this list includes an option for this case.)
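As a starting point for questions 2 and 3, one possible way to organize the simulation is to wrap it in a function of \(n\). The function name `empirical_type1` and its `use_s` argument below are illustrative choices, not part of the activity; this is a sketch, not the required solution.

```r
# Illustrative wrapper: returns the empirical type I error for a given sample size n.
# Setting use_s = TRUE replaces sigma with the sample standard deviation s (question 3).
empirical_type1 <- function(n, mu0 = 0, sigma = 1, alpha = 0.05,
                            nreps = 5000, use_s = FALSE){
  test_stats <- rep(0, nreps)
  for(i in 1:nreps){
    x <- rnorm(n, mu0, sigma)
    sd_used <- if(use_s) sd(x) else sigma
    test_stats[i] <- (mean(x) - mu0)/(sd_used/sqrt(n))
  }
  mean(test_stats > qnorm(alpha, lower.tail = FALSE))
}

ns <- c(5, 10, 25, 50)

# Question 2: sigma known
type1_known <- sapply(ns, empirical_type1)
plot(ns, type1_known, type = "b", xlab = "n", ylab = "Empirical type I error")
abline(h = 0.05, lty = 2)

# Question 3: sigma replaced by s, still using the z cutoff
type1_s <- sapply(ns, empirical_type1, use_s = TRUE)
plot(ns, type1_s, type = "b", xlab = "n", ylab = "Empirical type I error")
abline(h = 0.05, lty = 2)
```

With `use_s = TRUE` you should see the empirical type I error rise above 0.05 as \(n\) shrinks, which is what motivates the t cutoff in Part II.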
Part II: t-tests
Suppose we observe data \(X_1,...,X_n \overset{iid}{\sim} N(\mu, \sigma^2)\). We wish to test \(H_0: \mu = \mu_0\) vs. \(H_A: \mu > \mu_0\), and we reject when \[t = \frac{\sqrt{n}(\overline{X}_n - \mu_0)}{s} > t_{n-1, \alpha}\] where \(t_{n-1, \alpha}\) is the upper \(\alpha\) quantile of a \(t_{n-1}\) distribution.
4. Modify your code from question 3 to use the \(t_{n-1, 0.05}\) cutoff, and show through simulations that the empirical type I error is approximately 0.05 regardless of \(n\). (A sketch of this modification follows.)
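A minimal sketch of that modification, again with an illustrative helper name (`empirical_type1_t`): the only change from the question 3 setup is swapping the `qnorm` cutoff for the corresponding `qt` quantile.

```r
# Same simulation as in question 3 (statistic computed with s), but the cutoff is
# the upper alpha quantile of a t distribution with n - 1 degrees of freedom.
empirical_type1_t <- function(n, mu0 = 0, sigma = 1, alpha = 0.05, nreps = 5000){
  test_stats <- rep(0, nreps)
  for(i in 1:nreps){
    x <- rnorm(n, mu0, sigma)
    test_stats[i] <- (mean(x) - mu0)/(sd(x)/sqrt(n))
  }
  mean(test_stats > qt(alpha, df = n - 1, lower.tail = FALSE))
}

sapply(c(5, 10, 25, 50, 100), empirical_type1_t)  # each entry should be near 0.05
```

The empirical level stays near 0.05 for every \(n\) because, under normality, the statistic computed with \(s\) has an exact \(t_{n-1}\) distribution under \(H_0\).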