QUIZ 8: (questions are in italics).

This quiz is about 2-sample CIs for mu2-mu1, paired and/or unpaired.

Part I) Coverage of 95% CI for mu2-mu1

In the previous lab/qz you learned how to test the coverage of a CI. 
There, the CI was for the b parameter of Unif(0,b). Here, let's confirm
the coverage of the CI for mu2-mu1. To that end,

a) Write code to compute (and report) the coverage count for a 95% CI for mu2-mu1. Specifically,
- Let the first population be N(1,2), and the second population be N(1.5,2), where the second argument is the standard deviation.
- Let the two sample sizes be 100 and 110, respectively, and
- The number of trials (CIs) be 1000.
- Do NOT use the function t.test(), i.e., use the formulas we have developed in class.
- In those formulas, use the appropriate z* (not t*), which should be found in R, not by reading Table 1.
- Start your code with set.seed(1).
- Report the coverage count.
- Hint: revise the code from the previous lab.

   mu1 = 1
   sigma1 = 2
   n1 = 100
   mu2 = 1.5
   sigma2 = 2
   n2 = 110
   n.trial = 1000

   set.seed(1)
   CI = matrix(nrow=n.trial, ncol=2)      # one row (lower, upper) per trial
   for (i in 1:n.trial) {
     x1 = rnorm(n1, mu1, sigma1)
     x2 = rnorm(n2, mu2, sigma2)
     lower = (mean(x2) - mean(x1))-qnorm(.975)*sqrt(sigma1^2/n1 + sigma2^2/n2)
     upper = (mean(x2) - mean(x1))+qnorm(.975)*sqrt(sigma1^2/n1 + sigma2^2/n2)
     CI[i,] = c(lower,upper)
   }
   cnt = 0                                # number of CIs that cover mu2-mu1
   for (i in 1:n.trial) {
     if (CI[i,1] <= mu2-mu1 && CI[i,2] >= mu2-mu1) {
       cnt = cnt+1
     }
   }
   cnt    # 953 . If all is good the count should be around 950.
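
The two loops above can also be folded into one helper. The block below is only a sketch of that idea, not the class's reference solution: the function name coverage_count and its argument list are made up for illustration. It uses the same known-sigma z-interval as the code above.

```r
# Hypothetical helper: counts how many of n.trial z-intervals cover mu2 - mu1.
coverage_count = function(n.trial, n1, mu1, sigma1, n2, mu2, sigma2, conf = 0.95) {
  z.star = qnorm(1 - (1 - conf)/2)         # z* found in R, not from a table
  se = sqrt(sigma1^2/n1 + sigma2^2/n2)     # sigmas are known, so se is the same in every trial
  covered = replicate(n.trial, {
    x1 = rnorm(n1, mu1, sigma1)
    x2 = rnorm(n2, mu2, sigma2)
    est = mean(x2) - mean(x1)
    (est - z.star*se <= mu2 - mu1) && (mu2 - mu1 <= est + z.star*se)
  })
  sum(covered)                             # number of TRUEs = coverage count
}

set.seed(1)
cnt = coverage_count(1000, 100, 1, 2, 110, 1.5, 2)
cnt    # should be around 950
```

Wrapping the simulation in a function makes it easy to rerun with other sample sizes or confidence levels, e.g. coverage_count(1000, 20, 1, 2, 25, 1.5, 2, conf = 0.90), where the count should then be around 900.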

Part II) Paired and unpaired t-test on paired data.

Run the following code (which showed up in qz6, although that's not relevant to what we're doing here). It generates paired data on x1 and x2.

  library(MASS) # This library contains mvrnorm()
  set.seed(1)   
  n = 100
  r = 0.8
  dat = mvrnorm(n, c(0, 0.5), matrix(c(1, r, r, 1), 2, 2))
  x1 = dat[, 1]
  x2 = dat[, 2]

b) What is it about these data that qualifies them as *paired*? Important: Provide a line of code that supports your claim. If you don't know the answer to this question, proceed to the next parts first, and then return to this one.

  plot(x1,x2)    # the scatterplot shows an association. As discussed in the lecture, that's one of the signatures of paired data.
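
A numerical complement to the scatterplot, offered as a sketch rather than as the expected answer: the sample correlation between x1 and x2 puts a number on the association. The name r.hat is introduced here only for illustration.

```r
library(MASS)   # for mvrnorm()
set.seed(1)
n = 100
r = 0.8
dat = mvrnorm(n, c(0, 0.5), matrix(c(1, r, r, 1), 2, 2))
x1 = dat[, 1]
x2 = dat[, 2]

r.hat = cor(x1, x2)   # sample correlation; the population value used above is r = 0.8
r.hat
```

A value well away from 0 says that knowing x1 for a case tells you something about x2 for the same case, which is what the scatterplot shows visually.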

c) Use the function t.test() to report the 95% CI for mu2-mu1 for un-paired data.
  
  t.test(x2,x1)$conf.int              # 0.2238796 0.7282966
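
For comparison, the unpaired interval can also be reproduced by hand. The sketch below assumes t.test() is left at its default var.equal=FALSE, in which case it uses the Welch approximation for the degrees of freedom; whether that is the unpaired formula developed in lecture is an assumption. The names v1, v2, se, df.welch, and ci.welch are introduced here for illustration.

```r
library(MASS)   # for mvrnorm()
set.seed(1)
n = 100
r = 0.8
dat = mvrnorm(n, c(0, 0.5), matrix(c(1, r, r, 1), 2, 2))
x1 = dat[, 1]
x2 = dat[, 2]

v1 = var(x1)/n                     # squared standard error of mean(x1)
v2 = var(x2)/n                     # squared standard error of mean(x2)
se = sqrt(v1 + v2)                 # standard error of mean(x2) - mean(x1), ignoring the pairing
df.welch = (v1 + v2)^2 / (v1^2/(n - 1) + v2^2/(n - 1))   # Welch-Satterthwaite df
ci.welch = (mean(x2) - mean(x1)) + c(-1, 1)*qt(.975, df.welch)*se
ci.welch                           # should agree with t.test(x2,x1)$conf.int
```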

d) Use the function t.test() to report the 95% CI for mu2-mu1 for paired data.

  t.test(x2,x1,paired=TRUE)$conf.int  # 0.3558811 0.5962951

e) Write code to compute the CI found in part d), but "by hand," i.e., without using the function t.test(), but using the formula we developed in lecture. Make sure to find the value of t* in R (not from Table 6).

  d = x2-x1
  lower = mean(d)-qt(.975, df=n-1)*sd(d)/sqrt(n)
  upper = mean(d)+qt(.975, df=n-1)*sd(d)/sqrt(n)
  c(lower, upper)                     # 0.3558811 0.5962951
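
Another way to see what the paired formula does: it is a 1-sample t-interval applied to the differences d. The sketch below checks that claim in R; the names ci.paired and ci.onesample are introduced here for illustration.

```r
library(MASS)   # for mvrnorm()
set.seed(1)
n = 100
r = 0.8
dat = mvrnorm(n, c(0, 0.5), matrix(c(1, r, r, 1), 2, 2))
x1 = dat[, 1]
x2 = dat[, 2]

d = x2 - x1                                                     # one difference per pair
ci.paired = as.numeric(t.test(x2, x1, paired = TRUE)$conf.int)  # paired 2-sample CI
ci.onesample = as.numeric(t.test(d)$conf.int)                   # 1-sample CI on the differences
all.equal(ci.paired, ci.onesample)                              # TRUE if the two intervals agree
```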


Morals.
Part I): Again, the formulas for CIs that we have developed are ultimately 
designed to cover something (here mu2-mu1) some percentage of the time (here 95%).
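Part II): If the data are paired, the paired CI is generally narrower than the unpaired CI, because a positive association between x1 and x2 makes the differences x2-x1 less variable than two independent samples would suggest. Part e) simply confirms that t.test() does what our formulas say.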
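
The reason behind the Part II moral can be checked numerically. The sketch below relies on the identity var(x2-x1) = var(x1) + var(x2) - 2*cov(x1,x2), which holds exactly for sample variances, and then compares the two standard errors. The names var.d, var.sum, se.paired, and se.unpaired are introduced here for illustration.

```r
library(MASS)   # for mvrnorm()
set.seed(1)
n = 100
r = 0.8
dat = mvrnorm(n, c(0, 0.5), matrix(c(1, r, r, 1), 2, 2))
x1 = dat[, 1]
x2 = dat[, 2]

d = x2 - x1
var.d = var(d)                                   # variance of the differences
var.sum = var(x1) + var(x2) - 2*cov(x1, x2)      # same quantity via the identity
all.equal(var.d, var.sum)                        # TRUE: the identity holds

se.paired = sd(d)/sqrt(n)                        # standard error used by the paired CI
se.unpaired = sqrt(var(x1)/n + var(x2)/n)        # standard error used by the unpaired CI
c(se.paired, se.unpaired)                        # the paired one is smaller when cov(x1,x2) > 0
```

When the covariance is positive, the -2*cov(x1,x2) term shrinks the variance of d, which is why the paired interval comes out narrower for these data. With a covariance near zero, the two standard errors would be about the same.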