Statistics 581


Research Projects

This Web page describes the research projects that form part of the Stat 581 coursework. There are three project ideas. You should work in groups of 3-4, with each group choosing one project. If you would prefer to work on a project other than those given below, check with the instructor first.

The results of the project should be presented both orally and in written form. A 20-minute time slot will be available for each group's oral presentation; the presentations need to be ready on December 6. For the written part, write a short paper, of at most 10 pages, in a format suitable for a theoretical journal such as Annals of Statistics, Biometrika, or Journal of the American Statistical Association: Theory and Methods. The paper is due on December 11, although I will gladly look at drafts before that.

The general approach I would suggest for attacking any of these problems involves iteration between the following steps:

Project 1: Combination of 2x2 Tables

When testing for independence in a 2x2 table with relatively small expected frequencies, Yates (J. Roy. Soc., Suppl. 1: 1934, pp. 217 ff.) suggested reducing the absolute value of each difference

n(i,j) - r(i)c(j)/n

by the quantity 1/2 before squaring and summing to calculate the usual chi-squared statistic. Here n(i,j) is the observed count in cell (i,j), r(i) and c(j) are the row and column totals, and n is the grand total. Under what circumstances does this correction improve the chi-squared approximation to the distribution of the test statistic? Suppose now that you have several tables and want to perform an overall test of independence. Does combining tables affect the quality of the approximation?
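
To make the definition concrete, here is a minimal Python sketch of the statistic with and without the correction. The function names are illustrative, all margins are assumed to be positive, and clamping the reduced difference at zero (when the raw difference is already below 1/2) is an assumption of this sketch, not part of the description above. The p-value uses the chi-squared approximation with one degree of freedom, which is exactly the approximation whose quality the project asks about.

```python
import math

def chi2_2x2(table, yates=False):
    """Chi-squared statistic for a 2x2 table given as [[n11, n12], [n21, n22]]."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows = (a + b, c + d)          # row totals r(i)
    cols = (a + c, b + d)          # column totals c(j)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = rows[i] * cols[j] / n
            diff = abs(observed - expected)
            if yates:
                # Reduce |difference| by 1/2; the clamp at 0 is an assumption.
                diff = max(diff - 0.5, 0.0)
            stat += diff ** 2 / expected
    return stat

def p_value_1df(stat):
    """Upper tail of the chi-squared distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

table = [[10, 20], [30, 40]]
plain = chi2_2x2(table)
corrected = chi2_2x2(table, yates=True)
print(plain, p_value_1df(plain))
print(corrected, p_value_1df(corrected))
```

Comparing these approximate p-values with exact or simulated ones over a range of margins is one possible way to start exploring the first question.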

Project 2: Poisson approximations

The Poisson distribution with mean m is often introduced as the limit, as n tends to infinity, of binomial distributions with parameters n and m/n. What can be said about the remainder in this approximation? Would a variance-stabilizing transformation improve matters?
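
One way to get a feel for the remainder is to compute it numerically. The sketch below measures the error by the total variation distance between the Binomial(n, m/n) and Poisson(m) distributions; that choice of distance is an assumption of this sketch, and other measures may suit the project better. The function names are illustrative, and 0 < m < n is assumed. For reference, Le Cam's inequality bounds this distance by m^2/n.

```python
import math

def binom_pmf(k, n, p):
    """Binomial(n, p) probability of k, computed on the log scale."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)

def poisson_pmf(k, m):
    """Poisson(m) probability of k, computed on the log scale."""
    return math.exp(-m + k * math.log(m) - math.lgamma(k + 1))

def tv_distance(n, m):
    """Total variation distance between Binomial(n, m/n) and Poisson(m)."""
    p = m / n
    diff = sum(abs(binom_pmf(k, n, p) - poisson_pmf(k, m)) for k in range(n + 1))
    # The binomial puts no mass above n, so the Poisson tail counts in full.
    tail = max(1.0 - sum(poisson_pmf(k, m) for k in range(n + 1)), 0.0)
    return 0.5 * (diff + tail)

m = 2.0
for n in (10, 100, 1000):
    print(n, tv_distance(n, m), m ** 2 / n)
```

Tabulating the distance against n for fixed m suggests how fast the remainder shrinks, which can then be compared with whatever expansion you derive.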

Project 3: Thermoluminescence dating

Thermoluminescence techniques have recently been applied to the dating of unheated Quaternary sediments, with variable success. The event being dated is the last exposure to sunlight. A method called partial bleaching is often applied. A theoretical model for this procedure has Y = f(D,θ)(1 + ε). For each sample there are two measurements, corresponding to different values of θ. The quantity of interest (which is related to the age of the sample) is the equivalent-dose value, i.e., the value D for which the two curves (corresponding to different values of θ) intersect. A common model for the nonlinear function f is a saturated exponential, f(D,(a,b)) = a(1+exp(-b(d+D))). How would you estimate the equivalent-dose value, and what is the uncertainty of your estimate?
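
As a starting point only: once two curves have been fitted, their intersection can be located numerically. The sketch below uses the saturated exponential exactly as written above, treats d as a known offset (an assumption), and uses made-up parameter pairs (a,b) purely for illustration. It does not address how to fit the curves or how to quantify the uncertainty of the intersection, which is the substance of the project.

```python
import math

def f(D, a, b, d=0.0):
    """Saturated exponential as written in the text; d is an assumed known offset."""
    return a * (1.0 + math.exp(-b * (d + D)))

def intersection(params1, params2, lo, hi, d=0.0, tol=1e-12):
    """Bisection for the D in [lo, hi] at which the two curves cross."""
    def gap(D):
        return f(D, *params1, d=d) - f(D, *params2, d=d)

    g_lo, g_hi = gap(lo), gap(hi)
    if g_lo * g_hi > 0:
        raise ValueError("the curves do not change order on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = gap(mid)
        if g_lo * g_mid <= 0:
            hi = mid                # crossing lies in the left half
        else:
            lo, g_lo = mid, g_mid   # crossing lies in the right half
    return 0.5 * (lo + hi)

# Hypothetical (a, b) pairs for the two curves; not taken from any data set.
D0 = intersection((1.0, 0.5), (0.8, 0.1), lo=0.0, hi=10.0)
print(D0, f(D0, 1.0, 0.5), f(D0, 0.8, 0.1))
```

Because the intersection is a function of the fitted parameters of both curves, its uncertainty depends on the joint uncertainty of those parameter estimates.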


Return to Peter Guttorp's home page.