Sample Survey Techniques

 

STAT/CS&SS/BIOSTAT 529


 

Class meetings
Day and Time                  Purpose               Location
Tuesday, 10:30am-11:50am      Lecture               Padelford C-301
Thursday, 10:30am-11:50am     Lecture/Laboratory    Padelford C-301

Professor

Mark S. Handcock, B-313 Padelford Hall, 543-6774

Office Hours

 

Thursdays, 4:00pm - 5:00pm, B-313 Padelford Hall

Other times by arrangement. Clearly composed questions sent to handcock@u will receive written replies.

Laboratories

Some class sessions may be held in the Center for Social Science Computation and Research (CSSCR).

Introduction

This course will cover the statistical design and analysis of complex surveys, with applications in the social sciences and health sciences. This is an applied statistical methods course, focusing on the conceptual aspects of sampling rather than the mathematical. Implementation will rely on software. This is not a course in how to implement a survey (e.g., instrument construction) but one in the sampling theory that underlies complex surveys. Both design-based and model-based inference will be considered.

In addition to traditional topics in survey analysis, we will cover data visualization, regression modeling of data from complex surveys, and the design and analysis of two-phase samples from existing cohorts. We will not cover item-response theory.

Prerequisites

Students must have taken a graduate-level introductory course in applied statistics; a regression modeling course is recommended. Knowledge of R would be very helpful.

Learning Objectives

After successfully completing this course, students should ordinarily expect to be able to:

  • Define a probability sample and explain its importance in statistics.
  • Describe situations where a probability sample can lead to greater precision than an attempt at complete enumeration.
  • Distinguish finite-population and superpopulation inference and give examples where each would be appropriate.
  • Determine whether a survey design uses a probability sample.
  • Define common features of complex surveys:
    • strata,
    • clusters,
    • unequal sampling probabilities,
    and explain how they affect the cost of the survey and the precision of estimates.
  • Write down the Horvitz-Thompson estimator of the population total and explain to a non-statistician why it gives an unbiased estimate.
  • Compute summary statistics and fit regression models to data from complex surveys using R or Stata. Describe these analyses in language suitable for an academic paper in a health sciences or social sciences journal.
  • Explain why assumptions about the distribution of data are not relevant to standard survey inference and what criteria are relevant for choosing summary statistics and models.
  • Define post-stratification and raking, and explain how they can increase precision.
  • Describe some strategies for mitigating the bias from non-response.
  • Explain the advantages and disadvantages of including sampling weights in a regression model.
  • Describe case-cohort, case-control, two-phase case-control, and countermatching designs for sampling from a cohort, and how data from these designs can be analysed.
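
As a small illustration of the Horvitz-Thompson objective above, the following sketch (in Python rather than the R or Stata used in the course) checks unbiasedness by brute force. The population values, the sample size, and the function name horvitz_thompson_total are all hypothetical, chosen only for illustration. The design assumed is simple random sampling without replacement, where every unit has inclusion probability n/N.

```python
from itertools import combinations


def horvitz_thompson_total(values, inclusion_probs):
    """Estimate a population total as the sum of y_i / pi_i over sampled units."""
    return sum(y / p for y, p in zip(values, inclusion_probs))


# Hypothetical population of N = 5 units (illustrative values only).
population = [3.0, 7.0, 1.0, 9.0, 5.0]
N = len(population)
n = 2  # sample size

# Under simple random sampling without replacement,
# each unit is included with probability n / N.
pi = n / N

# Enumerate every possible sample of size n; all are equally likely.
estimates = [
    horvitz_thompson_total([population[i] for i in sample], [pi] * n)
    for sample in combinations(range(N), n)
]

# Averaging the estimator over all equally likely samples gives its expectation.
mean_estimate = sum(estimates) / len(estimates)
true_total = sum(population)

print("true total:", true_total)
print("mean of HT estimates over all samples:", mean_estimate)
```

Individual estimates vary from sample to sample, but their average over all possible samples matches the true total up to floating-point rounding. That is the sense in which the estimator is unbiased: dividing each sampled value by its inclusion probability lets it stand in for the units that were not drawn.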

This course is part of the curriculum of the Center for Statistics and the Social Sciences (CSSS), with funding from the University Initiatives Fund. The CSSS includes faculty members from the Department of Statistics and a broad range of social science disciplines, including Anthropology, Economics, Geography, Political Science, and Sociology. This curriculum has been developed to complement and strengthen the quantitative methods course offerings for social science students at both the undergraduate and graduate levels.

Structure of the Course

There will be two lectures per week. The lecture on Thursday may occasionally be a laboratory session.

Textbooks

[CS] Thomas Lumley, “Complex Surveys: A Guide to Analysis Using R” (2009). Wiley: Hoboken. Required (password-protected preprint, download only).

Mailing list and Discussion Forum

I will be using a mailing list to host discussion of issues raised in class and related questions. For questions that might be of interest to other students, please use the mailing list rather than emailing only me. Examples include questions about interesting articles you have seen in the media, problems with access to resources, and homework or computer questions. Enjoy!

Please regularly consult this class home page and the archive of the mailing list. The home page will contain lecture notes, homework, solutions and course information.

Computer Usage and Software

The computer is the scientific laboratory of the applied researcher in quantitative fields. As such, this course requires the student to develop a degree of comfort and competence “in the lab”. If you want more background, consult the lecture notes in CSSS 505.
A good resource for those new to R is:
Introduction to R
When: Friday, April 3 from 2:30-5:00 PM
Where: Condon Hall Room 601C (large CSSCR computer lab)
Taught by Cori Mar

Course Requirements and Grades

There will be weekly homework assignments and exercises covering both theory and real data analysis. Each homework will be graded on a scale of 1 to 10. This will be 40% of the grade.

Discussion of homework problems is encouraged. However, each student is required to prepare and submit solutions (including computer work) to the assignments and project on their own; solutions prepared “in committee” are not acceptable. Duplication of homework solutions and computer output prepared in whole or in part by someone else is not acceptable and is considered plagiarism.  If you receive assistance from anyone, you must give due credit in your report.  (Example: “Since the data are all positive, and skewed to the right, a logarithmic transformation is clearly appropriate as a next step.  I thank David Cox for pointing this out to me.”)

There will be a mid-term exam worth 30% of the grade.

There will be a term project worth 40% of the grade.

I welcome comments or suggestions about the course at any time, either in person, by letter, or by anonymous email. Please feel free to use any of these to comment on any aspect of the course.




Students with Disabilities

If you have a disability that requires special testing accommodations or other classroom modifications, you need to notify the instructor and the Office of Disabled Student Services as soon as possible. You may contact the DSS office at 543-8925.
