LISA Statistics Short Course: Multiple Imputation and Missing Data
LISA SHORT COURSES IN STATISTICS
LISA (Virginia Tech's Laboratory for Interdisciplinary Statistical Analysis) is providing a series of evening short courses to help graduate students use statistics in their research. The focus of these two-hour courses is on teaching practical statistical techniques for analyzing or collecting data. See www.lisa.stat.vt.edu/?q=short_courses for instructions on how to REGISTER and to learn more.
Spring 2014 Schedule:
Tuesday & Thursday, February 11 & 13: Basics of R;*
Tuesday & Thursday, February 18 & 20: Statistical Analysis in R;*
Tuesday & Thursday, February 25 & 27: Graphics in R;*
Tuesday & Wednesday, March 4 & 5: Introduction to JMP;*
Tuesday, March 18: Advanced Topics in R: parallel processing, structural equation modeling, and the bootstrap;
Tuesday, April 1: Survey Design and Analysis;
Tuesday, April 10: Accelerating statistical calculations using inexpensive graphics cards;
Tuesday, April 15: Multiple Imputation and Missing Data;
*Two sessions of the same course to accommodate more attendees.
Tuesday, April 15;
Instructor: Jon Atwood;
Title: Multiple Imputation and Missing Data;
Missing data can plague researchers in many scenarios, arising from incomplete surveys, broken or destroyed experimental units, or errors in data collection or computation. This short course will explore what missing data is and where it comes from, as well as how to deal with it effectively. First, we will explore the concepts of "missing completely at random", "missing at random", and "not missing at random", learning the differences among the three, how to judge which one fits a particular data set, and how this classification affects our procedures for dealing with the missing data. Next, we will briefly cover early methods for handling missing data, such as complete case analysis and single imputation techniques (mean, hot deck, etc.), and why in practice they can produce biased or inefficient results. Finally, we will learn the basics of multiple imputation and how to apply it to some real-world data sets.
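As a rough illustration of the concern about single imputation, here is a minimal sketch (not course material). It is written in Python for brevity, whereas the course itself uses SAS. The simulated "incomes", the 30% missingness rate, and all variable names are assumptions made up for this example. It deletes values completely at random, then compares the spread of the data under complete case analysis and under mean imputation.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical complete data: 1000 simulated values (not the course data set)
full = [random.gauss(50, 10) for _ in range(1000)]

# Delete roughly 30% of the values completely at random (MCAR)
observed = [x if random.random() > 0.3 else None for x in full]

# Complete case analysis: keep only the observed values
complete_cases = [x for x in observed if x is not None]

# Single (mean) imputation: fill every gap with the observed mean
obs_mean = statistics.mean(complete_cases)
mean_imputed = [obs_mean if x is None else x for x in observed]

sd_complete = statistics.stdev(complete_cases)
sd_imputed = statistics.stdev(mean_imputed)

print(f"missing: {len(full) - len(complete_cases)} of {len(full)}")
print(f"SD, complete cases: {sd_complete:.2f}")
print(f"SD, mean-imputed:   {sd_imputed:.2f}")
```

Mean imputation leaves the sample mean unchanged but adds observations with zero deviation from it, so the estimated standard deviation shrinks and any standard errors built on it are too small. Complete case analysis avoids that here only because the data are missing completely at random, and it still discards about 30% of the sample.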
We will use SAS to perform the imputation methods, and the instructor will explain the code. Basic SAS knowledge will be helpful but is not required. Some basic knowledge of probability and Bayesian statistics may also be helpful. We will work with a San Francisco data set, using demographic information to predict household income. The data set may be found at the link below and will also be provided in the course.
Follow us on Facebook (www.facebook.com/Statistical.collaboration) or Twitter (www.twitter.com/LISA_VT) to be the first to know about LISA events!