4:00pm to 6:00pm
LISA Statistics Short Course: Data Analytics - Classification
(Research)
LISA SHORT COURSES IN STATISTICS
LISA (Virginia Tech's Laboratory for Interdisciplinary Statistical Analysis) is providing a series of evening short courses to help graduate students use statistics in their research. The focus of these two-hour courses is on teaching practical statistical techniques for analyzing or collecting data. See www.lisa.stat.vt.edu/?q=short_courses for instructions on how to REGISTER and to learn more.
Spring 2016 Schedule:
Tuesday, March 15, 4:00-6:00 pm: Comparing Means and Other Measures of Location between Two Populations by Significance Tests and Effect Size;
Tuesday, March 22, 4:00-6:00 pm: Data Analytics - Classification;
Tuesday, March 29, 4:00-6:00 pm: Basics of R;
Tuesday, April 5, 4:00-6:00 pm: Statistical Analysis Using R;
Tuesday, April 12, 4:00-6:00 pm: Better Data Visualization in R Using the ggplot2 Package;
Tuesday, April 19, 4:00-6:00 pm: Introduction to Web Scraping in R;
Tuesday, April 26, 4:00-6:00 pm: Introduction to Multivariate Analysis of Variance (MANOVA) in JMP;
Tuesday, March 22, 4:00-6:00 pm;
Location: 1100 Torgersen Hall;
Instructor: Lin Zhang;
Title: Data Analytics - Classification;
Data analytics (DA) is a science that combines data mining, machine learning, and statistics. DA examines raw data with the purpose of discovering useful information, suggesting conclusions, and supporting decision-making (source: https://en.wikipedia.org/wiki/Data_analysis). DA has become popular as big data problems have emerged in biological science, engineering, business, and other fields. Many techniques have been developed in data analytics. In this short course, we will focus on classification, a family of supervised learning techniques. The approaches covered include linear regression (least squares method) used as a classifier, the Bayes classifier, classification trees, logistic regression, and LASSO logistic regression. We will first get a taste of the basic theory behind these techniques, and we will also discuss criteria used to evaluate a classifier, such as false positives, false negatives, precision, and recall. Then we will use both simulated normal mixture data and the email spam data (https://archive.ics.uci.edu/ml/datasets/Spambase) to demonstrate how to use these classification techniques (e.g. Figure 1: LS classifier for the normal mixture data). Note: all the class demonstrations will be carried out in R.
www.lisa.stat.vt.edu/sites/default/files/images/2015-11-17-data-analytics-classification.png
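As a rough illustration of the evaluation criteria mentioned above (false positives, false negatives, precision, and recall), here is a minimal sketch. It is not course material: the class demonstrations use R, while this sketch uses Python, and the function name `classification_metrics` and the toy labels are hypothetical, chosen only for this example. Labels are assumed to be coded 1 for the positive class (e.g. spam) and 0 for the negative class.

```python
def classification_metrics(y_true, y_pred):
    """Count confusion-matrix cells for binary labels (1 = positive, 0 = negative)
    and derive precision, recall, and the false positive / false negative rates."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    def ratio(num, den):
        # Return None when the denominator is zero (metric undefined).
        return num / den if den else None

    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "precision": ratio(tp, tp + fp),            # of predicted positives, share correct
        "recall": ratio(tp, tp + fn),               # of actual positives, share found
        "false_positive_rate": ratio(fp, fp + tn),  # of actual negatives, share flagged
        "false_negative_rate": ratio(fn, fn + tp),  # of actual positives, share missed
    }


# Hypothetical toy labels: 1 = spam, 0 = not spam.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With these toy labels there are 3 true positives, 1 false positive, 1 false negative, and 3 true negatives, so precision and recall are both 0.75.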
Follow us on Facebook (www.facebook.com/Statistical.collaboration) or Twitter (www.twitter.com/LISA_VT) to be the first to know about LISA events!