Statistical Methods Seminar Description
Overview: This 5-day summer camp will introduce participants to the R software platform for data analysis. R is a freely available, open-source software platform that is growing in both popularity and capability.
Participants in this camp will learn the principles of R-based data analysis and how to use several widely used R packages. Integrated practice problems will reinforce the concepts as attendees follow along with the lectures on their laptops.
Topics Covered
- Introduction to the R ecosystem and the “open-source” philosophy
- Running R and installing third-party packages from CRAN
- Introduction to R’s user interface and syntactic structure
- Data structures and data management in R
- Simple programming in R
- Basic statistical analyses in R: descriptive statistics, regression, ANOVA, t-tests
- Missing data analysis in R
- Using quark for treating missing data
- Using lavaan for structural equation modeling
Instructor: Kyle M. Lang
Kyle is a senior research associate at the Texas Tech Institute for Measurement, Methodology, Analysis, and Policy. He earned his Ph.D. in quantitative psychology from the University of Kansas in 2015. Kyle’s research focuses on missing data analysis and Bayesian statistics, with a particular emphasis on developing and evaluating multiple imputation techniques for difficult missing data problems (e.g., imputation in large datasets, high-dimensional imputation models). As both a statistical consultant and a collaborating researcher, he also has extensive experience applying cutting-edge statistical methods, such as those for testing mediation and moderation, to substantive research questions in fields such as psychology, education, social work, and political science. One common theme across all of Kyle’s research is high-performance statistical computing. He is the development team lead for the quark project, an R package that creates multiple imputations via principal components regression, and he regularly teaches classes and workshops on statistical programming with R. Kyle has been involved in Stats Camp every year since 2009. He has provided general statistical consulting for all of the courses offered; taught classes on Mediation and Moderation, Missing Data Analysis, and Statistical Programming with R; and given numerous guest lectures on topics such as mixture modeling, regularized regression modeling, and Bayesian structural equation modeling.
Software and Computer Support
A laptop with the latest version of R installed is highly recommended. R can be downloaded for free from the R-project website: https://www.r-project.org/. R’s built-in text editor is very primitive, so we also highly recommend a stand-alone text editor for writing R code, together with a plug-in that allows R code to be executed directly from the editor. The following three options work very well with R:
- RStudio (https://www.rstudio.com/)
- Emacs with ESS (http://vgoulet.act.ulaval.ca/en/emacs/)
- Notepad++ (https://notepad-plus-plus.org/) with NppToR (http://sourceforge.net/projects/npptor/)
The camp is ideal for life-science investigators, biostatisticians, program evaluators, R&D researchers, and anyone else who is interested in data analysis with R.
Recommended Reading
Matloff, N. (2011). The art of R programming: A tour of statistical software design. San Francisco: No Starch Press.
Teetor, P. (2011). R cookbook. Sebastopol, CA: O’Reilly Media, Inc.
Venables, W. N., & Ripley, B. D. (2013). Modern applied statistics with S-PLUS. New York: Springer Science & Business Media.
Verzani, J. (2014). Using R for introductory statistics. Boca Raton, FL: CRC Press.