BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Stats Camp Statistics Course - ECPv6.15.20//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-ORIGINAL-URL:https://www.statscamp.org
X-WR-CALDESC:Events for Stats Camp Statistics Course
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:20220313T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:20221106T020000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:20230312T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:20231105T020000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:20240310T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:20241103T020000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/Denver:20230612T090000
DTEND;TZID=America/Denver:20230616T170000
DTSTAMP:20260409T211136Z
CREATED:20220701T090804Z
LAST-MODIFIED:20240222T203237Z
UID:2690-1686560400-1686934800@www.statscamp.org
SUMMARY:Intro to Data Mining and Machine Learning
DESCRIPTION:IN PERSON – 5-day Statistics Short Course\nData Mining and Machine Learning Seminar Overview:\nAn intermediate 5-day course introducing several popular machine learning approaches such as regression-based methods (ridge and lasso regularized regression\, regression splines)\, tree methods (random forests\, boosted trees)\, support vector machines\, and Interpretable Machine Learning (IML)\, as well as their application to empirical data. The course combines lectures and hands-on practice using R.\nSeminar Topics:\n\nReview of linear regression and the least squares criterion\nRegularization methods (ridge regression\, lasso\, elastic net)\nRegression splines\nPrediction error and k-fold cross validation\nTree methods to predict categorical or continuous outcomes (CART\, random forest\, boosting)\nInterpretable Machine Learning (IML)\nSupport vector machines for classification\n\nSeminar Description:\nMachine learning refers to leveraging data to build statistical models or algorithms. The objective is usually to gain knowledge about the structure in the data in order to make predictions or decisions.\nThis short course is based on:\n\nAn Introduction to Statistical Learning (James\, Witten\, Hastie\, Tibshirani)\nHands-On Machine Learning with R (Boehmke & Greenwell)\nInterpretable Machine Learning (Molnar)\n\nThe course starts by briefly outlining the key differences and similarities between standard parametric modeling (e.g.\, linear regression\, structural equation modeling) and machine learning (also known as statistical learning or data mining). The course provides basic insights into a number of popular methods such as regression methods (ridge regression and the lasso\, regression splines)\, tree methods (CART\, random forests\, boosting)\, interpretable machine learning (IML)\, and support vector machines. 
The emphasis is on a conceptual understanding of these methods and their appropriate application to empirical data. Importantly\, these methods are useful not only for large data collections but also\, more generally\, for exploratory analyses when the substantive theory needed to design and fit parametric models (e.g.\, SEM) is lacking. Machine learning is used in a wide variety of fields\, including but not limited to public health\, education\, biology\, and the social sciences.\nParticipants are invited to discuss potential machine learning applications to their data during individual consultations with the instructor scheduled at the end of days 2-5.\nParticipants will receive an electronic copy of all course materials\, including lecture slides\, practice datasets\, software scripts (R)\, relevant supporting documentation\, and recommended readings. Participants will also have access to a video recording of the course.\n\nInstructor: Gitta Lubke\, Ph.D.\n\nGitta Lubke is a Professor Emerita in the Department of Psychology/Quantitative Area at the University of Notre Dame. Her research interests include machine learning and general latent variable modeling\, with empirical applications mainly in the field of psychiatric disorders and behavioral genetics. Other areas of expertise include mixture models\, twin models\, multi-group factor analysis and measurement invariance\, longitudinal analyses\, and the analysis of categorical data.\nAPA Continuing Education Credits:\nThis course offers 29 hours of Continuing Education Credits. Stats Camp Foundation is approved by the American Psychological Association to sponsor continuing education for psychologists. Stats Camp Foundation maintains responsibility for this program and its content.\nSeminar Includes:\nMaterials\, downloads\, and recorded course video viewable for up to one year. 
\n\nLearning Objectives:\nAfter engaging in course lectures and discussions as well as completing the hands-on practice activities with real data\, participants will be able to:\n\nUnderstand some of the key differences and similarities between parametric modeling and machine learning methods.\nExpand their basic knowledge of several popular machine learning methods and apply these methods to empirical data.\nImplement ridge regression and the lasso.\nAssess and interpret the results of empirical analyses through k-fold cross validation and computation of prediction errors.\nImplement and evaluate regression splines.\nImplement and evaluate decision trees to categorize data.\nUtilize CART and bagging techniques.\nImplement random forests to evaluate data.\nImplement and evaluate boosted trees.\nUnderstand several basic interpretable machine learning methods.\nUnderstand and utilize support vector machines.\nUtilize R packages for machine learning.\nUnderstand and evaluate scientific papers covering basic machine learning applications to empirical data.\n\nSeminar Prerequisites:\nRequired:\n\nAdvanced proficiency in linear regression\, including the estimation of regression coefficients using least squares\nIntermediate proficiency with R\nIntermediate knowledge of exploratory data analysis\nBasic familiarity with iterative optimization (e.g.\, the Newton-Raphson algorithm for finding a maximum)\n\nNot required but advantageous:\n\nExperience in calculus (e.g.\, a graduate-level course)\nUnderstanding of the relation between multiple testing and Type I error\n\nNo level of proficiency beyond basic awareness is assumed for skills related to:\n\nMachine learning methods\nMore advanced mathematical or statistical topics such as constrained estimation using Lagrange multipliers\n\nSoftware and Computer Support:\nParticipants need to bring a laptop computer with Wi-Fi capabilities. 
\nAll statistical software used at Stats Camp will be available\, free to participants\, on our SMORS (statistical modeling on remote servers) system for the duration of camp.\nAll instruction for this course will be based on the freely available software program R. Please make sure to have a recent version installed.\nSeminar Audience:\nThe ideal audience for this course in data mining and machine learning typically includes:\n\nStudents pursuing a degree in computer science\, engineering\, statistics\, or related fields who have a strong background in mathematics and programming.\nResearchers and professionals in the fields of data science\, data analysis\, artificial intelligence\, and machine learning who want to learn new techniques and keep up with the latest developments in the field.\nData analysts and data engineers who are interested in learning how to extract insights from large datasets using machine learning algorithms.\nBusiness professionals who are interested in understanding how data mining and machine learning can be applied to solve real-world business problems.\nAnyone who wants to gain a deeper understanding of the techniques and algorithms used in data mining and machine learning\, and their applications in various fields.\n\nThe audience for a course in data mining and machine learning can be quite diverse\, but it typically consists of individuals with a strong background in quantitative analysis and a desire to apply machine learning techniques to real-world problems. 
\nCourse Learning Goals:\nAfter engaging in course lectures and discussions as well as completing the hands-on practice activities with empirical data\, participants will be able to:\n\nUnderstand some of the key differences and similarities between parametric modeling and data mining methods\nExpand their basic knowledge of several popular data mining methods and apply these methods to empirical data\nAssess and interpret the results of empirical analyses through k-fold cross validation and computation of prediction errors\nUtilize R packages for data mining\nUnderstand and evaluate scientific papers covering data mining applications to empirical data\n\nSeminar Files:\nSeminar files will be provided by the instructor on the first day of the seminar. You do not need to download anything prior to the event date. All materials will be provided during or after the class.\n\nSyllabus:\n\nMonday\, June 12\, 2023\n9:00-9:30 – Welcome and introductions\n9:30-10:45 – Simple and Multiple Linear Regression\n10:45-11:00 – Rest Break\n11:00-12:30 – Ridge Regression and Lasso\n12:30-1:30 – Rest Break\n1:30-3:00 – Prediction Error and Cross Validation\n3:00-3:15 – Rest Break\n3:15-5:00 – R lab: Ridge Regression and Lasso\n\nTuesday\, June 13\, 2023\n9:00-10:45 – Regression Splines\n10:45-11:00 – Rest Break\n11:00-12:30 – R lab: Regression Splines\n12:30-1:30 – Rest Break\n1:30-3:00 – Introduction to Tree Methods\n3:00-3:15 – Rest Break\n3:15-5:00 – Individual consultation with instructor\n\nWednesday\, June 14\, 2023\n9:00-10:45 – CART\, bagging\, and Random Forests\n10:45-11:00 – Rest Break\n11:00-12:30 – R lab: Random Forests\n12:30-1:30 – Rest Break\n1:30-3:00 – Boosted Trees\n3:00-3:15 – Rest Break\n3:15-5:00 – Individual consultation with instructor\n\nThursday\, June 15\, 2023\n9:00-10:45 – R lab: Boosted Trees\n10:45-11:00 – Rest Break\n11:00-12:30 – Interpretable Machine Learning (IML)\n12:30-1:30 – Rest Break\n1:30-3:00 – Deductive Data Mining\n3:00-3:15 – Rest Break\n3:15-5:00 – Individual consultation with instructor\n\nFriday\, June 16\, 2023\n9:00-10:45 – Support Vector Machines\n10:45-11:00 – Rest Break\n11:00-12:30 – R lab: Support Vector Machines\n12:30-1:30 – Rest Break\n1:30-3:15 – Individual consultation with instructor
URL:https://www.statscamp.org/courses/intro-to-data-mining-and-machine-learning/
LOCATION:Embassy Suites – Albuquerque. If you are unable to attend in person\, you can participate asynchronously by viewing the recorded course videos for up to one year.
CATEGORIES:Intro to Data Mining and Machine Learning,Summer Camp
ATTACH;FMTTYPE=image/jpeg:https://www.statscamp.org/wp-content/uploads/2022/07/data-mining-and-machine-learning-statistics-course.jpg
END:VEVENT
END:VCALENDAR