The Canberra Branch of the Statistical Society of Australia is pleased to offer the following 2-day short course:
Model Selection with R
Presented by Associate Professor Samuel Müller and Dr Garth Tarr
About the short course:
Statistical model building is a fundamental part of many statistical analyses. The aim is to use the data and, if available, information about its generating process, to construct statistical models that parsimoniously describe relevant and important features of the data. Arguably, the most widely used method for selecting variables is to minimise either Akaike’s information criterion (AIC) or the Bayesian information criterion (BIC), or one of their variants. However, AIC and BIC are not the only criteria of interest for the optimal selection of models. A major advance in the field is the Lasso and related recent methods that use regularisation. The Lasso and its extensions can handle data that involve more predictor variables than samples, the ‘large p, small n’ problem. The stability of the selected components is paramount for reliable final predictive models. Such stability can be achieved through stability paths, a product of repeated model selection on bootstrapped or cross-validated samples.
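As a rough illustration of criterion-based selection (not course material), the sketch below scores every subset of four simulated predictors by AIC and BIC for a Gaussian linear model. It uses n log(RSS/n) plus a penalty of 2k (AIC) or k log(n) (BIC), where k counts the fitted coefficients and additive constants are dropped. The course itself uses R; Python with only the standard library is used here, and the simulated data, the variable names x1 to x4, and the helper functions solve, rss, criteria and exhaustive_search are all assumptions made for this example.

```python
import itertools
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def rss(columns, y):
    """Residual sum of squares from regressing y on an intercept plus the given columns."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    # Normal equations (X'X) beta = X'y; adequate for a small, well-conditioned example.
    xtx = [[sum(row[j] * row[k] for row in design) for k in range(p)] for j in range(p)]
    xty = [sum(row[j] * yi for row, yi in zip(design, y)) for j in range(p)]
    beta = solve(xtx, xty)
    return sum(
        (yi - sum(b * v for b, v in zip(beta, row))) ** 2
        for row, yi in zip(design, y)
    )


def criteria(columns, y):
    """Return (AIC, BIC) for a Gaussian linear model, dropping additive constants."""
    n = len(y)
    k = len(columns) + 1  # slopes plus intercept
    fit = n * math.log(rss(columns, y) / n)
    return fit + 2 * k, fit + k * math.log(n)


def exhaustive_search(x, y):
    """Score all 2^p subsets of the predictors in the dict x."""
    names = sorted(x)
    results = []
    for size in range(len(names) + 1):
        for subset in itertools.combinations(names, size):
            aic, bic = criteria([x[name] for name in subset], y)
            results.append((subset, aic, bic))
    return results


# Simulated data (an assumption for illustration): only x1 and x2 affect y.
random.seed(1)
n = 200
x = {name: [random.gauss(0, 1) for _ in range(n)] for name in ("x1", "x2", "x3", "x4")}
y = [1.0 + 2.0 * a - 1.5 * b + random.gauss(0, 1) for a, b in zip(x["x1"], x["x2"])]

results = exhaustive_search(x, y)
best_aic = min(results, key=lambda r: r[1])
best_bic = min(results, key=lambda r: r[2])
print("Best subset by AIC:", best_aic[0])
print("Best subset by BIC:", best_bic[0])
```

Because log(n) exceeds 2 once n is 8 or more, BIC penalises each extra coefficient more heavily than AIC, so the BIC-selected model is never larger than the AIC-selected one.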
About the presenters:
Associate Professor Samuel Müller was born and educated in Switzerland and received his PhD in Mathematics from the University of Bern in November 2002. He joined the University of Sydney in 2008 and is currently an Associate Professor in the School of Mathematics and Statistics, Associate Dean in the Faculty of Science and also serves as President for the Australasian Region of The International Biometric Society (2017-18). He has supervised several PhD students to completion and is the author of more than 50 publications. Samuel’s main research interests are in statistical model selection, applied statistics and robust methods. His current research is driven by his involvement as chief investigator in two ongoing Australian Research Council Discovery Project grants. The first grant focuses on developing new statistical methods and concepts for the modelling of structural and molecular data to better predict complex disease outcomes. The second grant aims to develop advanced statistical modelling techniques for the analysis of correlated data, which are prevalent in population surveys and in longitudinal and spatial studies.
Dr Garth Tarr received his PhD in Mathematical Statistics from the University of Sydney and has held postdoctoral positions at the University of Sydney and the Australian National University. He is currently a lecturer at the University of Newcastle. His diverse interests include robust statistics, data visualisation, model selection, econometric modelling, educational research, meat science and biostatistics. He has received a number of citations for his teaching, including a Vice-Chancellor’s Award for Teaching Excellence in 2016. Garth is an expert R user and has created several R packages, including the mplot package, and is a regular contributor to the Biometric Bulletin’s Software Corner.
This course has four sessions over two days, including both lecture-style presentations and hands-on exercises:
Session 1: Linear regression using least squares. Introduction to information criteria, with a focus on adjusted R-squared, AIC, BIC and Mallows’ Cp. Tests and stepwise procedures in R (also introducing lm(), summary(), plot() and update()).
Session 2: Exhaustive searching with the leaps and bestglm packages. Demonstration that, when feasible, exhaustive searches outperform stepwise procedures, which are susceptible to local optima.
Session 3: Ridge and Lasso regression as fast alternatives when an exhaustive search is not possible. Introduction to cross-validation and the glmnet package.
Session 4: Stability in model selection, bootstrapping regression models, and graphical tools using mplot.
|Time|Day 1|Day 2|
|9:15 am|Lecture 1|Lecture 3|
|10:30 am|Morning tea – provided|Morning tea – provided|
|11:00 am|Lab 1|Lab 3|
|1:15 pm|Lecture 2|Lecture 4|
|2:30 pm|Afternoon tea – provided|Afternoon tea – provided|
|3:00 pm|Lab 2|Lab 4|
It will be assumed that participants are familiar with R and standard regression modelling techniques.
This short course focuses on model selection techniques for linear and generalised linear regression in two scenarios: when an exhaustive search of the model space is possible, and when the dimension is large and either stepwise algorithms or regularisation techniques have to be employed to identify good models. We incorporate recent research on graphical tools for model choice and on how to tune regularisation procedures, such as the Lasso, through resampling or model selection criteria.
The practical implementation of the discussed methods is an essential component of this course. Interactive labs will give participants the opportunity to apply what they have learnt. We will use the cross-platform, open-source software R, in particular the leaps, bestglm, glmnet and mplot packages.
Early Bird: Payment before or on 12 March 2017
SSA Members*: $500.00
SSA Non Members: $700.00
Full-time Students (Members/Non Members): $300.00 – Please email proof of full-time student status to [email protected]
Payment from 13 March 2017 – 4 April 2017
SSA Members*: $600.00
SSA Non Members: $750.00
Full-time Students (Members/Non Members): $350.00 – Please email proof of full-time student status to [email protected]
Registrations close strictly on 4 April 2017.
*Membership with the Society is available for $235.00 (full member) or $20.00 (student member) for 12 months. To register as a member and benefit from the membership discount for this workshop as well as upcoming events, please click here.
The registration fees include workshop attendance and morning and afternoon tea. Lunch can be brought or purchased at one of the eateries on the ANU campus.
Occasionally workshops have to be cancelled due to a lack of subscription. Early registration by as many participants as possible ensures that this will not happen. Please contact the SSA Office before making any travel arrangements to confirm that the workshop will go ahead, because the Society will not be held responsible for any travel or accommodation expenses incurred due to a workshop cancellation.
Cancellations received prior to Monday, 3 April 2017, will be refunded, minus a $20 administration fee. From 3 April 2017 no part of the registration fee will be refunded. However, registrations are transferable within the same organisation. Please provide a copy of the registration confirmation sent to the original participant to the registration desk on arrival at the workshop.
|When:|10/04/2017 - 11/04/2017|
|Time:|9:15 am - 5:00 pm|
|Location:|ANU College of Business and Economics, Building 26C, Lecture Theatre 2, Ground Floor,