SSA Vic Meetings
Meetings will be held in the Russell Love Theatre, Richard Berry Building, The University of Melbourne unless otherwise indicated.
Free car parking is not available on campus. There is easy access by public transport. Information about car parking at the University can be found at: http://www.pb.unimelb.edu.au/parking/whereyoushouldpark/visitorparking.html
A map of the Parkville campus can be found at: http://www.pb.unimelb.edu.au/CampusMaps/Parkville.pdf
VIC Branch AGM 2018
Date & Time: March 27.
Branch AGM to take place in the Peter Hall Building Tea Room at the University of Melbourne at 5pm; followed by the seminar at 6pm (seminar location to be confirmed).
Speaker: Howard Bondell, Professor of Statistics and Data Science, University of Melbourne
Data and Decision-Making: Informative Missingness, Recommender Systems, and Personalised Medicine
In this talk, we will discuss two topics associated with the use of data for decision-making.
The first part of the talk investigates informative missingness in the framework of recommender systems. In this setting, we envision a potential rating for every object-user pair. The goal of a recommender system is to predict the unobserved ratings and then recommend an object that the user is likely to rate highly. A typically overlooked piece is that the combinations are not missing at random. For example, in movie ratings, a relationship between the user ratings and their viewing history is expected, as human nature dictates the user would seek out movies that they anticipate enjoying. We model this informative missingness, and place the recommender system in a shared-variable regression framework which can aid in prediction quality.
The second part of the talk deals with personalised medicine, which relies on the ability to prescribe patient-specific treatments. In this context, it is crucial to identify the variables that impact the optimal treatment decision. Typical variable selection techniques target variables that are important for prediction, which are not necessarily those that are important for treatment assignment. We propose a Gaussian process model in a backward elimination framework to identify the important variables in treatment decision making.
The Maurice H. Belz Lecture is an annual lecture established by the Statistical Society to honour the work of Professor Belz in establishing and advancing the science of statistics in Australia. Maurice Belz was the Foundation Professor of Statistics at the University of Melbourne (1955 to 1963).
The 2017 Belz lecture will be delivered on Tuesday October 31 by Distinguished Professor Noel Cressie, Director of the Centre for Environmental Informatics NIASRA. Noel will be sharing how he uses statistics to analyse remotely-sensed carbon dioxide. Noel’s research interests include theory and applications of spatial and spatio-temporal stochastic models; Bayes and empirical-Bayes methods for hierarchical statistical models; environmental informatics.
Noel’s talk is titled A Bird’s-Eye View of Statistics for Remote Sensing Data.
Tuesday 26th July 2016
Species distribution models (and why I like working with statisticians)
with Associate Professor Jane Elith
Jane Elith is an Associate Professor in quantitative ecology, in the School of BioSciences at the University of Melbourne. Last year she won the Prime Minister’s early career researcher prize for Life Scientist of the Year, and the Academy of Science Fenner Medal. Jane specialises in species distribution models, statistical models that describe relationships between the occurrence or abundance of species and the environment. These models are used by both academics and practitioners, and Jane has been particularly interested in the methods and their appropriateness for the data and questions to which they are often applied. She has authored highly cited guides to methods, helped to develop and extend methods appropriate for typical data types, and tested methods and explored their uncertainties. Her research has applied significance because species distribution modelling is key in many aspects of species management, including understanding current distributions of threatened species, predicting how distributions might change in the future, supporting threat management, and controlling invasive species.
One of the reasons I was asked to give this talk is that I’ve mentioned publicly, several times, that I like working with statisticians and computer scientists. My background is in biological sciences, but I’ve ended up working in quite quantitative areas. My speciality, species distribution models, uses statistical and machine learning methods to model the response of species to their environments. These are static models, usually some form of regression. Species distribution models are very popular, and are one of the leading research fronts internationally in ecology and environmental sciences. I will describe the models, outline why they are popular, and give some examples of the sorts of complexities in the data and applications that mean there is much room for sound statistical input.
25th June, Dr Bhavani Raskutti, Pacific Brands. Analytical Model Development and Implementation – Experience from the field. [abstract]
28th May, Ben Rubinstein, IBM Research Melbourne. Information integration in industry: A case study. [abstract]
30th April, Blair Trewin, National Climate Centre, Bureau of Meteorology. Detecting climate change using real observations, real instruments and real people – how estimating trends is sometimes the easy part. [abstract]
27th November, Stephen Leslie, Murdoch Childrens Research Institute.
The People of the British Isles: A Statistical Analysis of the Genetics of the UK [abstract]
26th June, Kris Jamsen, School of Population Health, University of Melbourne
Determination and evaluation of optimal designs for population pharmacokinetic and pharmacokinetic-pharmacodynamic studies of anti-malarial drugs [abstract] [slides]
20th March, Jim Hanley, McGill University, Montreal
Comparing longevity of a group (e.g. Titanic survivors, US presidents) with that of a population: methods for/insights from finely stratified data [abstract] [slides]
27 September, Michael Smith, Melbourne Business School, The University of Melbourne
Bicycle commuting in Melbourne during the 2000s energy crisis: A semiparametric analysis of intraday volumes [abstract]
30 August, Luke Prendergast, La Trobe University
Simple dimension reduction methods with some useful insights via the influence function [abstract]
2 August, Valerie Isham, University College London
An introduction to the world of stochastic models: with examples from point processes, rainfall and epidemics [abstract]
28 June, Young Statisticians Present [abstracts]
31 May, Michael McCarthy, University of Melbourne
Applications and misapplications of statistics in ecology [abstract]
3 May, Steve Vander Hoorn, Statistical Consulting Centre, University of Melbourne
Comparative Risk Assessment: what is it and how can it be useful [abstract]
22 March, Nicole Watson
Re-engaging with survey non-respondents: the experience of three household panel surveys [abstract]
23 November, 2010
Small-domain estimation from statistics with measurement error
Professor Alan Zaslavsky
Commonly the large administrative, census, or survey datasets required to make estimates for small domains (areas, institutions, etc.) measure some important variables with nonsampling error. Supplementary information from a smaller survey may make it possible to estimate models for this measurement error and correct or calibrate small-domain estimates. This general structure is illustrated with three examples, each requiring a different model structure appropriate to the form of the available data and the error process: (1) imputation of corrected adjuvant therapy indicators in a cancer registry subject to underreporting of treatment; (2) estimation of school-level prevalence of serious emotional distress using a short screening scale; (3) combining information from a census, a post-enumeration survey, and follow-up evaluation studies to improve estimates of population. In each application a Bayesian hierarchical model is used to synthesize information from multiple sources.
2010 Maurice Belz Lecture – 14th October
Data based public debate: Why aren’t we at the centre of it?
Professor Chris Lloyd
There are many public policy issues that depend critically not just on data but on data that is both noisy and possibly biased. Does the MySchool website rank schools fairly? Did the economic stimulus save us from the GFC? How much did the government’s backdown on the mining tax cost them? What will the Australian population look like in 2050 under various scenarios and how do we compare with the rest of the world? How many people died as a result of the invasion of Iraq? Did the 1997 gun buyback reduce firearms related deaths? Does prison reduce crime? What is mainly responsible for the falling rate of road accidents? And let’s not forget that endangered polar bear in the room – climate change.
These are all hot public policy issues that depend on data that is noisy, uncontrolled, selectively available and where causation is difficult to establish. This is supposed to be our bread and butter, yet gallery journalists do not pick up the phone and ask us for comment. There is no statistical Ross Garnaut or data analytic Tim Flannery. Yet, as a profession, we could make large contributions to debate on the above topics. I will talk briefly about several of the issues listed above, and offer some examples of how well crafted statistical summaries and graphics could inform public debate.
28th September, 2010
Rob Hyndman – Demographic forecasting using functional data analysis
Functional time series are curves that are observed sequentially in time. In demography, such data arise as the curves formed by annual death rates as a function of age or annual fertility rates as a function of age. I will discuss methods for describing, modelling and forecasting such functional time series data. Challenges include:
- developing useful graphical tools (I will illustrate a functional version of the boxplot);
- dealing with outliers (e.g., death rates have outliers in years of wars or epidemics);
- cohort effects (how can we identify and allow for these in the forecasts);
- synergy between groups (e.g., we expect male and female mortality rates to evolve in a similar way in the future);
- deriving prediction intervals for forecasts;
- how to combine the mortality and fertility forecasts to obtain forecasts of the total population.
I will illustrate the ideas using data from Australia and France.
24th August 2010
Young Statisticians Present: Pete Hickey, Minh Huynh, Martin Shield
Pete Hickey – X chromosome testing in genome wide association studies
Genome wide association studies (GWAS) have revealed fascinating insights into the genetics of complex diseases. These studies provide many statistical challenges but one problem that has received surprisingly little attention is the testing of associations between phenotype and genotype on the X chromosome. In this talk I will discuss the particular challenges of the X chromosome and present some results of a simulation study designed to compare several proposed methods for the analysis of X chromosome data. Pete completed a BSc (Honours) in statistics at the University of Melbourne in 2009. During that time he worked and studied in the Bioinformatics Division at the Walter and Eliza Hall Institute of Medical Research, where he is currently a research assistant in the statistical genetics group. Pete is interested in the development of statistical methods in molecular biology and genetics, with a particular focus on discovering the genetic causes of human disease.
Minh Huynh – Skills Acquisition in Badminton: A Visual Approach to Training
Currently, there are a number of training programs that attempt to improve decision making and awareness in badminton. However, these programs are extremely limited, and do not provide athletes with the necessary improvements needed to optimise their in-game performance. In developing and improving decision making, the ideal strategy would be to expose the athlete to all possible situations and scenarios that they may face. This allows them to retain certain responses in their subconsciousness; leading their bodies to instantaneously select the appropriate action to take in similar situations. This paper provides an overview of the electronic training program currently being developed to improve reaction time and awareness in badminton players. Particular emphasis is placed on a player’s ability to estimate and predict shuttle location. Using this program, we will be able to identify the player’s awareness and attempt to improve their in-game performance and decision making. These findings will not be limited to badminton, and applicability to other sports will be discussed. Minh Huynh is studying a Masters in Statistics and Operations Research at RMIT.
Martin Shield – To check or not to check?
Consider an item which arrives at a check point, where it may be assessed against some benchmark before it can move on. We define an ‘inspection decision’ to be the choice between inspecting the item and allowing it to pass without inspection. Inspection decisions arise in many contexts. Examples include quality control in manufacturing, checking bags as customers leave department stores, and customs or quarantine inspections. I’m hoping that this talk will be interactive. I’ll start with a brief overview of traditional approaches to inspection decisions, then guide a discussion about these approaches and alternative ideas.
After completing Honours in Statistics at the University of Melbourne, Martin went to work as a statistician for ANZ. He has since returned to the University as a PhD candidate, working with Andrew Robinson, Owen Jones and Peter Hall on an applied statistics project.
27th July 2010
Data prediction competitions: far more than just a bit of fun
Founder of Kaggle.com
Kaggle is a global platform for data prediction competitions allowing researchers and companies to post their problem and have it scrutinised by the world’s statisticians and computer scientists.
By exposing a problem to a wide range of analysts and techniques, data prediction competitions turn out to be a great way to get the most out of a dataset, given its inherent noise and richness. For example, Kaggle has been running a bioinformatics competition requiring participants to pick markers in HIV’s genetic sequence that predict a change in viral load (a measure of the severity of infection). Within a week and a half, the best submission had already outdone the best methods in the scientific literature.
Before founding Kaggle in 2009, Anthony worked in the macroeconomic modelling areas of the Reserve Bank of Australia and the Australian Treasury. During an internship at The Economist magazine in 2008, Anthony wrote an article about the use of data-driven decision making by companies. He became so fascinated by the power of data that he left his day-job as an econometrician to found Kaggle.
29th June, 2010
Development of the Index of Community Socio-Educational Advantage (ICSEA) and the generation of Statistically Similar School Group data
Dr Geoff Barnes
School and System Measurement and Evaluation Officer
Educational Measurement and School Accountability Directorate
NSW Department of Education and Training
Dr Barnes is presently the School and System Measurement and Evaluation Officer with the Educational Measurement and School Accountability Directorate (EMSAD) of the NSW Department of Education and Training (DET). His primary responsibility is the analysis of data from state-wide testing programs; NAPLAN, the NSW School Certificate and the NSW Higher School Certificate, and the use of this data to support school effectiveness initiatives in the NSW DET.
Prior to taking up his present position Dr Barnes was the senior research officer with the NSW DET Strategic Research Directorate and worked on a range of educational research and evaluation projects in conjunction with NSW university personnel. In recent years he has done extensive work exploring the relationship between community socio-educational factors and school outcomes. This work led to his engagement by ACARA to undertake the development of the ICSEA. Dr Barnes’ teaching background is in secondary science. His doctoral thesis examined participation in senior secondary courses.
The presentation will discuss the development of the Index of Community Socio-Educational Advantage (ICSEA) and its use in the generation of data for evaluating school outcomes. The ICSEA was developed by the Australian Curriculum, Assessment and Reporting Authority (ACARA) for comparing outcomes of schools serving communities that are socioeconomically similar. It is constructed primarily from ABS census data using the technique of regression analysis. The ICSEA explains approximately 68% of the variation in school outcomes nationally.
25th May, 2010
Trials on the edge – Can we really do randomized controlled trials of treatments in mental health?
Professor Andrew Mackinnon
Head, Statistics Unit, Orygen Youth Health Research Centre
Psychiatry and statistics developed rapidly during the 19th century but did so in relative isolation from one another. This may be a partial reflection of psychiatry’s status on the periphery of medicine itself.
The idea that the efficacy of treatments in psychiatry and mental health – other than medications – could be subjected to rigorous evaluation is relatively novel and is still sometimes contentious. Indeed, the implementation of such studies encounters nearly every possible difficulty that randomized controlled trials can face – uncertainty of diagnoses, issues of treatment delivery, blindness and integrity, difficulty in choosing and assessing outcomes, choice and recruitment of participants and high participant dropout rates.
This talk will introduce trials of non-drug treatments in mental health, present the challenges involved in undertaking such studies, describe achievements in this field and identify some of its weaknesses and failures. I will indulge myself in suggesting some solutions to the problems of conducting these trials and what might be learnt from them for conducting randomized controlled trials in general.
27th April, 2010
Statistics and the search for genetic modifiers of iron levels in hereditary haemochromatosis
Associate Professor Lyle Gurrin
Centre for MEGA Epidemiology, Melbourne School of Population Health, The University of Melbourne
Hereditary haemochromatosis is an inherited disease of iron overload which can lead to arthritis, fatigue, diabetes and liver cirrhosis. More than 80% of patients with symptomatic iron overload-related disease are homozygous for the C282Y mutation in the HFE gene, but not all C282Y homozygotes have raised iron levels or develop symptoms of disease.
There is emerging evidence from family data and recent genomewide association studies that there are genetic modifiers of iron levels for HFE gene mutation carriers. To address the paucity of data on these associations derived from population studies we invited a random sample stratified by HFE genotype (C282Y and H63D) of 1438 from 31,192 participants of northern European descent in the Melbourne Collaborative Cohort Study to participate in our study of health and iron (“HealthIron”). Blood samples were processed for iron levels and genotyped for 476 genetic variants in 44 genes involved in iron metabolism. Analysis revealed a genetic variant in the CYBRD1 gene that was a novel modifier of iron levels specific to HFE C282Y homozygotes (so effectively a gene-gene interaction), associated with a three-fold decrease in iron levels for men, a five-fold decrease for women and accounting for more than 10% of the population variation in iron levels.
In this talk I will present a quick introduction to genetic association analysis, describe the statistical analysis strategy we used to identify the novel variant in CYBRD1, show data from the laboratory experiment that established the functional significance of the variant and outline our plan to replicate the results in a large cohort in the United States.
30th March, 2010
The Annual General Meeting of the Victorian Branch of the Statistical Society of Australia
Seminar – Are trackside betting markets efficient?
Dr. Owen Jones, Department of Mathematics and Statistics, The University of Melbourne
We explore the extent to which the decisions of participants in a speculative market effectively account for information contained in prices and price movements. The horserace betting market is chosen as an ideal environment to explore these issues. A conditional logit model is constructed to determine winning probabilities based on bookmakers’ closing prices and the time indexed movement of prices to the market close. Predictors are extracted from price (odds) curves using orthogonal polynomials. The results indicate that closing prices do not fully incorporate market price information, particularly information which is less readily discernible by market participants.
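As a rough illustration of the conditional logit idea mentioned in the abstract, the sketch below converts one predictor per horse into winning probabilities within a single race. The choice of predictor (the log of the bookmaker-implied probability), the coefficient value, the example odds and the function name are all assumptions made for illustration; they are not the speaker's model or data.

```python
import math

def conditional_logit_probs(scores, beta=1.0):
    """Winning probabilities within one race under a conditional logit.

    scores: one predictor value per horse (hypothetical choice here:
            log of the implied probability 1/odds).
    beta:   illustrative coefficient, not an estimate from any data.
    """
    utilities = [beta * s for s in scores]
    m = max(utilities)  # subtract the max for numerical stability
    weights = [math.exp(u - m) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical decimal closing odds for a three-horse race.
odds = [2.0, 4.0, 8.0]
scores = [math.log(1.0 / o) for o in odds]
probs = conditional_logit_probs(scores)
```

With beta fixed at 1 the probabilities are simply the implied probabilities rescaled to sum to one; a fitted model would estimate beta and could add further predictors, such as summaries of the price curve.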
27 October, 2009. Belz lecture: Professor John Carlin
Filling in the missing values: Multiple imputation and the magic of applied statistics
About John Carlin
After completing an undergraduate major in Mathematics and Statistics at the University of Western Australia, and a PhD in Statistics at Harvard University, John Carlin has had over 20 years of experience working as a biostatistician across a wide range of medical and public health research. He is Director of the Clinical Epidemiology and Biostatistics Unit within the Murdoch Children’s Research Institute and University of Melbourne Department of Paediatrics at the Royal Children’s Hospital, Melbourne, and has a professorial appointment in the Centre for Molecular, Environmental, Genetic & Analytic Epidemiology, School of Population Health, University of Melbourne. He is a founding member of the Steering Committee of the Biostatistics Collaboration of Australia, which is a national consortium that delivers a Masters program in biostatistics. Professor Carlin has a national and international reputation in biostatistics, with around 250 research publications, mainly in clinical and epidemiological journals. He is coauthor of a well-known textbook on Bayesian statistics with Andrew Gelman and others (Bayesian Data Analysis, 2nd edition 2003). Over the last few years he has maintained a program of methodological research on methods for dealing with missing data using multiple imputation, which grew out of collaborative research in which incomplete data are a major practical problem.
The Belz lecture
The method of multiple imputation is a two-stage approach for performing statistical inference in the presence of missing data, but it is widely misinterpreted as a method for “filling in” missing values. I will outline the theory underlying the method, describe current approaches for implementing it, and review recent research on potential biases and limitations of the approach. The talk will have an emphasis on issues arising at the interface between our understanding of the statistical methods and the demands of subject-matter researchers eager for empirical findings.
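To make the two-stage structure concrete, here is a minimal sketch of the second stage: combining the analyses of m imputed datasets using the standard combining rules usually attributed to Rubin. The first stage (generating the imputations and analysing each completed dataset) is assumed to have been done already; the function name and the numbers are hypothetical.

```python
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Combine m completed-data analyses with Rubin's rules.

    estimates: point estimate from each imputed dataset
    variances: squared standard error from each imputed dataset
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 divisor)
    total = within + (1 + 1 / m) * between
    return q_bar, total

# Hypothetical results from m = 3 imputed datasets.
q, t = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what separates this from simply "filling in" values once: it carries the extra uncertainty due to the missing data into the final standard error.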
29th September, 2009
Young Statisticians Present … Mithilesh Dronavalli, David Lazaridis, Davis McCarthy
Methodological issues encountered in integrating datasets — An integration of 4 cardiovascular datasets
Mithilesh is a medical student who detoured to do honours in applied biostatistics (cardiovascular randomised controlled trials), a Masters of Biostatistics, a cardiovascular work placement project, and an (almost completed) MPhil on applied biostatistics in radiation oncology. He is in 5th year medical school and is considering training in psychiatry with a major research interest in statistics applied to psychotherapy.
Integrating datasets involves analysing data coming from more than one database. It may involve estimation of values of variables for individuals for whom those variables were not measured. An example is the PlateletHOPE trial which measures clinical variables from history examination and novel biochemical markers. Changes in biochemical markers were investigated by measuring them before and after treatment with ACE inhibitor therapy; the aim was to elucidate the mechanism of ACE inhibitor treatment in cardiovascular disease. The larger SOLVD trial measured similar variables without biochemical markers and had clinical endpoints. This study developed linear regression models of the biochemical markers from variables common to both PlateletHOPE and SOLVD, and used these to estimate marker values in the SOLVD dataset, which had clinical endpoints. This way the cardiovascular relevance and interaction with ACE inhibitor treatment could be indirectly assessed in a highly cost-effective manner.
Penalized regression techniques for prediction: a case study for predicting tree mortality using remotely sensed vegetation indices
David has a Bachelor of Science with honours in statistics, and is currently doing a PhD supervised by Andrew Robinson at the University of Melbourne and Jan Verbesselt at the CSIRO.
This paper reviews a variety of methods for constructing regression models. We focus on predicting tree mortality from change metrics derived from Moderate-Resolution Imaging Spectroradiometer (MODIS) satellite images. The high dimensionality and collinearity inherent in such data are of particular concern. Standard regression techniques perform poorly for such data, so we examine shrinkage regression techniques such as Ridge Regression, the LASSO and Partial Least Squares, which yield more robust predictions. We also suggest efficient strategies that can be used to select optimal models such as 0.632+ Bootstrap and Generalized Cross Validation (GCV). The techniques are compared using simulations. The techniques are then used to predict insect-induced tree mortality severity for a Pinus radiata plantation in southern New South Wales and their prediction performances are compared. We find that shrinkage regression techniques outperform the standard methods, with ridge regression and the LASSO performing particularly well.
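The effect of shrinkage on collinear predictors can be seen in a very small sketch. The code below solves the ridge regression equations for two centred predictors directly; the function name, the data and the penalty value are hypothetical and are not taken from the study.

```python
def ridge_two_predictors(x1, x2, y, lam):
    """Ridge coefficients for two centred predictors (no intercept).

    Solves (X'X + lam * I) beta = X'y using the 2x2 inverse directly.
    lam = 0 gives ordinary least squares.
    """
    a = sum(u * u for u in x1) + lam
    b = sum(u * v for u, v in zip(x1, x2))
    d = sum(v * v for v in x2) + lam
    r1 = sum(u * w for u, w in zip(x1, y))
    r2 = sum(v * w for v, w in zip(x2, y))
    det = a * d - b * b
    return ((d * r1 - b * r2) / det, (a * r2 - b * r1) / det)

# Hypothetical, nearly collinear predictors (both centred) and a response.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.0, 1.1, 1.9]
y = [-4.0, -2.1, 0.1, 1.9, 4.1]

ols = ridge_two_predictors(x1, x2, y, lam=0.0)
ridge = ridge_two_predictors(x1, x2, y, lam=1.0)
```

Comparing `ols` with `ridge` shows the penalty pulling the coefficient vector towards zero, which is what stabilises predictions when predictors are highly correlated. Choosing the penalty (for example by cross validation) is a separate step not shown here.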
Accounting for biological variation in digital gene expression experiments
Davis is an Honours student in Statistics at the University of Melbourne, supervised by Dr Gordon Smyth at the Walter and Eliza Hall Institute of Medical Research (WEHI). During his undergraduate studies he worked part-time in the Bioinformatics Division at WEHI as part of the “Undergraduate Research Opportunities Program”. The experience was extremely valuable in leading him towards statistics and has led directly to his current Honours project.
The Human Genome Project of the 1990s catalysed the development of high-throughput, low(er)-cost DNA sequencing technologies. These have proved extremely valuable for genomics, their original application, but are also being applied in experiments investigating gene expression. Sequencing methods generate ‘counts’ of the number of times a particular gene is seen in an RNA sample, known as ‘digital gene expression’ (DGE) data. The number of counts for a gene gives an excellent indication of the true expression level of that gene in the biological sample, but assessing which genes are differentially expressed between experimental groups remains a difficult problem. Challenges include the small samples typical of biological experiments and trying to assess differential expression for tens of thousands of genes simultaneously. Poisson models are a natural and popular choice for modelling DGE data, but it has been shown that biological replication of samples introduces greater variability in the data than can be accounted for using the Poisson model. The negative binomial model offers greater flexibility in accounting for overdispersion relative to the Poisson model, and looks to be a promising approach for accounting for biological variation in DGE experiments. I will discuss the negative binomial modelling, estimation and testing methods developed in WEHI’s Bioinformatics Division that will form the bulk of my Honours thesis.
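One way to see the overdispersion described above is to compare the sample variance of replicate counts with their mean. Under a Poisson model the two are equal; under a negative binomial model the variance is mu + phi * mu^2 for a dispersion phi. The sketch below computes a crude method-of-moments estimate of phi for a single gene. It ignores differences in library size, uses made-up counts, and is not the estimation method developed at WEHI.

```python
from statistics import mean, variance

def moment_dispersion(counts):
    """Method-of-moments dispersion phi under Var = mu + phi * mu**2.

    phi = 0 corresponds to the Poisson model; phi > 0 indicates
    overdispersion. Negative estimates are truncated to zero.
    """
    mu = mean(counts)
    s2 = variance(counts)
    return max(0.0, (s2 - mu) / mu ** 2)

replicate_counts = [5, 20, 12, 40, 3]  # hypothetical counts for one gene
phi = moment_dispersion(replicate_counts)
```

With only a handful of replicates per gene such an estimate is very noisy, which is one reason methods that share information across genes are attractive.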
25th August, 2009
Professor Andrew Forbes, Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine
Issues arising when estimating the effects of lifestyle factors on disease outcomes
As an applied biostatistician working in the epidemiology area, the quick two minute corridor question can sometimes turn into a detailed investigation. In this talk I will discuss the follow-up of a query concerning evaluation of approaches to assess the effects of physical activity and body size on disease outcomes. The talk will cover both issues arising and some digressions, and in particular those of defining parameters to be estimated, uses of directed graphs, and a comparison of conventional regression/stratification methods for handling confounding with newer approaches for estimation in longitudinal studies with time varying exposures and covariates.
21st July, 2009
Professor Jane Watson and Associate Professor Rosemary Callingham, University of Tasmania
Helping Teachers and Students Become StatSmart
The talk focuses on the ARC Linkage project, StatSmart, led by the speakers in collaboration with the Australian Bureau of Statistics, Key Curriculum Press (distributors of TinkerPlots and Fathom software), and the Baker Centre for School Mathematics in Adelaide. After a brief introduction to the project and its aims for middle school teachers and their students, Jane will demonstrate the TinkerPlots software, especially its unique features that facilitate data handling and informal inference by middle school students. Rosemary will discuss some of the outcomes from the longitudinal design planned in order to show change in teachers’ knowledge for teaching statistics and in their students’ statistical literacy understanding.
23rd June, 2009
Professor Chris Lloyd, Melbourne Business School, The University of Melbourne
Statistical Blogging: The Fishing in the Bay Story
Prior to the web, professional peers were linked through societies and the conferences and newsletters they sponsored. Interactions were limited to members and specific times. The first professional forums that used the web were professional lists (such as anzstat) and discussion boards (such as radstats) where participation was less restricted and communication more immediate. During the past five years a new platform, the weblog, has emerged. The number of weblogs has doubled every six months, yet only a small minority is oriented towards particular professional communities, let alone statisticians. In this talk I will discuss blogs in general and why I decided to establish one for the Australasian statistical community. Have you ever wondered how to set one up, how much time it takes to run and how I find material to post? Perhaps not, but I will tell you anyway. I will illustrate the advantages of the platform as I see it. Some of my favourite posts will be described in detail. I will finish by describing how FIB could be a much more effective space for our profession, with only a small but consistent input from my fellow statisticians.
26th May, 2009
Dr. Sally Wood, Melbourne Business School, The University of Melbourne
Mixtures-of-Experts Models – Uses and Pitfalls in Modelling Complex Data with Bayesian techniques
Mixtures-of-Experts models are a class of mixture models where the mixing weights depend upon the covariates. The flexibility and interpretability of these models make them attractive for modelling complex data. The talk will present a general overview of these models and demonstrate their applications to finance, organizational behaviour and climate indicators.
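As a rough illustration of the defining feature mentioned above (mixing weights that depend on the covariates), the following minimal sketch evaluates a mixture-of-experts density for a single covariate x. The softmax gating function, the Gaussian linear experts, and all function and parameter names are assumptions chosen for illustration; the abstract does not specify the form used in the talk, and nothing here reflects the Bayesian estimation it refers to.

```python
import math


def softmax(scores):
    """Convert arbitrary real scores into weights that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def normal_pdf(y, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixing_weights(x, gates):
    """Covariate-dependent mixing weights (assumed softmax gating).

    gates: list of (a, b) pairs; component k gets score a_k + b_k * x.
    """
    return softmax([a + b * x for a, b in gates])


def moe_density(y, x, gates, experts):
    """Mixture-of-experts density of y given x.

    experts: list of (intercept, slope, sd); expert k is assumed to be
    a normal distribution with mean intercept_k + slope_k * x.
    """
    weights = mixing_weights(x, gates)
    return sum(
        w * normal_pdf(y, c + d * x, sd)
        for w, (c, d, sd) in zip(weights, experts)
    )


# Hypothetical two-component example: the second expert dominates as x grows.
gates = [(0.0, -2.0), (0.0, 2.0)]
experts = [(0.0, 1.0, 1.0), (5.0, -1.0, 0.5)]
weights_low = mixing_weights(-1.0, gates)
weights_high = mixing_weights(1.0, gates)
density = moe_density(4.0, 1.0, gates, experts)
```

In an ordinary mixture model the weights would be constants; here they shift with x, which is what lets different experts describe different regions of the covariate space.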
28th April, 2009
Dr. Andrew Robinson, Department of Mathematics and Statistics, The University of Melbourne
Quarantine Inspection: How Risky a Business?
The application of statistical thinking to risk assessment in the context of quarantine inspection provides a rich and satisfying set of tools for problem solving. We develop a statistical framework for the allocation of inspection resources, and demonstrate its deployment.
17 March 2009
The 2009 Annual General Meeting of the Victorian Branch of the Statistical Society of Australia
Associate Professor Ian Gordon, Director, Statistical Consulting Centre, The University of Melbourne
Reading, writing and statistical thinking
H.G. Wells famously wrote: “Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write”. This day has surely come. Recently there has been increased interest in delivering education in statistical literacy, and this talk describes our experience so far in such a course at the University of Melbourne. In 2008 Melbourne introduced a reform of its degree structure, known as the Melbourne Model. There are now only six “new generation” undergraduate degrees: Arts, Science, Commerce, Biomedicine, Environments and Music. Students are required to take 25% of their degree points as “breadth”: material outside the core of their degree. Special “University Breadth Subjects” were created, which have no pre-requisites; in particular, no mathematical background at year 12 can be assumed. We have developed a subject called “Critical thinking with data”. It has the clear and bold intention of teaching important elements of statistical science, with almost no mathematical treatment. This is an unusual strategy to attempt, and has considerable challenges. In this talk we present our approaches to content, delivery and assessment of the subject. We have made extensive use of visual and other media, integrating case studies from the press and elsewhere with the pedagogical content. Much of the background information is available to the students via our learning management system. We have eminent guest lecturers who provide inspiration from fields in which critical thinking about data is integral to their work.
We suggest that a subject with these aims and content is both possible and worthwhile. It means a markedly different orientation from most statistical education, however, and requires innovative resources and educational perspectives.
28th October 2008 – Belz Lecture
Professor William Dunsmuir
Time Series That Count!
In recent years there has been rapid development of applications, models and methods for modelling of discrete valued time series. Important areas of application will be illustrated using examples from public health, traffic safety, hospital management, panel survey data, high-frequency financial data modelling and forecasting inventory. These applications require consideration of binary counts, binomial counts, Poisson counts and negative binomial counts. The lecture will focus on regression modelling of time series of counts in which the impact of covariates is assessed. As in all regression modelling of time series, the possibility of serial dependence has to be considered, since if it is ignored, inferences about covariates can be incorrect. Unlike continuous valued time series, detection of serial dependence in count time series is difficult and at this stage underdeveloped. In view of this I will advocate a model based approach. Two broad classes of models, applicable to all types of count distribution mentioned above, will be reviewed and strengths and weaknesses (including computational) of the two will be compared. Illustration of the ideas will be made using examples from experience with the above areas of application. I will also show how the modelling ideas can be used to help in determining the properties of the missingness process in incompletely observed time series – this latter application will be discussed in the context of a pollution level time series that initiated some much earlier research in missing data methods for time series. The talk will be largely non-technical with the aim of exposing people, who are not experts in time series, to recent developments in this area. Some connections and differences with modelling longitudinal data will emerge in the talk.