The MLAI Meetup is a community for AI researchers and professionals that hosts monthly talks on exciting research. The format is:
6:00 - 6:20: Socializing
6:20 - 6:40: Announcements and AI news
6:40 - 7:40: Talk(s) and Q&A
7:40 - 8:00: Networking
8:00: Head to the nearest pub for dinner
Nonprobability samples like big data and administrative data are increasingly used in official statistics and machine learning, but statistics and predictions computed using only nonprobability samples often suffer from substantial selection bias and other sources of error. This can lead to inaccurate estimation, poor predictive performance and invalid statistical inference. We rectify the issue for a broad class of linear and nonlinear statistics and predictive models by producing estimating equations that combine probability and nonprobability samples in a way that also accommodates corrections for multiple sources of error, such as selection bias and measurement error. Consistency and asymptotic normality are established by building on existing design-based results for probability samples in the official statistics literature. We construct variance estimators that account for sampling variability introduced by corrections to various sources of error. These variance estimators can be produced with respect to either the sampling design alone or jointly with the superpopulation. We find that, in the limit, the joint variance of a sample statistic is equal to its design variance plus the superpopulation variance of the corresponding population parameter. A similar result holds for predictions produced via parametric machine learning models. Results are also applicable when using only a probability sample. We illustrate our method for quantiles, the Gini index, linear regression coefficients and maximum likelihood estimators. Our results are also illustrated in simulation and with real data about individual Australian incomes.
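As a rough illustration of the variance decomposition mentioned in the abstract (and not the speaker's method), the sketch below checks the simplest special case: the sample mean under simple random sampling without replacement, assuming the finite population consists of N i.i.d. draws from a superpopulation with variance sigma2. In this special case the identity holds exactly rather than only in the limit, since (1 - n/N) * sigma2 / n + sigma2 / N = sigma2 / n. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import random
import statistics


def design_variance(sigma2, n, N):
    """Design variance of the sample mean under simple random sampling
    without replacement, averaged over the superpopulation (assumption:
    the N population values are i.i.d. with variance sigma2)."""
    return (1 - n / N) * sigma2 / n


def superpopulation_variance(sigma2, N):
    """Variance of the finite-population mean across superpopulation draws."""
    return sigma2 / N


def simulate_joint_variance(sigma2, n, N, reps=20000, seed=0):
    """Monte Carlo estimate of the joint variance of the sample mean:
    redraw the population each time, then draw a sample from it."""
    rng = random.Random(seed)
    sd = sigma2 ** 0.5
    means = []
    for _ in range(reps):
        # Superpopulation step: generate a fresh finite population.
        population = [rng.gauss(0.0, sd) for _ in range(N)]
        # Design step: simple random sample without replacement.
        sample = rng.sample(population, n)
        means.append(statistics.fmean(sample))
    return statistics.variance(means)


if __name__ == "__main__":
    sigma2, n, N = 4.0, 10, 50  # illustrative values only
    total = design_variance(sigma2, n, N) + superpopulation_variance(sigma2, N)
    print("design + superpopulation:", total)
    print("simulated joint variance:", simulate_joint_variance(sigma2, n, N))
```

The simulated joint variance should land close to the sum of the two components, up to Monte Carlo error. The general result described in the abstract covers a much broader class of statistics and sampling designs than this toy case.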
Speaker Bio: Dr Ryan Covey is a statistician with the Methodology and Data Science Division of the Australian Bureau of Statistics, where he works on research and statistical production in the areas of survey design and estimation, with a special focus on multi-source data. Ryan began his current role in 2022, after finishing a PhD in econometrics at Monash University on the relationship between sampling variability and predictive accuracy for forecast combinations and ensemble methods. Ryan also taught mathematics and data science as a teaching associate while completing his PhD. Prior to that, he held several roles at Telstra in data science, telecommunications and cyber security.
To register click here.
Statistical Society of Australia (SSA), PO Box 213, Belconnen ACT 2616, Australia | 02 6251 3647 | www.statsoc.org.au | ABN 82 853 491 081
Please direct enquiries to the SSA Team via email at contact@statsoc.org.au