Date: Thursday, 2nd May 2019
Time:
6:00pm - 6:30pm: Please be seated by 6:30pm.
6:30pm - 7:30pm: Talk
Please note that we require you to RSVP on Eventbrite for this event. Register here.
Venue: Eastern Avenue Auditorium, Camperdown Campus, The University of Sydney
Dr Hadley Wickham
Chief Scientist at RStudio
The present and future of tidy data
Tidy data is a standard way of storing your data where columns are variables and rows are observations. Tidy data, particularly when coupled with tidy tools, makes data analysis easier because you can spend less time wrangling the output of one function so that it works as the input for another. Tidy data will make your analysis easier, but how do you get wild-caught data into a tidy form? In this talk, I'll discuss some of the tools that I have worked on for tidying data (e.g. the tidyr package), the limitations of those tools, and what I'm thinking about next. In particular, I'll discuss a new approach for "pivoting" data, and discuss some of the challenges posed by data stored in hierarchical form (e.g. JSON).
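For readers new to the idea, the following is a minimal sketch of what "pivoting" a table into tidy form can look like. It is written in plain Python rather than R, it is not the tidyr API or the approach presented in the talk, and the helper name `pivot_to_long`, its parameters, and the toy table are all assumptions made up for illustration.

```python
# Made-up "wide" table: one row per site, one column per year.
# Here the year (a variable) is hidden in the column names.
wide = [
    {"site": "A", "2017": 10, "2018": 12},
    {"site": "B", "2017": 7, "2018": 9},
]


def pivot_to_long(rows, id_cols, names_to, values_to):
    """Hypothetical helper: turn each non-id column into its own row.

    The result has one column per variable and one row per observation.
    """
    long_rows = []
    for row in rows:
        # Columns that identify the observation are copied to every new row.
        ids = {key: row[key] for key in id_cols}
        for key, value in row.items():
            if key not in id_cols:
                long_rows.append({**ids, names_to: key, values_to: value})
    return long_rows


tidy = pivot_to_long(wide, id_cols=["site"], names_to="year", values_to="count")
for row in tidy:
    print(row)
```

Each row of the result describes a single observation (one site in one year), and each column holds a single variable (site, year, count), which is the layout the abstract calls tidy.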
Biography
Hadley Wickham is currently Chief Scientist at RStudio and an adjunct Professor of statistics at the University of Auckland, Stanford University, and Rice University. He is best known for his development of open-source statistical software packages for the R programming language that implement logics of data visualisation (ggplot2) and data transformation (dplyr and tidyr). Wickham's packages and writing are known for advocating a tidy data approach to data import, analysis and modelling methods. Wickham was named a Fellow by the American Statistical Association in 2015 for "pivotal contributions to statistical practice through innovative and pioneering research in statistical graphics and computing".