Sponsored by the ARC Centre of Excellence for Mathematical and Statistical Frontiers (http://acems.org.au)
Sydney, 13 October 2015
About the Course:
R is a powerful language for statistical analysis and visualization. However, most of its power is restricted to data of small or moderate size. Using Tessera, users can readily visualize and analyse large complex data sets in a familiar R environment, making use of the thousands of methods for analysis, visualization, and machine learning that are available in R.
Developed over the past two years as part of the DARPA XDATA program in the United States, Tessera (http://tessera.io) is an open source statistical computing environment that enables R users to perform deep analysis of large, complex data sets. Principal contributors to the project are statisticians and computer scientists at Purdue University and Pacific Northwest National Laboratory.
Tessera uses the Divide and Recombine (D&R) approach. In D&R, data are divided into meaningful subsets, embarrassingly parallel computations are performed on the subsets, and results are combined in a statistically valid manner. Using the R datadr package, Tessera provides a simple interface to distributed parallel back end computation environments such as Hadoop. Tessera includes a visualization component, Trelliscope, which provides a D&R approach for detailed, flexible, and interactive visualization of large complex data.
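To give a feel for the divide, apply, and recombine steps described above, here is a minimal sketch in base R. It deliberately does not use the datadr API; the built-in `iris` data set, the choice of a per-species linear model, and the helper name `fit_subset` are illustrative assumptions rather than course material.

```r
# Divide: split the built-in iris data into one subset per species.
subsets <- split(iris, iris$Species)

# Apply: fit the same model independently to each subset.
# Each call touches only its own subset, so the calls could run in parallel.
# (fit_subset is a hypothetical helper name used only for this sketch.)
fit_subset <- function(d) {
  coef(lm(Sepal.Length ~ Sepal.Width, data = d))
}
per_subset <- lapply(subsets, fit_subset)

# Recombine: stack the per-subset coefficients into a single matrix,
# one row per species.
result <- do.call(rbind, per_subset)
print(result)
```

Because each subset is processed on its own, the `lapply` step is the part that a distributed back end such as Hadoop would spread across machines; only the small per-subset results need to be brought back together.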
An overview of Divide and Recombine and Tessera will be provided, followed by a hands-on introduction to the Tessera R packages datadr and Trelliscope. After a practical introduction to using Tessera for statistical analysis and visualization on small data sets, more in-depth hands-on examples will be worked through using a larger data set: a one-year collection of taxi ridership data from New York City.
Attendees should have basic proficiency with R and RStudio. Attendees should have a laptop with the following installed:
- R 3.2.X
- A recent version of RStudio
- An up-to-date web browser, Chrome/Safari/Firefox
- The datadr package
- The trelliscope package
For more details about the short course and installation instructions, please visit http://tessera.io/docs-UTS-shortcourse/.
About the Instructor:
Ryan Hafen is a statistical consultant and an adjunct assistant professor in the Statistics Department at Purdue University. Ryan’s research focuses on methodology, tools, and applications in exploratory analysis, statistical model building, and machine learning on large, complex datasets. He is the developer of the datadr and Trelliscope components of the Tessera project (tessera.io), as well as the rbokeh visualization package. Prior to his work as a statistical consultant, Ryan worked at Pacific Northwest National Laboratory, doing applied work analyzing large, complex data spanning many domains, including power systems engineering, nuclear forensics, high energy physics, biology, and cyber security. Ryan has a B.S. in Statistics from Utah State University, an M.Stat. in Mathematics from the University of Utah, and a Ph.D. in Statistics from Purdue University.
SSAI Members – $450
SSAI Student Members – $200
Non-SSAI Members – $600
Non-SSAI Student Members** – $300
** Proof of Valid University ID required. Please email to [email protected]
Registrations close strictly on 8 October 2015, or earlier if the course is full.
Course cost is being subsidized by the ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS).
University of Technology Sydney, Room CB07:03:010G (ground floor of Building 7, located at 638 Jones Street, Broadway; see http://maps.uts.edu.au/map.cfm)
Occasionally workshops have to be cancelled due to a lack of subscription. Early registration ensures that this will not happen. Please contact the SSAI Office before making any travel arrangements to confirm that the workshop will go ahead, because the SSAI will not be held responsible for any travel or accommodation expenses incurred due to a workshop cancellation.
Cancellations received prior to Tuesday, 6 October 2015 will be refunded in full. Confirmation of the refund having been processed will be emailed. Should additional documentation pertaining to the refund be required, a $20 administration fee will be charged.
After 6 October 2015 no part of the registration fee will be refunded. However, registrations are transferable within the same organisation. Please advise any changes to [email protected].
Please note that the SSAI Office will be closed from 24 September until 5 October 2015. For enquiries about the content of the workshop, please contact Louise Ryan.
Administrative queries received during this time will be answered when the office reopens on 6 October 2015.
Time: 9:00 am - 5:00 pm
Cost: from $200 (students)
Location: University of Technology Sydney, 638 Jones Street,