Branch Meetings

Branch Meetings for 2017 Poster

Statistical Society of Australia Canberra Branch Meeting

Date: Tuesday 30 May 2017


5.15pm Refreshments in Room 1175A, John Dedman Building, ANU [Building 27]

6.00pm Presentation by Ian Renner in Room G035, John Dedman Building, ANU
7.30pm After the talk, there will be a dinner at Briscola.

Please RSVP to Francis Hui (SSA Canberra, or reply to this email directly) by Monday 29 May if you would like to attend the dinner.

Speaker: Ian Renner, University of Newcastle

Topic: Species distribution models with point process models (and extensions!)


In species distribution modelling, the goal is to build a model which predicts the intensity of a species (or group of species) as a function of environmental variables. Such models rely on information about the environment and information about species presence. While systematic survey data, in which the presence and absence/non-detection of a species is recorded at each site, is preferable, often the best available data comes in the form of “presence-only data”, which consists of a list of locations where the species has been reported. Such “citizen science” data is cheap and abundant, but poses unique statistical challenges for estimating the true intensity surface of the species.
The most popular methods for fitting species distribution models to presence-only data include Maxent and pseudo-absence logistic regression, yet both of these approaches are subject to challenges in model implementation, interpretation, and assumption checking which can make ecological inference very difficult. For example, these methods require a number of pseudo-absences to be introduced, and there is a large body of literature exploring how these should be chosen. Furthermore, neither approach offers any solution to the potential for spatial dependence in the points. Finally, presence-only data is subject to “observer bias”, as the list of reported locations reflects the distribution of observers as well as the distribution of the target species.
In this talk, I will discuss PPM-LASSO, a method I have developed for fitting species distribution models to presence-only data. This method fits point process models to the presence data along with a Lasso-type penalty to improve predictive performance. I will discuss links that point process models have with Maxent and pseudo-absence regression, and demonstrate the advances this flexible framework offers to presence-only analysis in addressing challenges posed by the nature of presence-only data, including the choice of pseudo-absences, diagnosing inherent assumptions of the model, and accounting for observer bias, among others. Furthermore, I will discuss extensions of PPM-LASSO to contexts in which species data is available across multiple sources, such as fitting a model to both presence-only data and repeated survey data with a combined point process and occupancy model.
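
For readers unfamiliar with the idea, the combination of a point process likelihood and a Lasso-type penalty described in the abstract can be sketched in a few lines of Python. This is an illustration only, under stated assumptions, and is not the speaker's implementation or the interface of the ppmlasso R package: it assumes a log-linear Poisson intensity, approximates the integral of the intensity by a weighted sum over quadrature points, leaves the intercept unpenalised, and uses a simple proximal gradient loop. The function names and the simulated data are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of the L1 penalty: shrink z towards zero by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def neg_loglik(beta, X_pres, X_quad, w_quad):
    """Negative Poisson point process log-likelihood with log-linear intensity.

    The integral of the intensity over the study region is approximated by
    a weighted sum over quadrature points (an assumption of this sketch).
    """
    return -(X_pres @ beta).sum() + w_quad @ np.exp(X_quad @ beta)

def fit_ppm_lasso(X_pres, X_quad, w_quad, penalty, n_iter=2000, tol=1e-8):
    """Minimise neg_loglik + penalty * sum(|beta[1:]|) by proximal gradient.

    Column 0 of the design matrices is assumed to be the intercept.
    """
    n_pres = X_pres.shape[0]
    beta = np.zeros(X_pres.shape[1])
    beta[0] = np.log(n_pres / w_quad.sum())  # homogeneous-intensity start
    step = 1.0 / n_pres
    for _ in range(n_iter):
        mu = w_quad * np.exp(X_quad @ beta)
        grad = -X_pres.sum(axis=0) + X_quad.T @ mu
        f_old = neg_loglik(beta, X_pres, X_quad, w_quad)
        for _ in range(100):  # backtracking line search on the step size
            cand = beta - step * grad
            cand[1:] = soft_threshold(cand[1:], step * penalty)
            diff = cand - beta
            f_new = neg_loglik(cand, X_pres, X_quad, w_quad)
            if f_new <= f_old + grad @ diff + (diff @ diff) / (2.0 * step):
                break
            step *= 0.5
        beta = cand
        if np.max(np.abs(diff)) < tol:
            break
    return beta

if __name__ == "__main__":
    # Simulated example on the unit square (all values are made up).
    rng = np.random.default_rng(1)
    g = np.linspace(0.025, 0.975, 20)
    gx, gy = np.meshgrid(g, g)
    X_quad = np.column_stack(
        [np.ones(gx.size), gx.ravel() - 0.5, rng.normal(size=gx.size)]
    )
    w_quad = np.full(gx.size, 1.0 / gx.size)  # equal-area quadrature weights
    prob = np.exp(2.0 * X_quad[:, 1])
    prob /= prob.sum()
    X_pres = X_quad[rng.choice(gx.size, size=200, p=prob)]
    for penalty in (0.0, 5.0, 50.0):
        print(penalty, fit_ppm_lasso(X_pres, X_quad, w_quad, penalty))
```

In this sketch, increasing `penalty` shrinks the covariate coefficients towards zero and eventually sets them exactly to zero, which is how a Lasso-type penalty performs variable selection while fitting the model.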
Speaker Biography:
I completed my PhD at the University of New South Wales under the supervision of David Warton, and began a position as a lecturer at the University of Newcastle in 2014.

My research interests are broadly in ecological statistics, in which I explore the relationship between the environment and the spatial distribution of a species or species communities.

In particular, I explore methods for species distribution models, in which the goal is to predict the intensity of a species as a function of various environmental variables. Much of my work involves the use of presence-only data, a form of citizen science in which the available data are the locations of reported sightings, with no corresponding information regarding species absences. The use of presence-only data creates a number of challenges in model fitting and interpretation. One such issue is the effect of non-uniform sampling effort: certain regions may be more surveyed than others in presence-only data sets, and failing to account for such differences leads to biased inference. Another issue is that of spatial dependence: there are a number of reasons why reported locations may not be independent of each other, and understanding these reasons is crucial in appropriately modelling the data.

I also consider techniques to perform model selection in the context of species distribution models. As data describing environmental conditions continues to become available for an increasing number of environmental variables across finer spatial resolutions, it is important to consider which of these variables should be included in the model.

I have developed an approach to species distribution modelling called “PPM-LASSO” which accounts for these challenges: point process models are fitted with a Lasso-type penalty to simultaneously perform model selection and improve predictive performance. I have built tools to fit these models in the R package ppmlasso, available on CRAN.

Recently, I have explored fitting models that make use of multiple data types including presence-absence data collected through systematic surveys. My current research aims to tackle the question of how to best combine these data types to account for their various properties.

Francis Hui
SSA Canberra
