Comparison of Bagged Multivariate Adaptive Regression Splines (MARS) and Boosted Regression Trees (BRT) using spatial autocorrelation for modelling the habitat niche of four tree species in the Mount Lofty Ranges, South Australia

Author: Yue Zhuo

Zhuo, Yue, 2017 Comparison of Bagged Multivariate Adaptive Regression Splines (MARS) and Boosted Regression Trees (BRT) using spatial autocorrelation for modelling the habitat niche of four tree species in the Mount Lofty Ranges, South Australia, Flinders University, School of the Environment

Terms of Use: This electronic version is (or will be) made publicly available by Flinders University in accordance with its open access policy for student theses. Copyright in this thesis remains with the author. You may use this material for uses permitted under the Copyright Act 1968. If you are the owner of any included third party copyright material and/or you believe that any material has been made available without permission of the copyright owner please contact copyright@flinders.edu.au with the details.

Abstract

The habitat niches of four Australian native trees in the Mount Lofty Ranges of South Australia were modelled: Allocasuarina verticillata, Eucalyptus fasciculosa, Eucalyptus goniocalyx and Eucalyptus obliqua. Two non-parametric modelling techniques were compared: bagged Multivariate Adaptive Regression Splines (MARS; Friedman 1991) and Boosted Regression Trees (BRT; Freund & Schapire 1996). Each model was fitted using presence/absence data and a set of 34 environmental variables. Among these predictors, climate variables, including rainfall and temperature, were found to contribute most to the distribution of the selected trees.

To reduce the effect of spatial autocorrelation, the data were re-sampled to increase the separation distance between sample points. This was compared with an entirely different method in which an index of spatial autocorrelation was explicitly incorporated in each model. The set of spatial weights did not contribute to the bagged MARS models, while it slightly altered the structure of the BRT models. However, the spatial variable was not a crucial predictor and did not significantly affect the response (occurrence of trees).
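
The re-sampling idea can be illustrated with a minimal sketch. It assumes a simple greedy rule: a point is kept only if it lies at least a chosen minimum distance from every point already kept. The function name thin_points, the min_dist parameter and the coordinates are hypothetical and are not taken from the thesis, whose actual re-sampling procedure may differ.

```python
import math

def thin_points(points, min_dist):
    """Greedily keep points at least min_dist from every point already kept.

    Illustrative assumption only, not the thesis's documented procedure.
    """
    kept = []
    for x, y in points:
        # Keep the candidate only if it is far enough from all retained points.
        if all(math.hypot(x - kx, y - ky) >= min_dist for kx, ky in kept):
            kept.append((x, y))
    return kept

# Made-up planar coordinates (arbitrary units) for demonstration.
sites = [(0.0, 0.0), (0.5, 0.0), (2.0, 0.0), (2.0, 0.4), (5.0, 5.0)]
thinned = thin_points(sites, min_dist=1.0)
print(thinned)  # [(0.0, 0.0), (2.0, 0.0), (5.0, 5.0)]
```

The result of a greedy rule depends on the order in which points are visited, so a real analysis would normally shuffle or repeat the thinning.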

The performance of each model was evaluated using the Area Under the Curve (AUC) from Receiver Operating Characteristic (ROC) analysis. In general, all models in this study performed well. However, the BRT models fitted the data better than the bagged MARS models and showed relatively stable prediction results.
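
As a reminder of what the AUC measures, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen presence site receives a higher predicted score than a randomly chosen absence site, with ties counted as one half. The function name auc and the example data are illustrative assumptions and are not results from the thesis.

```python
def auc(labels, scores):
    """AUC as the fraction of (presence, absence) pairs ranked correctly.

    labels: 1 for presence, 0 for absence; scores: predicted probabilities.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one presence and one absence")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # presence ranked above absence
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))

# Made-up observations and predictions for demonstration.
observed = [1, 0, 1, 0, 1, 0]
predicted = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1]
print(round(auc(observed, predicted), 3))  # 0.889
```

A value of 0.5 corresponds to a model that ranks sites no better than chance, and 1.0 to perfect separation of presences from absences.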

As the sample size was limited, not enough data could be set aside for independent testing. Instead, the final prediction surfaces of habitat niche were compared with an expert opinion approach undertaken by the Department of Environment, Water and Natural Resources, South Australia.

Keywords: Modelling, presence/absence, explanatory variables, MARS, BRT, spatial autocorrelation, habitat niche, ROC, threshold, distribution probabilities.

Subject: Environmental Science thesis

Thesis type: Masters
Completed: 2017
School: School of the Environment
Supervisor: Dr. Mark Lethbridge