We build models for a few reasons. Sometimes we want to understand ecological patterns and processes. Other times we want to make predictions about what will happen at different times and in different places. Both explanatory and predictive models are integral to conservation science, but predictive models probably receive less attention in the ecological literature.
One way to test the predictive ability of models is to use cross-validation: partitioning the data into subsets (folds), building a model using one subset (training data), and testing the model on another subset (validation data). Another option is to test models using independent data collected at different times and locations. This is arguably a stronger test of model performance.
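To make the partitioning step concrete, here is a minimal sketch of how survey sites might be divided into folds. It follows the common k-fold variant, in which each fold takes one turn as the validation data while the remaining folds are pooled as training data. The function names, the number of sites and the number of folds are all illustrative assumptions, not details from the study.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Split the indices 0..n_samples-1 into k roughly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def train_validation_splits(n_samples, k, seed=0):
    """Yield one (training, validation) pair of index lists per fold."""
    folds = k_fold_indices(n_samples, k, seed)
    for i, validation in enumerate(folds):
        # Pool every fold except the current one as training data.
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


# Hypothetical example: 10 survey sites split into 5 folds.
splits = list(train_validation_splits(n_samples=10, k=5))
for training, validation in splits:
    print(len(training), "training sites;", len(validation), "validation sites")
```

Every site ends up in the validation data exactly once, but all of the sites still come from the same surveys, which is why independent data offers a harder test.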
But independent data is hard to get. Right? Well, not always. There’s already a large amount of wildlife, spatial and historical data available to ecologists. Perhaps there’s more scope to test our models than is currently recognized. Here’s an example.
During my PhD I modelled the responses of several small mammal species to fire history. One species, the Mallee Ningaui, was strongly associated with fire history and I wanted to assess how well the model performed when tested on independent data. I put together a novel test dataset in three steps.
1. Wildlife data. I searched for small mammal surveys that had been undertaken in the study region, and obtained data from government agencies, conservation groups and other scientists. Two datasets used the same trapping methods as I had: wildlife surveys completed in 1985-1987 (25 sites) and 2005-2008 (9 sites).
2. Spatial data. The fire history of these sites was unknown. So, I exported the data into ArcMap and determined the post-fire age of each site using mapping my colleagues and I had recently completed.
3. Historical data. Some of these survey sites had not burnt within the time frame of available satellite imagery (1972-2007), so an alternative approach was required. I used historical maps of fires that occurred from the 1930s to 1972 to determine the post-fire age of the remaining sites. I then went into the field with my colleagues and ground-truthed the historical maps.
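The logic of steps 2 and 3 can be sketched in a few lines: take the post-fire age from the satellite-era mapping where a fire is recorded, and fall back on the historical maps otherwise. This is only an illustration of the idea, not the ArcMap workflow used in the study. The site names, survey years and fire years below are invented, and I am assuming post-fire age means the survey year minus the year of the most recent fire.

```python
def post_fire_age(survey_year, fire_years):
    """Return years since the most recent fire at or before the survey.

    Returns None when no mapped fire precedes the survey year.
    """
    previous = [year for year in fire_years if year <= survey_year]
    if not previous:
        return None
    return survey_year - max(previous)


# Hypothetical records: all names and years are made up for illustration.
sites = {
    "site_A": {"survey_year": 1986, "satellite_fires": [1975], "historical_fires": [1940]},
    "site_B": {"survey_year": 2006, "satellite_fires": [], "historical_fires": [1958]},
}

ages = {}
for name, record in sites.items():
    # Prefer the satellite-derived fire mapping.
    age = post_fire_age(record["survey_year"], record["satellite_fires"])
    if age is None:
        # No fire in the satellite record, so fall back on the historical maps.
        age = post_fire_age(record["survey_year"], record["historical_fires"])
    ages[name] = age

print(ages)
```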
In summary, the test dataset was compiled relatively quickly using a combination of available wildlife data, spatial data and historical data. And it turned out our model performed quite well. The occurrence of the Mallee Ningaui could be predicted accurately across a large geographical area using fire history and vegetation data. With the increasing amount of data available to scientists, I think this is something we should do more of, particularly if we want our work to influence environmental decisions.
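For readers wondering what "performed quite well" can mean in practice, one widely used measure for occurrence models is the area under the ROC curve (AUC): the probability that a randomly chosen occupied site receives a higher predicted value than a randomly chosen unoccupied site. The sketch below is a generic illustration with invented observations and predictions. I am not claiming it is the measure reported in the paper.

```python
def auc(observed, predicted):
    """Area under the ROC curve, computed by comparing every
    presence-absence pair of sites (ties count as half)."""
    presences = [p for o, p in zip(observed, predicted) if o == 1]
    absences = [p for o, p in zip(observed, predicted) if o == 0]
    if not presences or not absences:
        raise ValueError("need at least one presence and one absence")
    wins = 0.0
    for pres in presences:
        for absn in absences:
            if pres > absn:
                wins += 1.0
            elif pres == absn:
                wins += 0.5
    return wins / (len(presences) * len(absences))


# Hypothetical independent test data: 1 = species detected, 0 = not detected.
observed = [1, 0, 1, 0, 0, 1]
# Hypothetical predicted probabilities of occurrence from a fitted model.
predicted = [0.8, 0.3, 0.6, 0.4, 0.7, 0.9]

score = auc(observed, predicted)
print(round(score, 3))
```

A value of 0.5 means the model does no better than chance at ranking sites, and 1.0 means occupied sites always receive the higher prediction.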
For more details check out this paper.
Kelly, L.T., Nimmo, D.G., Spence-Bailey, L.M., Haslem, A., Watson, S.J., Clarke, M.F. & Bennett, A.F. (2011) The influence of fire history on small mammal distributions: insights from a 100-year post-fire chronosequence. Diversity and Distributions, 17, 462-473.