Species Abundance Models (SAMs) are used to model and predict spatial variation in species abundances. While environmental data are relatively easy to gather, they do not always capture the many sources of variation needed to reliably predict spatial variation in abundances. Both species interactions and missing or mismeasured environmental predictors can reduce SAMs' predictive performance. Some SAMs have used presence-absence data of other species as predictors, an easy-to-gather proxy for species interactions and missing environmental predictors. However, it is unknown how well this approach works. We investigate factors that may lower the predictive ability of such models.
We focus on generalised linear latent variable models (GLLVMs), which we use to model community composition; the resulting latent variables are then included in SAMs to represent missing predictors. We use simulations and a large empirical fish dataset to assess the performance of this approach. In our simulations, we generate species abundances across a large number of sites (the universe), sample a much smaller number of sites to estimate our latent-based SAMs, and test how well these models predict abundance at the remaining sites in the universe. We use different scenarios to generate random and systematic perturbations and assess the predictive ability of the proposed latent-based SAM under each.
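The simulation design described above (generate a universe, fit on a small sample, predict the remaining sites) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method used in the study: the latent scores come from an SVD of the centred presence-absence matrix as a simple stand-in for a fitted GLLVM, the abundance model is a least-squares fit on log1p abundance rather than a count model, and presence-absence data are assumed to be available at every site. All function names, parameter values, and the correlation-based score are hypothetical.

```python
import numpy as np


def simulate_universe(n_sites=2000, n_species=20, n_env=2, seed=0):
    """Generate environment and Poisson abundances for every site in the universe."""
    rng = np.random.default_rng(seed)
    env = rng.normal(size=(n_sites, n_env))
    # Hypothetical species responses to the environment (log link).
    coefs = rng.normal(scale=0.7, size=(n_env, n_species))
    intercepts = rng.normal(loc=0.5, scale=0.3, size=n_species)
    abundance = rng.poisson(np.exp(intercepts + env @ coefs))
    return env, abundance


def latent_scores(presence_absence, n_latent=2):
    """Site scores from an SVD of the centred matrix (stand-in for a GLLVM)."""
    centred = presence_absence - presence_absence.mean(axis=0)
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    return u[:, :n_latent] * s[:n_latent]


def fit_and_predict(x_train, y_train, x_test):
    """Least-squares fit on log1p abundance; predictions stay on that scale."""
    design = np.column_stack([np.ones(len(x_train)), x_train])
    beta, *_ = np.linalg.lstsq(design, np.log1p(y_train), rcond=None)
    return np.column_stack([np.ones(len(x_test)), x_test]) @ beta


def run_experiment(n_sample=200, seed=0):
    """Fit on a small sample of sites and score predictions on the rest."""
    env, abundance = simulate_universe(seed=seed)
    target = abundance[:, 0]                      # species to predict
    others_pa = (abundance[:, 1:] > 0).astype(float)
    scores = latent_scores(others_pa)             # assumed known at all sites

    rng = np.random.default_rng(seed + 1)
    idx = rng.permutation(len(target))
    train, test = idx[:n_sample], idx[n_sample:]

    results = {}
    for name, x in {"environment": env, "latent": scores}.items():
        pred = fit_and_predict(x[train], target[train], x[test])
        # Correlation between predicted and observed log1p abundance.
        results[name] = np.corrcoef(pred, np.log1p(target[test]))[0, 1]
    return results


if __name__ == "__main__":
    for model, score in run_experiment().items():
        print(f"{model}: {score:.3f}")
```

Random or systematic perturbations could be added to this sketch by, for example, adding noise to `env` or dropping one of its columns before fitting, then comparing the two scores again.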
Results/Conclusions
By identifying the conditions under which predictive performance is affected, we were able to design strategies to minimize the resulting errors. We first showed, through simulations, that in the absence of species interactions our latent-based abundance models performed almost as well as models using the original environmental variables or a combination of these environmental variables and community composition. Next, we introduced perturbations and showed that, while these decrease SAMs' predictive abilities, including community composition in the models mitigates this decrease. We use the empirical lake-fish abundances to evaluate when and how different strategies designed to reduce random and systematic errors improve the performance of our proposed latent-based SAM.