Mapping of soil properties at high resolution in Switzerland using boosted geoadditive models
Madlene Nussbaum1, Lorenz Walthert2, Marielle Fraefel2, Lucie Greiner3, and Andreas Papritz1
1Institute of Biogeochemistry and Pollutant Dynamics, ETH Zurich, Universitätstrasse 16, 8092 Zürich, Switzerland
2Swiss Federal Institute for Forest, Snow and Landscape Research (WSL), Zürcherstrasse 111, 8903 Birmensdorf, Switzerland
3Research Station Agroscope Reckenholz-Taenikon ART, Reckenholzstrasse 191, 8046 Zürich, Switzerland
Abstract. High-resolution maps of soil properties are a prerequisite for assessing soil threats and soil functions and for fostering the sustainable use of soil resources. For many regions of the world, precise maps of soil properties are missing, but sparsely sampled and discontinuous (legacy) soil data are often available. Digital soil mapping (DSM) can then relate soil property data (responses) to spatially exhaustive environmental data that describe soil-forming factors (covariates) in order to create spatially continuous maps. With air- and spaceborne remote sensing data and multi-scale terrain analysis, large sets of covariates have become common. Building parsimonious models that are amenable to pedological interpretation is then a challenging task.
We propose a new boosted geoadditive modelling framework (geoGAM) for DSM. A geoGAM models smooth nonlinear relations between responses and single covariates and combines these model terms additively. Residual spatial autocorrelation is captured by a smooth function of spatial coordinates and nonstationary effects are included by interactions between covariates and smooth spatial functions. The core of fully automated model building for geoGAM is componentwise gradient boosting.
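The selection idea behind componentwise gradient boosting can be sketched in a few lines. The following is only a minimal illustration under strong simplifying assumptions, not the geoGAM implementation: it uses squared-error loss and plain linear base learners instead of smooth nonlinear or spatial terms, and the function name `componentwise_l2_boost`, the step length `nu` and the simulated data are all hypothetical. In each step, every covariate is fitted separately to the current residuals, and only the best-fitting one is updated by a small amount, so covariates that are never selected keep a coefficient of exactly zero.

```python
import numpy as np

def componentwise_l2_boost(X, y, n_steps=80, nu=0.1):
    """Componentwise L2 boosting with linear base learners (illustrative sketch).

    Assumes no covariate is constant (otherwise its sum of squares is zero).
    """
    n, p = X.shape
    x_mean = X.mean(axis=0)
    Xc = X - x_mean                     # centre each covariate
    ss = (Xc ** 2).sum(axis=0)          # sum of squares per covariate
    intercept = y.mean()
    coef = np.zeros(p)
    resid = y - intercept               # negative gradient of squared-error loss
    for _ in range(n_steps):
        # Least-squares slope of the residuals on each single covariate
        slopes = Xc.T @ resid / ss
        # Reduction in residual sum of squares achieved by each candidate
        gain = slopes ** 2 * ss
        j = int(np.argmax(gain))        # best-fitting single covariate
        coef[j] += nu * slopes[j]       # update only that component, damped by nu
        resid -= nu * slopes[j] * Xc[:, j]
    intercept -= coef @ x_mean          # express intercept on the uncentred scale
    return intercept, coef

# Hypothetical data: 50 candidate covariates, only two of which matter
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 50))
y = 2.0 * X[:, 3] - 1.5 * X[:, 17] + rng.normal(scale=0.1, size=300)

intercept, coef = componentwise_l2_boost(X, y)
selected = np.flatnonzero(coef)         # covariates chosen at least once
print(selected, coef[selected].round(3))
```

The number of boosting steps acts as the main tuning parameter: stopping early keeps the model sparse, whereas running many steps lets weakly related covariates enter with small coefficients. It is commonly chosen by cross-validation.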
We illustrate the application of the geoGAM framework using soil data from the Canton of Zurich, Switzerland. We modelled effective cation exchange capacity (ECEC) in forest topsoils as a continuous response. For agricultural land, we predicted the presence of waterlogged horizons in given soil depth layers as binary responses and drainage classes as an ordinal response. For the latter, we used a proportional odds geoGAM, which takes the ordering of the response properly into account. The fitted geoGAM contained only a few covariates (7 to 17) selected from large sets (333 covariates for forests, 498 for agricultural land). Model sparsity allowed the covariate effects to be interpreted by means of partial effects plots. Prediction intervals for ECEC were computed by model-based bootstrapping. The predictive performance of the fitted geoGAM, tested with independent validation data and specific skill scores (SS) for continuous, binary and ordinal responses, compared well with other studies that modelled similar soil properties. SS values of 0.23 to 0.53 (with SS = 1 for perfect predictions and SS = 0 for zero explained variance) were achieved, depending on the response and the type of score. geoGAM combines efficient model building from large sets of covariates with ease of effect interpretation and is therefore likely to raise the acceptance of DSM products by end-users.
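To make the scale of the skill scores concrete, the sketch below computes a mean-squared-error skill score for a continuous response. The form SS = 1 - SSE/SST is an assumption chosen to match the stated endpoints (1 for perfect predictions, 0 for zero explained variance); the abstract does not give the exact definitions, and the scores for binary and ordinal responses are not reproduced here. The function name and the example values are hypothetical.

```python
import numpy as np

def skill_score_mse(observed, predicted):
    """MSE-based skill score: 1 - SSE / SST (assumed form, continuous response)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    sse = np.sum((observed - predicted) ** 2)        # squared prediction errors
    sst = np.sum((observed - observed.mean()) ** 2)  # spread around the observed mean
    return 1.0 - sse / sst

# Hypothetical validation data
obs = np.array([4.0, 7.0, 5.0, 10.0, 9.0])
perfect = skill_score_mse(obs, obs)                             # perfect predictions
baseline = skill_score_mse(obs, np.full_like(obs, obs.mean()))  # always predict the mean
print(perfect, baseline)
```

Under this assumed definition, predictions that are worse than simply using the mean of the validation data yield a negative score.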
Nussbaum, M., Walthert, L., Fraefel, M., Greiner, L., and Papritz, A.: Mapping of soil properties at high resolution in Switzerland using boosted geoadditive models, SOIL Discuss., doi:10.5194/soil-2017-13, in review, 2017.