SOIL An interactive open-access journal of the European Geosciences Union
© Author(s) 2017. This work is distributed under
the Creative Commons Attribution 3.0 License.
Original research article
09 May 2017
Review status
This discussion paper is a preprint. It is a manuscript under review for the journal SOIL (SOIL).
Evaluation of digital soil mapping approaches with large sets of environmental covariates
Madlene Nussbaum1, Kay Spiess1, Andri Baltensweiler2, Urs Grob3, Armin Keller3, Lucie Greiner3, Michael E. Schaepman4, and Andreas Papritz1
1Institute of Biogeochemistry and Pollutant Dynamics, ETH Zurich, Universitätstrasse 16, CH-8092 Zürich, Switzerland
2Swiss Federal Institute for Forest, Snow and Landscape Research (WSL), Zürcherstrasse 111, CH-8903 Birmensdorf, Switzerland
3Research Station Agroscope Reckenholz-Taenikon ART, Reckenholzstrasse 191, CH-8046 Zürich, Switzerland
4Remote Sensing Laboratories, University of Zurich, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland
Abstract. Spatial assessment of soil functions requires maps of basic soil properties. Unfortunately, these are either missing for many regions or are not available at the desired spatial resolution or down to the required soil depth. Field-based generation of large soil data sets and of conventional soil maps remains costly. Meanwhile, soil legacy data and comprehensive sets of spatial environmental data are available for many regions.

Digital soil mapping (DSM) approaches, which relate soil data (responses) to environmental data (covariates), face the challenge of building statistical models from large sets of covariates originating, for example, from airborne imaging spectroscopy or multi-scale terrain analysis. We evaluated six approaches for DSM in three study regions in Switzerland (Berne, Greifensee, ZH forest) by mapping effective soil depth available to plants (SD), pH, soil organic matter (SOM), effective cation exchange capacity (ECEC), clay, silt, gravel content and bulk density for four soil layers (totalling 48 responses). Models were built from 300–500 environmental covariates by selecting linear models by (1) grouped lasso and by an ad hoc stepwise procedure for (2) robust external-drift kriging (EDK). For (3) geoadditive models we selected penalized smoothing spline terms by componentwise gradient boosting (geoGAM). We further used two tree-based methods: (4) boosted regression trees (BRT) and (5) Random Forest (RF). Lastly, we computed (6) weighted model averages (MA) from predictions obtained from methods 1–5.
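
As an illustration of approach (6), the following is a minimal sketch of a weighted model average. The abstract does not state how the weights were chosen, so the inverse-MSE weighting used here is an assumption, and the function names, method labels and numbers are hypothetical.

```python
def inverse_mse_weights(mse_by_method):
    """Return weights proportional to 1/MSE, normalised to sum to 1.

    Assumption: each method is weighted by the inverse of its
    cross-validation mean squared error. The abstract does not say
    which weighting scheme the authors used.
    """
    inverse = {name: 1.0 / mse for name, mse in mse_by_method.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def weighted_model_average(predictions, weights):
    """Combine the per-method predictions location by location."""
    methods = list(predictions)
    n_locations = len(predictions[methods[0]])
    return [
        sum(weights[m] * predictions[m][i] for m in methods)
        for i in range(n_locations)
    ]


# Hypothetical cross-validation MSEs and pH predictions at two locations.
cv_mse = {"lasso": 0.50, "rf": 0.25}
preds = {"lasso": [6.0, 5.0], "rf": [6.6, 5.3]}

weights = inverse_mse_weights(cv_mse)
averaged = weighted_model_average(preds, weights)
```

With these illustrative numbers the method with the lower error (here "rf") receives twice the weight of the other, so the averaged prediction lies closer to its values.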

Lasso, georob and geoGAM successfully selected strongly reduced sets of covariates (subsets of 3–6 % of all covariates). Automatically selecting a sparse trend model for EDK was difficult, however, and the applied ad hoc procedure was computationally inefficient and over-fitted the data. Differences in predictive performance, tested on independent validation data, were mostly small and did not reveal a single best method for the 48 responses. Nevertheless, RF was most often the best among methods 1–5 (28 of 48 responses), but was outcompeted by MA for 14 of these 28 responses. RF tended to over-fit the data. Performance of BRT was slightly worse than that of RF. GeoGAM performed poorly on some responses and was best for only 7 of 48 responses. Predictive precision of lasso was intermediate. All models generally had small bias. Only the computationally very efficient lasso had slightly larger bias, likely because it tended to under-fit the data. In summary, although differences were small, the frequencies of best and worst performance clearly favoured RF if a single method is applied and MA if multiple prediction models can be developed.
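
The comparison by frequency of best and worst performance can be sketched as follows. This is a hedged illustration only: the abstract does not name the performance statistic, so the use of RMSE on validation data is an assumption, and the function names and all data values are hypothetical.

```python
import math
from collections import Counter


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def tally_best_and_worst(observed_by_response, predictions_by_response):
    """Count how often each method has the lowest and the highest RMSE.

    Assumption: RMSE on independent validation data is the criterion.
    """
    best, worst = Counter(), Counter()
    for response, observed in observed_by_response.items():
        scores = {
            method: rmse(observed, predicted)
            for method, predicted in predictions_by_response[response].items()
        }
        best[min(scores, key=scores.get)] += 1
        worst[max(scores, key=scores.get)] += 1
    return best, worst


# Hypothetical validation data for two responses and two methods.
observed = {"pH": [6.0, 5.0], "clay": [20.0, 30.0]}
predictions = {
    "pH": {"lasso": [6.5, 5.5], "rf": [6.1, 5.1]},
    "clay": {"lasso": [21.0, 29.0], "rf": [24.0, 33.0]},
}

best, worst = tally_best_and_worst(observed, predictions)
```

Tallying ranks in this way ignores how large the differences are, which matches the abstract's caveat that the differences between methods were mostly small.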

Citation: Nussbaum, M., Spiess, K., Baltensweiler, A., Grob, U., Keller, A., Greiner, L., Schaepman, M. E., and Papritz, A.: Evaluation of digital soil mapping approaches with large sets of environmental covariates, SOIL Discuss., in review, 2017.
Madlene Nussbaum et al.


Total article views: 680 (including HTML, PDF, and XML)

HTML PDF XML Total Supplement BibTeX EndNote
473 194 13 680 19 6 9

Viewed (geographical distribution)

Thereof 679 with geography defined and 1 with unknown origin.

Latest update: 24 Sep 2017
Short summary
This manuscript presents an extensive evaluation of digital soil mapping (DSM) tools. Recently, large sets of environmental covariates, e.g. from analysis of terrain at multiple scales, have become more common for DSM. Many DSM studies, however, compared DSM methods using no more than 30 covariates or tested approaches on only a few responses. We built DSM models from 300–500 covariates using six approaches that are either popular in DSM or promising for large covariate sets.