Estimating soil organic carbon stocks using different modelling techniques in the semi-arid rangelands of eastern Australia

Bin Wang, Cathy Waters, Susan Orgill, Annette Cowie, Anthony Clark, De Li Liu, Marja Simpson, Ian McGowen, Tim Sides

    Research output: Contribution to journal › Article › peer-review

    105 Citations (Scopus)


    Soil organic carbon (SOC) is pivotal for biological, chemical and physical processes and provides vital information on changes in soil fertility and land degradation. Rangelands, accounting for about 81% of Australian land area, are significant carbon (C) stores, and small increases in soil C sequestration over such a vast area represent a considerable climate change mitigation opportunity. Efficient modelling techniques to evaluate the potential to increase rangeland SOC stocks are vitally important to assess their role in the global carbon cycle and the quantum of abatement. This study aimed to evaluate boosted regression trees (BRT) and random forest (RF) models in predicting SOC stocks from available continuous remotely sensed variables using two feature selection techniques. Dominant variables that affect SOC stocks in the rangelands were also identified. Using field-based measurements of SOC stock collected from 564 data points across the study area and 28 GIS-based environmental variables including climate, topography, radiometry, vegetation and land fractional cover data, we employed stepwise regression (SR, linear approach) and a genetic algorithm (GA, nonlinear approach) to select the most informative variables. These selected predictors were then used to train the BRT and RF models. In all, four models were evaluated: BRT using SR (SR_BRT); RF using SR (SR_RF); BRT using GA (GA_BRT); and RF using GA (GA_RF). In addition, BRT using all predictors (All_BRT) and RF using all predictors (All_RF) were used as benchmarks to test the performance of the four models. Of the field-based data, 75% were used to train the models ("calibration dataset") and the remaining 25% were used to validate the prediction of SOC stocks ("validation dataset"). The results indicate that RF performed better than BRT in predicting SOC stocks regardless of input variables. In addition, we verified that feature selection is necessary for both machine learning techniques when estimating SOC stocks because it can increase accuracy and save time. GA_RF was the most reliable method to predict SOC stocks, with the lowest root mean square error (RMSE) and the highest R² value (7.44 Mg C ha⁻¹ and 0.48, respectively), suggesting that using GA_RF to generate a predictive model from measured data and remotely sensed variables may provide a cost-effective alternative to direct sampling for predicting SOC stocks in the semi-arid rangelands of eastern Australia. The important variables for explaining the observed SOC stocks were rainfall, elevation, Prescott index (PI, a measure of water balance), and land fractional cover (bare ground fraction). The approach proposed here can be extended to areas where field-observed data are scarce (e.g. rangelands) to produce more detailed information about SOC stocks. As such, the results of our study are of particular importance in Australian rangelands, providing a statistical and theoretical basis for producing digital SOC stock maps from readily available remotely sensed data, with potential for use in similar rangeland conditions internationally.
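The 75%/25% calibration and validation split and the two reported metrics (RMSE and R²) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the random seed, the placeholder site IDs and the three observed/predicted values are all hypothetical, and R² is computed here as the coefficient of determination, which is an assumption because the abstract does not state which definition was used.

```python
import math
import random


def split_calibration_validation(records, calibration_fraction=0.75, seed=0):
    """Shuffle the records and split them into calibration and validation sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_calibration = round(len(shuffled) * calibration_fraction)
    return shuffled[:n_calibration], shuffled[n_calibration:]


def rmse(observed, predicted):
    """Root mean square error, in the same units as the observations."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total (assumed definition)."""
    mean_observed = sum(observed) / len(observed)
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_residual / ss_total


# Hypothetical placeholder IDs standing in for the 564 field data points.
site_ids = list(range(564))
calibration, validation = split_calibration_validation(site_ids)
print(len(calibration), len(validation))

# Made-up SOC stocks (Mg C/ha) purely to exercise the metric functions.
observed = [20.0, 30.0, 40.0]
predicted = [22.0, 29.0, 37.0]
example_rmse = rmse(observed, predicted)
example_r2 = r_squared(observed, predicted)
print(example_rmse, example_r2)
```

In a workflow like the one described, the metrics would be computed on the validation set only, so that each of the six model variants is compared on data it was not trained on.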
    Original language: English
    Pages (from-to): 425-438
    Number of pages: 14
    Journal: Ecological Indicators
    Early online date: 02 Feb 2018
    Publication status: Published - May 2018


