TY - JOUR
T1 - Soil moisture, organic carbon, and nitrogen content prediction with hyperspectral data using regression models
AU - Datta, Dristi
AU - Paul, Manoranjan
AU - Murshed, Manzur
AU - Teng, Shyh Wei
AU - Schmidtke, Leigh
PY - 2022/10/20
Y1 - 2022/10/20
N2 - Soil moisture, soil organic carbon, and nitrogen content prediction are considered significant fields of study as they are directly related to plant health and food production. Direct estimation of these soil properties with traditional methods, for example, the oven-drying technique and chemical analysis, is a time and resource-consuming approach and can predict only smaller areas. With the significant development of remote sensing and hyperspectral (HS) imaging technologies, soil moisture, carbon, and nitrogen can be estimated over vast areas. This paper presents a generalized approach to predicting three different essential soil contents using a comprehensive study of various machine learning (ML) models by considering the dimensional reduction in feature spaces. In this study, we have used three popular benchmark HS datasets captured in Germany and Sweden. The efficacy of different ML algorithms is evaluated to predict soil content, and significant improvement is obtained when a specific range of bands is selected. The performance of ML models is further improved by applying principal component analysis (PCA), a dimensional reduction method that works with an unsupervised learning method. The effect of soil temperature on soil moisture prediction is evaluated in this study, and the results show that when the soil temperature is considered with the HS band, the soil moisture prediction accuracy does not improve. However, the combined effect of band selection and feature transformation using PCA significantly enhances the prediction accuracy for soil moisture, carbon, and nitrogen content. This study represents a comprehensive analysis of a wide range of established ML regression models using data preprocessing, effective band selection, and data dimension reduction and attempt to understand which feature combinations provide the best accuracy. The outcomes of several ML models are verified with validation techniques and the best- and worst-case scenarios in terms of soil content are noted. The proposed approach outperforms existing estimation techniques.
AB - Soil moisture, soil organic carbon, and nitrogen content prediction are considered significant fields of study as they are directly related to plant health and food production. Direct estimation of these soil properties with traditional methods, for example, the oven-drying technique and chemical analysis, is a time and resource-consuming approach and can predict only smaller areas. With the significant development of remote sensing and hyperspectral (HS) imaging technologies, soil moisture, carbon, and nitrogen can be estimated over vast areas. This paper presents a generalized approach to predicting three different essential soil contents using a comprehensive study of various machine learning (ML) models by considering the dimensional reduction in feature spaces. In this study, we have used three popular benchmark HS datasets captured in Germany and Sweden. The efficacy of different ML algorithms is evaluated to predict soil content, and significant improvement is obtained when a specific range of bands is selected. The performance of ML models is further improved by applying principal component analysis (PCA), a dimensional reduction method that works with an unsupervised learning method. The effect of soil temperature on soil moisture prediction is evaluated in this study, and the results show that when the soil temperature is considered with the HS band, the soil moisture prediction accuracy does not improve. However, the combined effect of band selection and feature transformation using PCA significantly enhances the prediction accuracy for soil moisture, carbon, and nitrogen content. This study represents a comprehensive analysis of a wide range of established ML regression models using data preprocessing, effective band selection, and data dimension reduction and attempt to understand which feature combinations provide the best accuracy. The outcomes of several ML models are verified with validation techniques and the best- and worst-case scenarios in terms of soil content are noted. The proposed approach outperforms existing estimation techniques.
KW - band selection
KW - k-fold cross validation
KW - LUCAS data
KW - machine learning
KW - principal component analysis
UR - http://www.scopus.com/inward/record.url?scp=85140933431&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85140933431&partnerID=8YFLogxK
U2 - 10.3390/s22207998
DO - 10.3390/s22207998
M3 - Article
C2 - 36298349
SN - 1424-8220
VL - 22
JO - Sensors (Switzerland)
JF - Sensors (Switzerland)
IS - 20
M1 - 7998
ER -