Effect of training sample size, sampling design and prediction model on soil mapping with proximal sensing data for precision liming
Precision Agriculture (IF 6.2), Pub Date: 2024-02-24, DOI: 10.1007/s11119-024-10122-3
Jonas Schmidinger, Ingmar Schröter, Eric Bönecke, Robin Gebbers, Joerg Ruehlmann, Eckart Kramer, Vera L. Mulder, Gerard B. M. Heuvelink, Sebastian Vogel

Abstract

Site-specific estimation of lime requirement requires high-resolution maps of soil organic carbon (SOC), clay and pH. These maps can be generated with digital soil mapping models fitted on covariates observed by proximal soil sensors. However, the quality of the derived maps depends on the applied methodology. We assessed the effects of (i) training sample size (5–100); (ii) sampling design (simple random sampling (SRS), conditioned Latin hypercube sampling (cLHS) and k-means sampling (KM)); and (iii) prediction model (multiple linear regression (MLR) and random forest (RF)) on the prediction performance for the three above-mentioned soil properties. The case study is based on conditional geostatistical simulations using 250 soil samples from a 51 ha field in Eastern Germany. Lin’s concordance correlation coefficient (CCC) and root-mean-square error (RMSE) were used to evaluate model performance. Results show that with increasing training sample size, relative improvements of RMSE and CCC decreased exponentially. The lowest median RMSE values were found with 100 training observations, i.e., 1.73%, 0.21% and 0.3 for clay, SOC and pH, respectively. However, even with a sample size of 10, models of moderate quality (CCC > 0.65) were obtained for all three soil properties. cLHS and KM performed significantly better than SRS. MLR showed lower median RMSE values than RF for SOC and pH at smaller sample sizes, but RF outperformed MLR if at least 25–30 or 75–100 soil samples were used for SOC or pH, respectively. For clay, the median RMSE was lower with RF, regardless of sample size.
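For readers unfamiliar with the two evaluation metrics named in the abstract, the following is a minimal sketch of how RMSE and Lin's CCC can be computed from paired observed and predicted values. It is not the authors' code: the function names `rmse` and `lins_ccc` and the example values are hypothetical, and the sketch assumes the population (1/n) form of variance and covariance used in Lin's original definition.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between paired observations and predictions."""
    n = len(observed)
    squared_errors = ((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(sum(squared_errors) / n)


def lins_ccc(observed, predicted):
    """Lin's concordance correlation coefficient.

    Assumption: population (1/n) variances and covariance are used,
    following Lin's original definition; some software uses 1/(n-1).
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    # The squared mean difference penalises systematic bias,
    # which plain Pearson correlation would ignore.
    return 2 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)


# Illustrative values only (not data from the study).
observed_ph = [5.8, 6.1, 6.4, 6.9, 7.2]
predicted_ph = [6.0, 6.0, 6.5, 6.7, 7.3]
print("RMSE:", rmse(observed_ph, predicted_ph))
print("CCC:", lins_ccc(observed_ph, predicted_ph))
```

Unlike RMSE, which is expressed in the units of the soil property, CCC is unitless and equals 1 only when predictions fall exactly on the 1:1 line, which is why the abstract can apply one threshold (CCC > 0.65) across clay, SOC and pH.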




Updated: 2024-02-24