Citation: NIE Xiaojun, ZHAO Xinghui, AMMARA Gill, et al. Inversion of coal-derived carbon content in mine soils based on hyperspectral index and machine learning[J]. Journal of China Coal Society, 2023, 48(7): 2869-2880. DOI: 10.13225/j.cnki.jccs.CN23.0329

Inversion of coal-derived carbon content in mine soils based on hyperspectral index and machine learning

Abstract: Coal particles, a typical form of organic matter with high carbon content, have diffused widely into soil environments through centuries of massive coal consumption. Even in small amounts, coal particles in soil can cause marked overestimation of soil organic carbon (SOC, i.e., plant-derived carbon) and thus increase the uncertainty of soil C sequestration assessments. However, effective methods for quantifying coal-derived C in soil are lacking. This study takes the Jiaozuo mining area in Henan as the study area, which has more than 100 years of anthracite mining history and where coal-contaminated soils are widespread. Local coal powder and coal-free soils were collected and mixed in known proportions to prepare soil samples with different coal-derived C contents, and the spectral characteristics of these mine soils were analyzed using hyperspectral remote sensing. Combining eight spectral mathematical transformations with two spectral feature screening methods (the traditional correlation coefficient method for spectral feature indices and comprehensive competitive adaptive reweighted sampling (CARS)), six models were built to invert coal-derived C content: three spectral feature index models (deviation of arch (IDOA), difference index (ID) and ratio index (IR)) and three machine learning models (partial least squares regression (PLSR), support vector machine (SVM) and random forest (RF)). The applicability of the optimal inversion model was also tested. In the wavelength range 350-2500 nm, the spectral curves of coal differ clearly from those of plant-derived organic matter and coal-free soils, and the spectral reflectance (R) of coal-contaminated soils decreases as coal-derived C content increases. These findings provide a theoretical basis for quantitatively inverting coal-derived C in mine soils with hyperspectral remote sensing.
The feature wavebands of coal-derived C extracted by CARS were distributed evenly across 350-2500 nm, and their number was far higher than that obtained with the traditional correlation coefficient method. After mathematical transformation of the raw R spectra, the accuracy with which the IDOA, ID and IR inversion models estimated coal-derived C content improved markedly; among them, the ID model based on the reciprocal (1/R) transformation performed best. Compared with the spectral feature index models, the machine learning models combined with CARS further improved the estimation accuracy. Among the three machine learning models, the 1/R-CARS-RF inversion model gave the highest accuracy, with $R_{\mathrm{m}}^{2}$ = 0.998, RMSE = 0.348 and RPD = 29.943 on the validation set. The applicability test showed that the 1/R-CARS-RF model performed well on coal-contaminated soils in the Jiaozuo mining area (RMSE = 1.88%, RPD = 4.97) and can estimate their coal-derived C content with reasonable accuracy. Hyperspectral remote sensing therefore shows promise for determining coal-derived C content in mine soils. This study also provides methodological support for distinguishing coal-derived C from plant-derived SOC in mine soils and for accurately assessing soil carbon sequestration.
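As an illustration of the quantities named in the abstract, the following is a minimal sketch, not the authors' implementation. It applies the 1/R transformation, forms a two-band difference index and ratio index, fits a simple linear model, and reports RMSE, R² and RPD. The spectra are synthetic, the band positions (600 nm and 2200 nm) are hypothetical, and RPD is assumed to follow the common chemometric definition (standard deviation of the observed values divided by RMSE). The CARS band selection and the PLSR, SVM and RF models are not reproduced here.

```python
import numpy as np

def reciprocal_transform(reflectance):
    """Apply the 1/R transform to a (samples x bands) reflectance array."""
    return 1.0 / np.asarray(reflectance, dtype=float)

def difference_index(spectra, i, j):
    """Two-band difference index: value at band i minus value at band j."""
    return spectra[:, i] - spectra[:, j]

def ratio_index(spectra, i, j):
    """Two-band ratio index: value at band i divided by value at band j."""
    return spectra[:, i] / spectra[:, j]

def rmse(y_true, y_pred):
    """Root mean square error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true, y_pred):
    """Coefficient of determination."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rpd(y_true, y_pred):
    """Assumed definition: sample SD of observed values divided by RMSE."""
    return float(np.std(np.asarray(y_true, float), ddof=1) / rmse(y_true, y_pred))

def demo():
    # Synthetic data only: reflectance falls as coal-derived C rises.
    rng = np.random.default_rng(0)
    carbon = np.linspace(0.0, 30.0, 40)           # hypothetical C content, %
    wavelengths = np.arange(350, 2501, 10)        # nm
    base = 0.2 + 0.3 * (wavelengths - 350) / 2150
    refl = base[None, :] * np.exp(-0.03 * carbon)[:, None]
    refl += rng.normal(0.0, 0.002, refl.shape)

    inv = reciprocal_transform(refl)
    i = int(np.argmin(np.abs(wavelengths - 600)))   # hypothetical band choice
    j = int(np.argmin(np.abs(wavelengths - 2200)))  # hypothetical band choice
    x = difference_index(inv, i, j)

    train, test = slice(0, None, 2), slice(1, None, 2)
    slope, intercept = np.polyfit(x[train], carbon[train], 1)
    pred = slope * x[test] + intercept
    print("RMSE:", rmse(carbon[test], pred))
    print("R2:  ", r2(carbon[test], pred))
    print("RPD: ", rpd(carbon[test], pred))

if __name__ == "__main__":
    demo()
```

Because RPD scales with the spread of the observed values, RPD figures from a calibration set with a wide range of carbon contents and from a narrower field test set are not directly comparable.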

     
