Multi-scale attention-enhanced thermal infrared imaging for obstacle depth estimation in open-pit mines

Abstract: Depth information is an essential component of environmental perception for unmanned mining trucks operating in the complex, harsh conditions of open-pit mines. Depth acquired with LiDAR or stereo camera systems is easily degraded by dust, rain, fog, and complex lighting. This paper therefore proposes an obstacle depth estimation method based on thermal infrared images. Sparse depth maps are first converted into dense depth maps by depth completion and used as supervision labels; monocular depth estimation is then performed on thermal infrared images, improving the trucks' ability to acquire depth information in harsh environments. The proposed network combines Laplacian Pyramid Depth Residuals with hierarchical connections (HC), which significantly improves depth estimation accuracy. The structure uses multi-scale depth information to capture detail features at different levels, and the hierarchical connections strengthen detail recovery so that the model better handles subtle variations in complex scenes. To further improve performance, a content-guided attention (CGA) fusion module, CGA-Driven Cross-layer Interaction (CGA-DCI), is introduced; it improves the fusion of high-frequency and low-frequency information, avoids the information loss caused by simple concatenation, and increases the accuracy and robustness of depth estimation. In addition, Detail-Enhanced Attention Blocks (DEAB) strengthen the extraction of image details, allowing the model to cope with dust, rain, fog, and complex lighting in open-pit mines and to maintain high depth estimation accuracy under harsh conditions. Experimental results show that the model combining Laplacian Pyramid Depth Residuals and hierarchical connections with the CGA-DCI and DEAB modules achieves a $\delta_{1.25}$ of 83.3% and an AbsRel of 15.2% on the mining truck depth estimation task. This improves on mainstream depth estimation methods, and the model can provide around-the-clock depth information for unmanned mining trucks in complex, harsh environments.
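For readers unfamiliar with the two reported metrics, the following minimal sketch shows how a threshold accuracy at 1.25 and AbsRel are commonly computed in the depth estimation literature. It assumes the paper follows these conventional definitions, which the abstract does not state explicitly. The function name `depth_metrics`, the flat-list input format, and the use of 0 to mark pixels without ground truth are illustrative assumptions, not the authors' code.

```python
def depth_metrics(pred, gt, threshold=1.25):
    """Return (threshold accuracy, AbsRel) over pixels with valid ground truth.

    Assumed conventional definitions (not confirmed by the abstract):
      threshold accuracy = share of pixels with max(pred/gt, gt/pred) < threshold
      AbsRel             = mean of |pred - gt| / gt
    Pixels whose ground-truth depth is not positive are skipped.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no pixels with valid ground-truth depth")

    hits = 0
    rel_error_sum = 0.0
    for p, g in pairs:
        # A non-positive prediction can never satisfy the ratio test.
        ratio = max(p / g, g / p) if p > 0 else float("inf")
        if ratio < threshold:
            hits += 1
        rel_error_sum += abs(p - g) / g

    return hits / len(pairs), rel_error_sum / len(pairs)


# Hypothetical depths in metres; 0.0 marks a pixel with no ground truth.
pred = [10.0, 20.0, 33.0, 8.0]
gt = [10.0, 26.0, 30.0, 0.0]
delta, abs_rel = depth_metrics(pred, gt)
print(f"threshold accuracy = {delta:.3f}, AbsRel = {abs_rel:.3f}")
```

Under these definitions a higher threshold accuracy and a lower AbsRel both indicate better depth estimates, which is how the 83.3% and 15.2% figures above would be read.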

     
