Abstract:
Coal-gangue sorting robots play a vital role in intelligent coal mining, and accurate, efficient coal-gangue recognition is their core technology. However, traditional recognition methods often struggle under complex conditions such as high noise, motion blur, and low illumination, which reduces both accuracy and efficiency. To address these limitations, we propose SegFormer-CG, a coal-gangue recognition method that improves both real-time performance and recognition accuracy. The model adopts the Transformer-based SegFormer framework and uses the lightweight MiT-B0 as the encoder to extract multi-scale features. The decoder integrates several enhancement modules. Bottleneck modules are introduced after the C1, C2, and C3 feature maps to improve feature extraction; these modules are further optimized with Depthwise Separable Convolution (DSConv) and Omni-Dimensional Dynamic Convolution (ODConv) to reduce parameter count and computational cost. An Atrous Spatial Pyramid Pooling (ASPP) module, improved with DSConv and 5×5 convolutions, is added to the C4 feature map to enhance multi-scale feature fusion. Additionally, Criss-Cross Attention (CCA) modules are applied to the C3 and C4 feature maps so that the model can focus on critical spatial information. A two-stage transfer learning strategy is employed: the encoder is first frozen for 50 epochs to adapt features, and the entire network is then fine-tuned to enhance generalization. Experimental results demonstrate that SegFormer-CG achieves 96.39% precision, 96.29% recall, and 93.03%
mIoU, with improvements of 1.32%, 0.59%, and 1.73% over the baseline model, respectively. It maintains a lightweight structure with 5.14×10⁶ parameters, 5.90×10⁹ FLOPs, and a high inference speed of 50.92 FPS. Compared with classical models such as PSPNet, DeepLabV3+, and UNet, SegFormer-CG achieves superior performance in both accuracy and efficiency. Furthermore, the model shows strong robustness and generalization under challenging conditions, making it a reliable technical solution for practical coal-gangue sorting.
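
To illustrate why replacing standard convolutions with DSConv can reduce the parameter count, the following minimal sketch compares the weight counts of a standard k×k convolution and a depthwise separable one. The channel counts are hypothetical values chosen only for illustration; they are not taken from SegFormer-CG, and the function names are our own.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def dsconv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise separable convolution (bias omitted).

    One k x k depthwise filter per input channel, followed by a
    1 x 1 pointwise convolution that mixes channels.
    """
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer sizes (assumption, not from the paper).
c_in, c_out, k = 256, 256, 5

standard = conv_params(c_in, c_out, k)
separable = dsconv_params(c_in, c_out, k)

# The ratio simplifies to 1/c_out + 1/k**2, so larger kernels
# benefit more from the separable factorization.
ratio = separable / standard
print(f"standard: {standard}, separable: {separable}, ratio: {ratio:.4f}")
```

Under these assumed sizes, the separable layer needs roughly 4% of the weights of the standard layer, which suggests how a 5×5 kernel could be used at modest parameter cost.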