Abstract
To meet the demand for coal–rock interface identification in intelligent mining and to overcome the limitations of manual interpretation, which is inefficient and inconsistent under noisy, real-time, and dynamic conditions, an intelligent recognition method based on ground-penetrating radar (GPR) is proposed that integrates Random Forest (RF), Particle Swarm Optimization (PSO), and Long Short-Term Memory (LSTM) networks. Using the borehole columnar section of the 15219 working face in Xinjing Mine, Yangquan, Shanxi as a reference, a three-layer forward model (sandstone–coal–mudstone) with an inclined interface is constructed by the finite-difference time-domain (FDTD) method. The simulation results provide noise-free reference data for feature analysis and selection. For each trace of 1,024 sample points, 18 time- and frequency-domain features are extracted. RF is then applied to rank feature importance, and the 8 key features whose cumulative contribution exceeds 60% are retained: first-order difference, raw-signal amplitude, Hilbert envelope, mean wavelet coefficient, central frequency, signal mean, low-frequency energy, and spectral bandwidth. All features are standardized by Z-score normalization to eliminate dimensional effects. Subsequently, PSO is employed to automatically optimize the hyperparameters of the LSTM and, for comparison, those of a recurrent neural network (RNN) and a support vector machine (SVM). The optimal LSTM hyperparameters are 128 units, an initial learning rate of 3.3 × 10⁻³, and a dropout rate of 0.394. To suit the structure of each model, the input data for LSTM and RNN are organized as a tensor of shape (number of samples, 1,024 time steps, 8 features), enabling the models to capture dynamic characteristics and phase continuity in electromagnetic wave propagation through the medium. In contrast, SVM is trained on the corresponding two-dimensional feature matrix.
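The feature-selection and standardization steps described above can be illustrated with a minimal sketch. It assumes the importance scores have already been obtained from a fitted random forest; the function names, feature names, and numerical values below are hypothetical placeholders for illustration and are not the scores or code used in this study.

```python
from statistics import fmean, pstdev


def select_by_cumulative_importance(importances, threshold=0.60):
    """Keep top-ranked features until their cumulative share exceeds `threshold`.

    `importances` maps feature name -> non-negative importance score
    (for example, scores taken from a fitted random forest).
    """
    total = sum(importances.values())
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    selected, cumulative = [], 0.0
    for name, score in ranked:
        selected.append(name)
        cumulative += score / total
        if cumulative > threshold:
            break
    return selected


def zscore(values):
    """Standardize one feature column to zero mean and unit (population) variance."""
    mu = fmean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical importance scores, for illustration only.
example_importances = {
    "feature_a": 0.40,
    "feature_b": 0.25,
    "feature_c": 0.20,
    "feature_d": 0.15,
}
selected = select_by_cumulative_importance(example_importances)
standardized = zscore([2.0, 4.0, 6.0])
```

In this sketch the cumulative threshold plays the role of the 60% criterion, and the Z-score is computed per feature column so that features with different physical units become comparable before model training.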
Finally, the method is validated on measured GPR data collected from the lower extraction roadway of the 15219 working face. On the same test set, the RNN model outperforms the LSTM model on the quantitative metrics. To investigate this discrepancy, the LSTM predictions are analyzed further. The LSTM-predicted roof interface of the coal seam contains discontinuous segments, sharp undulations, and other anomalies that are inconsistent with the original labels. These anomalies coincide with the influence zones of geological structures, such as collapse columns and watered floor sections, identified in the geological survey data of the mining area. The underlying mechanism is attributed to the gating architecture of LSTM, which enables deep extraction of information from both local and global characteristics, and from both detailed and structural signal features. Consequently, the LSTM predictions reflect the actual geological conditions more accurately and show that the model can go beyond the manual labels it was trained on. In comparison, the RNN model tends to overfit the training labels, while SVM struggles with high-dimensional time-series features and produces fragmented recognition outputs. Overall, the findings indicate that the LSTM model offers distinct advantages in analyzing the time-series characteristics of GPR data. It not only learns from manual labels to recognize the coal–rock interface, but also uncovers latent physical patterns in the GPR signals, enabling more comprehensive detection of the subtle signal variations caused by geological structures, variations that manual interpretation often misses.