1.School of Resource Engineering, Xi’an University of Architecture and Technology, Xi’an 710055, Shaanxi, China
2.Xi’an Key Laboratory of Intelligent Industrial Perception Computing and Decision-Making, Xi’an University of Architecture and Technology, Xi’an 710055, Shaanxi, China
3.School of Management, Xi’an University of Architecture and Technology, Xi’an 710055, Shaanxi, China
With the widespread adoption of unmanned driving technology in open-pit mines, obstacle detection for autonomous trucks operating in complex mining environments has become increasingly important. To address low accuracy in multi-scale target detection and inadequate feature fusion for small targets, an obstacle detection model based on MEBP-YOLOv10 is proposed for the forward path of mining trucks. First, to improve the model’s feature extraction capability at low cost, some C2f modules are replaced with the C2f-MSC module, which draws on GhostNet to produce better feature maps with fewer parameters. Second, the ECA mechanism is integrated into the backbone network; by assigning a different weight to each channel, it captures inter-channel relationships and strengthens the extraction of obstacle features while keeping computational cost low. Third, to improve small-target detection, the original PANet in YOLOv10 is replaced with BiFPN, whose bidirectional cross-scale connections and weighted feature fusion improve the model’s ability to fuse features of small obstacle targets. BiFPN also uses a residual structure, adding connections between the original input and output nodes of the same layer to preserve feature information. Finally, the PIoU (Powerful-IoU) loss function replaces the bounding box regression loss in YOLOv10, overcoming the ineffective penalty terms of the original loss and thereby improving the model’s convergence rate and obstacle detection accuracy. Experimental results show that the algorithm achieves an average detection accuracy of 89.8%, a recall of 79.8%, and a mean Average Precision (mAP) of 83.5%. Compared with the original YOLOv10 model, accuracy and mAP are improved by 4.7% and 4.2%, respectively.
Moreover, the model surpasses current mainstream object detection networks in terms of accuracy, recall rate, and other performance metrics. Additionally, the detection speed of this model reaches 103.4 frames per second (FPS), satisfying the real-time requirements for obstacle detection on mining truck paths. The model’s size is merely 6.87 MB, rendering it suitable for deployment on edge devices. Therefore, MEBP-YOLOv10 enables real-time and accurate obstacle detection on mining truck paths in mountainous mining areas, ensuring safe driving for unmanned trucks.
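Researchers in China and abroad have studied obstacle detection on the haul roads of unmanned open-pit mining trucks extensively, using sensing equipment that falls into two categories: LiDAR-based detection and vision-based detection. A LiDAR emits laser beams at an obstacle to obtain point cloud data carrying depth information about the obstacle's surface; point cloud preprocessing, point cloud clustering, bounding box generation, and matching and tracking then yield accurate obstacle detection (Zhang Jing, 2022; Gao Mingyu et al., 2025). Although LiDAR detects well, it is costly and sensitive to environmental factors such as dust, rain, and fog, which limits both its detection performance and the scenarios in which it can be used (Li Shaobo et al., 2025). With the rapid development of computer vision and deep learning, and given that cameras are inexpensive and adaptable, obstacle detection methods that combine vision with deep learning have become a research focus. Unlike LiDAR, these methods first capture images of obstacles with a camera and then train a neural network to predict, by regression, the category of each obstacle in the image. Current mainstream methods include YOLO (Zhong et al., 2024), R-CNN (He et al., 2017), and DETR (Carion et al., 2020). Despite its clear advantages, vision-based detection faces severe challenges in complex mine environments, and researchers have optimized object detection frameworks at several levels (Lu Caiwu et al., 2020; Bai Junqing et al., 2022). At the data level, to address low image quality and dark shooting environments, Lei Yang et al. (2025) integrated a self-calibrated illumination network into the YOLOv8 architecture; image quality is improved step by step during training, and obstacles in low-quality images are detected successfully. At the algorithm level, an object detection network works in three steps (feature extraction, feature fusion, and regression prediction), so researchers have improved the network structure, individual modules, and the loss function (Ruan Shunling et al., 2022; Chen Liang, 2025). Although obstacle detection in open-pit mines has achieved good results, several problems remain: current models distinguish only a few obstacle categories and cannot cover all obstacles found on open-pit mine roads; they struggle to recognize distant, small obstacles; and high-accuracy models are expensive to run and demanding of computing hardware, which makes them difficult to deploy on edge devices.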
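To cope with the complex environment of open-pit mines, YOLOv10 is improved in four respects, and an obstacle detection model based on MEBP-YOLOv10 (YOLOv10 Enhanced with Multi-scale Convolution, ECA, BiFPN, and PIoU) is proposed. First, the C2f modules in the backbone network are replaced with C2f-MSC (C2f-Multi-Scale Conv) modules. Next, the Efficient Channel Attention (ECA) mechanism (Wang et al., 2020) is added to the backbone network, and the PANet (Liu et al., 2018) in the feature fusion module is replaced with BiFPN (Tan et al., 2020). Finally, the bounding box regression loss is changed from CIoU to PIoU (Liu et al., 2024), which converges faster and is more accurate. The MEBP-YOLOv10 framework is shown in Figure 1.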
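Two of the four modifications reduce to short formulas and can be sketched in plain Python. The sketch assumes the fast normalized fusion rule from the EfficientDet paper for the weighted fusion in BiFPN, and the first-version PIoU loss as it is commonly stated for Liu et al. (2024): the IoU loss plus a penalty 1 - exp(-P^2), where P averages the four edge offsets normalized by the width and height of the target box. The function names, the `(x1, y1, x2, y2)` box format, and the `eps` default are illustrative assumptions and are not taken from the authors' implementation.

```python
import math


def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse equally sized feature vectors with non-negative weights.

    Assumed rule (EfficientDet): O = sum_i(w_i * I_i) / (eps + sum_j w_j),
    where each w_i is first clamped to be >= 0 (ReLU).
    """
    clamped = [max(0.0, w) for w in weights]
    total = eps + sum(clamped)
    length = len(features[0])
    return [
        sum(w * feat[k] for w, feat in zip(clamped, features)) / total
        for k in range(length)
    ]


def piou_loss(pred, target):
    """PIoU (v1) loss sketch for two boxes given as (x1, y1, x2, y2).

    Assumed form: L = (1 - IoU) + (1 - exp(-P^2)), with
    P = (dw1 + dw2) / (4 * w_gt) + (dh1 + dh2) / (4 * h_gt),
    where dw*, dh* are absolute offsets between matching box edges.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection over union of the two boxes.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / union if union > 0 else 0.0

    # Penalty factor that adapts to the size of the target box.
    t_w, t_h = tx2 - tx1, ty2 - ty1
    p = (
        abs(px1 - tx1) / t_w + abs(px2 - tx2) / t_w
        + abs(py1 - ty1) / t_h + abs(py2 - ty2) / t_h
    ) / 4.0

    return (1.0 - iou) + (1.0 - math.exp(-p * p))
```

Because the penalty is normalized by the target box only, enlarging the predicted box does not shrink it, which is the property usually cited to explain the faster convergence of PIoU. A perfectly matched box gives a loss of 0, and the loss is bounded above by 2.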
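The proposed obstacle detection model is compared experimentally with current mainstream object detection models; the results are listed in Table 3. To speed up training convergence, all models are trained by transfer learning. The YOLO-series models start from weights pre-trained on the COCO dataset, with 640×640 input images and 300 training epochs; YOLOv8 (Quan et al., 2023) uses the n model, and YOLOv9 (Bakirci et al., 2024) uses the original model. EfficientDet is transferred from the ImageNet dataset, with 512×512 input images and 100 training epochs. Each model performs obstacle detection with the best weight file obtained during training, and some of the detection results are shown in Figure 7.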
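As Table 5 shows, EMA (Efficient Multi-Scale Attention) (Ouyang et al., 2023) introduces an exponential moving average that accumulates attention information from previous time steps, so the model pays more attention to historical context; however, the historical information must be maintained and updated, which increases computational overhead. CBAM (Convolutional Block Attention Module) (Woo et al., 2018) weights the channel and spatial dimensions separately, which lets the network capture important features more accurately and improves overall model performance. Compared with the other mainstream attention mechanisms, ECA markedly improves detection accuracy without increasing model complexity. Heat maps for the different attention mechanisms, generated with Grad-CAM (Selvaraju et al., 2020), are shown in Figure 9.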
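The low complexity of ECA can be made concrete with a minimal plain-Python sketch of the mechanism described by Wang et al. (2020): global average pooling per channel, a 1D convolution with a small odd kernel across neighbouring channels, and a sigmoid gate that rescales each channel. The nested-list layout, the function names, and the defaults `gamma=2` and `b=1` for the adaptive kernel size are assumptions for illustration. In a real network the kernel weights are learned, and this sketch is not the authors' code.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive odd kernel size k = |log2(C)/gamma + b/gamma|_odd."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1


def eca(feature_map, conv_weights):
    """Apply ECA-style channel attention.

    feature_map: list of C channels, each an H x W nested list of floats.
    conv_weights: odd-length 1D kernel shared by all channels
                  (hypothetical values here; learned in practice).
    Returns the rescaled feature map and the per-channel gates.
    """
    num_channels = len(feature_map)

    # Squeeze: global average pooling gives one descriptor per channel.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]

    # Local cross-channel interaction: zero-padded 1D convolution.
    k = len(conv_weights)
    pad = k // 2
    gates = []
    for i in range(num_channels):
        s = 0.0
        for j in range(k):
            idx = i + j - pad
            if 0 <= idx < num_channels:
                s += conv_weights[j] * pooled[idx]
        gates.append(1.0 / (1.0 + math.exp(-s)))  # sigmoid gate in (0, 1)

    # Excite: rescale every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates
```

Under these assumptions, a stage with 256 channels gets a kernel size of 5, so the attention layer contributes only five weights regardless of the spatial size of the feature map.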
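As Figure 9 shows, SimAM (Simple Attention Module) (Yang et al., 2021) introduces a similarity measure when computing attention and adjusts the attention allocation according to the similarity between features. This makes it susceptible to noise and irrelevant information, so the features in its heat map are too scattered and do not concentrate on the target to be detected. EMA, by contrast, concentrates too narrowly on the target and loses the surrounding feature information. ECA, CBAM, and SE (Squeeze-and-Excitation) (Tang et al., 2020) sample the target features effectively and give good results.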
Bakirci M, Dmytrovych P, Bayraktar I, et al, 2024. Challenges and advances in UAV-Based vehicle detection using YOLOv9 and YOLOv10[C]//7th IEEE International Conference on Actual Problems of Unmanned Aerial Vehicles Development. Ukraine: Institute of Electrical and Electronics Engineers Inc.
Carion N, Massa F, Synnaeve G, et al, 2020. End-to-end object detection with transformers[M]//Computer Vision-ECCV 2020. Cham: Springer International Publishing.
He K M, Gkioxari G, Dollár P, et al, 2017. Mask R-CNN[C]//2017 IEEE International Conference on Computer Vision (ICCV). Venice, Italy: IEEE.
Liu C, Wang K G, Li Q, et al, 2024. Powerful-IoU: More straightforward and faster bounding box regression loss with a nonmonotonic focusing mechanism[J]. Neural Networks, 170: 276-284.
Liu S, Qi L, Qin H F, et al, 2018. Path aggregation network for instance segmentation[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE.
Ouyang D L, He S, Zhang G Z, et al, 2023. Efficient multi-scale attention module with cross-spatial learning[C]//2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Rhodes Island, Greece: IEEE.
Quan Y, Wang P, Wang Y, et al, 2023. GUI-Based YOLOv8 license plate detection system design[C]//5th International Conference on Control and Robotics, ICCR 2023. Tokyo: IEEE.
Selvaraju R R, Cogswell M, Das A, et al, 2020. Grad-CAM: visual explanations from deep networks via gradient-based localization[J]. International Journal of Computer Vision, 128(2): 336-359.
Tan M X, Pang R M, Le Q V, 2020. EfficientDet: scalable and efficient object detection[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle: IEEE.
Tang Z, Liu X, Li Y, et al, 2020. Multi-atlas brain parcellation using squeeze-and-excitation fully convolutional networks[J]. IEEE Transactions on Image Processing, 29: 6864-6872.
Wang Q L, Wu B G, Zhu P F, et al, 2020. ECA-net: efficient channel attention for deep convolutional neural networks[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle: IEEE.
Yang L, Zhang R Y, Li L, et al, 2021. SimAM: a simple, parameter-free attention module for convolutional neural networks[C]//38th International Conference on Machine Learning, ICML 2021. Virtual, Online: ML Research Press.
Zhong H, Chen Z, Shang Q, et al, 2024. YOLOv10 improvement approach based on ros robot visual obstacle avoidance system[C]//9th International Conference on Cyber Security and Information Engineering, ICCSIE 2024. Hybrid, Kuala Lumpur, Malaysia: Association for Computing Machinery.
Gao Mingyu, Bao Jiusheng, Yin Yan, et al, 2025. Detour-Straddle 3D-like path planning of unmanned mining truck in open pit mines based on optimized ant colony algorithm[J]. Coal Science and Technology, 53():399-411.
Lei Yang, He Jiang, Qin Lijie, et al, 2025. Low light target detection technology for underground unmanned electric vehicles based on SCI-YOLOv8[J]. Metal Mine, 54(2): 172-179.
Gu Qinghua, Ruan Shunling, et al, 2025. A visual multi-task perception method for unmanned vehicles in open pit mines incorporating wavelet transforms[J/OL]. Journal of China Coal Society: 1-13. (2025-03-18).
Ruan Shunling, Dong Lijuan, Lu Caiwu, et al, 2022. Research on infrared obstacle detection of mine roadway based on RCR_YOLOv4[J]. Gold Science and Technology, 30(4): 603-611.
Wang Guofa, Wang Hong, Ren Huaiwei, et al, 2018. 2025 scenarios and development path of intelligent coal mine[J]. Journal of China Coal Society, 43(2): 295-305.
Zhang Jing, 2022. Research on obstacle detection methods for open-pit mines based on multi-sensor fusion[D]. Changsha: Hunan University.