To address the challenge of detecting small, distant pedestrians and vehicles in real time in images captured by roadside cameras on expressways, an improved object detection algorithm, YOLOv5s-3S-4PDH, is proposed. First, the ShuffleNetV2-Stem-SPPF network structure is used to increase the running speed of the algorithm. Second, accelerated normalized weighted feature map fusion and a 160×160 small object detection layer are introduced to improve small object detection performance. Third, an improved decoupled head is introduced to improve the localization and classification accuracy of small object detection. Finally, Focal EIoU is used as the localization loss function to accelerate training convergence. The results show that, compared with YOLOv5s on the self-built pedestrian and vehicle dataset, the proposed algorithm reduces the computational cost and the number of parameters by 10.1% and 24.6%, respectively, and increases the detection speed and accuracy by 15.4% and 2.1%, respectively. A transfer learning experiment on the VisDrone2019 dataset shows that the proposed algorithm achieves better average precision in all categories. The proposed algorithm therefore not only meets the real-time and accuracy requirements of small object detection, but also shows generalization ability.
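To make the localization loss mentioned above concrete, the following is a minimal sketch of a Focal EIoU loss for a single pair of boxes, following the commonly cited formulation (EIoU = 1 − IoU + center-distance term + width term + height term, weighted by IoU^γ). It is not the authors' implementation. The function name `focal_eiou_loss`, the corner box format `(x1, y1, x2, y2)`, the value γ = 0.5, and the small constant `eps` are assumptions made for illustration; the abstract does not specify them.

```python
def focal_eiou_loss(pred, target, gamma=0.5, eps=1e-9):
    """Focal EIoU loss for one predicted box and one ground-truth box.

    Assumption: boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    Assumption: gamma=0.5 is an illustrative choice, not taken from the abstract.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union of the two boxes.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between box centers, normalized by the
    # squared diagonal of the enclosing box.
    dx = (px1 + px2) / 2.0 - (tx1 + tx2) / 2.0
    dy = (py1 + py2) / 2.0 - (ty1 + ty2) / 2.0
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, normalized by the enclosing box size.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    eiou = 1.0 - iou + center_term + width_term + height_term

    # Focal weighting: boxes with higher IoU contribute more to the loss.
    return (iou ** gamma) * eiou


if __name__ == "__main__":
    print(focal_eiou_loss((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)))
```

Note that with this weighting, a prediction that does not overlap the ground truth at all has IoU = 0 and therefore a loss of zero, so in practice such a loss is typically combined with a label assignment step that supplies overlapping candidates.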