To address the low efficiency, visual blind spots, and high cost of traditional manual security patrols, a security patrol system based on the DiMP (discriminative model prediction) algorithm was designed. The system adopted a modular design and implemented autonomous flight control and target tracking for an unmanned aerial vehicle (UAV) on an embedded onboard computer. To improve the precision and accuracy of small-target tracking during patrol, a multi-scale feature fusion strategy was used to improve the DiMP tracking algorithm. This strategy fused image pyramid features at different scales with the backbone network features, supplying the backbone network with information-rich fused features. The optimized DiMP algorithm achieved a 2.6% increase in tracking success rate and a 3.4% increase in precision on the UAV123 dataset, and reached a tracking speed of 38 fps on the VOT2018 dataset. Finally, the effectiveness of the UAV security patrol was verified in an outdoor environment. The results show that the improved tracking algorithm can run in real time on the UAV and track the target stably over long periods.
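The multi-scale fusion idea described above can be illustrated with a small sketch. It is not the system's implementation: the 2x2 average-pooling pyramid, the weighted element-wise sum used as the fusion rule, and every name below (`avg_pool2x`, `build_pyramid`, `fuse`, `weight`) are assumptions chosen for illustration, and constant grids stand in for learned backbone feature maps.

```python
# Minimal sketch of fusing image-pyramid levels with backbone feature maps.
# Pure Python; all function names and the fusion rule are hypothetical.

def avg_pool2x(img):
    """Downsample a 2-D grid by averaging non-overlapping 2x2 blocks."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * r][2 * c] + img[2 * r][2 * c + 1]
              + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(w)] for r in range(h)]

def build_pyramid(img, levels):
    """Return a list [full, 1/2, 1/4, ...] containing `levels` grids."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(avg_pool2x(pyramid[-1]))
    return pyramid

def fuse(backbone_feat, pyramid_feat, weight=0.5):
    """Fuse two same-sized maps by a weighted element-wise sum (assumed rule)."""
    assert len(backbone_feat) == len(pyramid_feat)
    assert len(backbone_feat[0]) == len(pyramid_feat[0])
    return [[b + weight * p for b, p in zip(brow, prow)]
            for brow, prow in zip(backbone_feat, pyramid_feat)]

# Toy 8x8 "image"; pyramid levels are 8x8, 4x4 and 2x2.
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
pyramid = build_pyramid(image, levels=3)

# Stand-in backbone maps at strides 2 and 4 (constant grids, not real features).
backbone = {1: [[1.0] * 4 for _ in range(4)], 2: [[1.0] * 2 for _ in range(2)]}

# Each backbone stage is fused with the pyramid level of matching resolution.
fused = {lvl: fuse(backbone[lvl], pyramid[lvl]) for lvl in backbone}
```

In a real tracker the pyramid levels would typically pass through learned convolution layers before fusion, and concatenation followed by a 1x1 convolution is a common alternative to the weighted sum used here.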
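However, as patrol tasks grow more complex, the demands on UAV autonomy rise as well. Many current systems still rely on remote control and lack mature autonomous patrol technology. Liang et al. [7] relieved the onboard computer of the computational load of target tracking by processing the data at a ground station, but this approach introduces transmission delay, which degrades the system's real-time performance. As the computing power of onboard embedded systems has grown, UAVs with integrated computer vision have begun to attract researchers' attention. Examples include embedded implementations of the kernelized correlation filters (KCF) algorithm [8] and of efficient convolution operators (ECO) [9]. Çintaş et al. [10] proposed an embedded-system scheme based on the YOLO (you only look once) [11] object detection algorithm and the KCF tracking algorithm, which achieves automatic target detection and tracking during UAV flight; however, these methods still leave room for improvement in precision and accuracy. With the emergence of modular onboard computers designed specifically for deep learning, autonomous UAV security patrol has become feasible.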
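In a security patrol UAV system, accurate and stable target tracking is key to completing the task. Traditional correlation-filter trackers lack robust feature representations, so their tracking results are not accurate enough. In recent years, tracking algorithms based on the Siamese network (SN) [12] have attracted wide attention because they strike a satisfactory balance between performance and efficiency in visual tracking. Their tracking performance degrades, however, when distractors are present or the target's appearance changes. To overcome this shortcoming, Bhat et al. [13] proposed the DiMP (discriminative model prediction) algorithm, which introduces a discriminative tracking architecture that fully exploits the appearance information of both target and background, handling appearance changes effectively while greatly improving tracking performance. DiMP also optimizes its model with the steepest gradient descent method, which overcomes computational resource constraints and makes deep-learning tracking practical on UAV platforms. Given DiMP's advantages in speed and accuracy, improved variants continue to appear. Danelljan et al. [14] proposed PrDiMP, which improves the discriminative ability of the target classifier from the perspective of a probabilistic regression formulation and achieves good results in complex scenes and under target occlusion. However, the target classifier in DiMP-based trackers uses only a single layer of features from the pretrained backbone network, which limits its ability to distinguish the target and makes it poorly suited to extracting fine image detail. To address this problem, Wang et al. [15] brought the Transformer architecture to the tracking field and proposed TrDiMP, which combines information across frames to obtain richer semantic information and thereby improves tracking accuracy. However, the increased training and inference time of TrDiMP makes it difficult to run in real time on an embedded onboard computer, which is a significant drawback for autonomous UAV tracking tasks.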
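To make the steepest-descent optimization mentioned above concrete, the following sketch applies steepest descent with an exact step length to a small ridge-regression problem. This is a simplified stand-in, not DiMP's optimizer: DiMP updates convolutional filter weights under a learned discriminative loss, whereas here the objective is the assumed quadratic L(w) = Σ(xᵢ·w − yᵢ)² + λ‖w‖², for which the optimal step along the gradient g is α = gᵀg / gᵀHg with H = 2XᵀX + 2λI. All names and the toy data are hypothetical.

```python
# Minimal sketch: steepest descent with exact line search on ridge regression.
# Stand-in objective only; function names and data are hypothetical.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def loss(X, y, w, lam):
    """Sum of squared residuals plus an L2 penalty on the weights."""
    return sum((dot(x, w) - t) ** 2 for x, t in zip(X, y)) + lam * dot(w, w)

def steepest_descent_step(X, y, w, lam):
    """One update w <- w - alpha * g with the step length optimal for a quadratic."""
    residuals = [dot(x, w) - t for x, t in zip(X, y)]
    # Gradient: g = 2 X^T r + 2 lam w
    g = [2.0 * sum(r * x[j] for r, x in zip(residuals, X)) + 2.0 * lam * w[j]
         for j in range(len(w))]
    gg = dot(g, g)
    if gg == 0.0:
        return w  # already at the minimum
    # Curvature along g: g^T H g = 2 ||X g||^2 + 2 lam ||g||^2
    Xg = [dot(x, g) for x in X]
    gHg = 2.0 * dot(Xg, Xg) + 2.0 * lam * gg
    alpha = gg / gHg
    return [wj - alpha * gj for wj, gj in zip(w, g)]

# Toy data: three samples, two weights.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 3.0]
w = [0.0, 0.0]
lam = 0.1

history = [loss(X, y, w, lam)]
for _ in range(5):
    w = steepest_descent_step(X, y, w, lam)
    history.append(loss(X, y, w, lam))
```

Because the step length is computed from the local curvature rather than fixed by hand, each iteration reduces the loss as far as possible along the gradient direction, which is why only a few iterations are needed; this is the property that keeps the per-frame cost low.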
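To show more intuitively how the improved tracking algorithm performs on small targets, Fig. 6 compares the tracking results of the proposed method with those of DiMP50 and accurate tracking by overlap maximization (ATOM) [18], and analyzes three challenging sequences from the UAV123 dataset: bike3, bird1, and group2.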
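A visual comparison of this kind is usually paired with per-sequence numbers. The sketch below shows one common way to score a tracker's boxes against ground truth, assuming boxes in (x, y, w, h) format, a success criterion of overlap (IoU) above 0.5, and a precision criterion of centre error within 20 pixels. These thresholds and all function names are assumptions for illustration; benchmark toolkits often report the area under the success curve over many thresholds instead of a single one, and the boxes below are made up rather than taken from UAV123.

```python
# Minimal sketch of overlap-based success and distance-based precision.
# Thresholds, names, and example boxes are hypothetical.
import math

def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def center_error(a, b):
    """Euclidean distance between the two box centres, in pixels."""
    ax, ay = a[0] + a[2] / 2.0, a[1] + a[3] / 2.0
    bx, by = b[0] + b[2] / 2.0, b[1] + b[3] / 2.0
    return math.hypot(ax - bx, ay - by)

def success_rate(preds, gts, thr=0.5):
    """Fraction of frames whose IoU with ground truth exceeds `thr`."""
    return sum(iou(p, g) > thr for p, g in zip(preds, gts)) / len(gts)

def precision(preds, gts, thr=20.0):
    """Fraction of frames whose centre error is at most `thr` pixels."""
    return sum(center_error(p, g) <= thr for p, g in zip(preds, gts)) / len(gts)

# Three made-up frames: exact hit, small drift, complete loss of the target.
gts = [(10, 10, 20, 20), (12, 10, 20, 20), (50, 50, 20, 20)]
preds = [(10, 10, 20, 20), (14, 12, 20, 20), (100, 100, 20, 20)]

sr = success_rate(preds, gts)
pr = precision(preds, gts)
```

Small targets are penalized heavily by the overlap criterion, since a drift of a few pixels removes a large share of the intersection, which is one reason sequences with small targets are treated as challenging.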
Fan J W, Yang X G, Lu R T, et al. Design and implementation of intelligent inspection and alarm flight system for epidemic prevention[J]. Drones, 2021, 5(3): 68.
[7]
Liang X, Zhao S R, Chen G D, et al. Design and development of ground station for UAV/UGV heterogeneous collaborative system[J]. Ain Shams Engineering Journal, 2021, 12(4): 3879-3889.
Danelljan M, Bhat G, Khan F S, et al. ECO: efficient convolution operators for tracking[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu: IEEE, 2017: 6931-6939.
[10]
Çintaş E, Özyer B, Şimşek E. Vision-based moving UAV tracking by another UAV on low-cost hardware and a new ground control station[J]. IEEE Access, 2020, 8: 194601-194611.
[11]
Inui A, Mifune Y, Nishimoto H, et al. Detection of elbow OCD in the ultrasound image by artificial intelligence using YOLOv8[J]. Applied Sciences, 2023, 13(13): 7623.
[12]
Fu K R, Fan D P, Ji G P, et al. Siamese network for RGB-D salient object detection and beyond[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(9): 5541-5559.
[13]
Bhat G, Danelljan M, Van Gool L, et al. Learning discriminative model prediction for tracking[C]//2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul: IEEE, 2019: 6181-6190.
[14]
Danelljan M, Van Gool L, Timofte R. Probabilistic regression for visual tracking[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle: IEEE, 2020: 7181-7190.
[15]
Wang N, Zhou W G, Wang J, et al. Transformer meets tracker: exploiting temporal context for robust visual tracking[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville: IEEE, 2021: 1571-1580.
[16]
Pratama Y, Ginting L M, Laurencia E H, et al. Face recognition for presence system by using residual networks-50 architecture[J]. International Journal of Electrical and Computer Engineering, 2021, 11(6): 5488.
[17]
Liu Z M, Gao G Y, Sun L, et al. IPG-Net: image pyramid guidance network for small object detection[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Seattle: IEEE, 2020: 4422-4430.
[18]
Danelljan M, Bhat G, Khan F S, et al. ATOM: accurate tracking by overlap maximization[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach: IEEE, 2019: 4655-4664.