Forest roads are prone to defects such as cracks and potholes because of their natural environment and the heavy vehicle loads they carry, which results in poor road conditions and high maintenance costs. To address inaccurate bounding boxes in target detection, large scale variations of pavement distresses under UAV perspectives, and insufficient lighting, a bimodal asphalt pavement distress detection method (bimodal integrated road detection YOLOv8, BIRD-YOLOv8) was proposed. It employed an intermediate fusion strategy combining visible and infrared images. The DynaSpectra fusion module (DSFM), constructed by serially connecting adaptive fine-grained channel attention (FCAttention) and linear deformable convolution (LDConv), replaced the C2f structure in the backbone network of BIRD-YOLOv8, enhancing feature extraction for distress areas. Normalized Wasserstein distance loss (NWDLoss) was introduced to replace CIoU, strengthening the detection of small-scale distresses. Experimental results showed that the improved algorithm achieved an mAP of 83.3%, with AP values for transverse cracks, longitudinal cracks, alligator cracks, and potholes of 88%, 91.3%, 90.5%, and 63.5%, respectively, laying a foundation for identifying and maintaining pavement distresses on forest roads.
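The idea behind replacing CIoU with a normalized Wasserstein distance can be illustrated with a minimal sketch. It assumes the commonly used formulation in which each box (cx, cy, w, h) is modeled as a 2D Gaussian, so that the squared 2-Wasserstein distance reduces to a squared Euclidean distance between the vectors (cx, cy, w/2, h/2). The function names and the normalizing constant `constant=12.8` are illustrative assumptions, not values reported for BIRD-YOLOv8.

```python
import math


def box_to_gaussian(box):
    """Map a box (cx, cy, w, h) to the vector (cx, cy, w/2, h/2) of its 2D Gaussian."""
    cx, cy, w, h = box
    return (cx, cy, w / 2.0, h / 2.0)


def normalized_wasserstein_distance(box_a, box_b, constant=12.8):
    """Similarity in (0, 1]; `constant` is a dataset-dependent scale (assumed here)."""
    ga = box_to_gaussian(box_a)
    gb = box_to_gaussian(box_b)
    # 2-Wasserstein distance between the two axis-aligned Gaussians.
    w2 = math.sqrt(sum((a - b) ** 2 for a, b in zip(ga, gb)))
    return math.exp(-w2 / constant)


def nwd_loss(pred_box, target_box, constant=12.8):
    """Loss term that decreases as the predicted box approaches the target."""
    return 1.0 - normalized_wasserstein_distance(pred_box, target_box, constant)


# Two small, non-overlapping boxes: IoU is zero, yet the loss still varies
# smoothly with the distance between them.
small_pred = (10.0, 10.0, 4.0, 4.0)
small_target = (16.0, 10.0, 4.0, 4.0)
loss_far = nwd_loss(small_pred, small_target)
loss_same = nwd_loss(small_target, small_target)
```

Because the similarity depends on center offset and size difference rather than on overlap area, a small distress whose predicted box barely misses the ground truth still receives a useful gradient, which is the usual motivation for this kind of loss on small targets.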
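In the global average pooling step, $C$, $H$, and $W$ are the number of channels, the height, and the width, respectively; $U$ is the global channel information vector; $\mathrm{GAP}(\cdot)$ denotes the global average pooling operation; $n$ is the channel index; and $i$ and $j$ are the row and column indices, respectively.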
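To strengthen local channel information modeling while keeping the parameter count low, a band matrix $B$ is introduced for local channel interaction, and the local information is computed from the channel features $U_{lc}$ obtained after this interaction:

$$B = [b_1, b_2, \ldots, b_k]$$

$$U_{lc} = \sum_{i=1}^{k} U \cdot b_i$$

where $k$ is the number of adjacent channels that interact; $U$ is the global channel information descriptor; $b_i$ is the inter-channel interaction weight matrix; and $U_{lc}$ is the channel feature after local channel interaction.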
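The pooling and band-matrix steps described above can be sketched in plain Python. This is a minimal illustration under stated assumptions: the feature map is a nested list of shape [C][H][W], the band matrix shares the same k weights along its diagonals (equivalent to a 1D convolution over the channel axis with zero padding), and the helper names and example weights are hypothetical.

```python
def global_average_pool(feature):
    """Average each channel of a [C][H][W] nested list into one value per channel."""
    pooled = []
    for channel in feature:
        height = len(channel)
        width = len(channel[0])
        total = sum(sum(row) for row in channel)
        pooled.append(total / (height * width))
    return pooled


def band_matrix(num_channels, weights):
    """Build a C x C band matrix whose k central diagonals hold the shared weights."""
    half = len(weights) // 2
    matrix = [[0.0] * num_channels for _ in range(num_channels)]
    for row in range(num_channels):
        for offset, weight in enumerate(weights):
            col = row + offset - half
            if 0 <= col < num_channels:
                matrix[row][col] = weight
    return matrix


def local_channel_interaction(pooled, weights):
    """Let each channel interact only with its k nearest neighbours."""
    matrix = band_matrix(len(pooled), weights)
    return [sum(b * u for b, u in zip(row, pooled)) for row in matrix]


feature = [
    [[1.0, 3.0], [5.0, 7.0]],  # channel 0, mean 4.0
    [[2.0, 2.0], [2.0, 2.0]],  # channel 1, mean 2.0
    [[0.0, 4.0], [4.0, 8.0]],  # channel 2, mean 4.0
    [[6.0, 6.0], [6.0, 6.0]],  # channel 3, mean 6.0
]
U = global_average_pool(feature)          # [4.0, 2.0, 4.0, 6.0]
U_lc = local_channel_interaction(U, weights=[0.25, 0.5, 0.25])
```

With k = 3 the interaction needs only three learnable weights regardless of C, whereas a dense channel-to-channel layer would need C × C, which is the parameter saving the band matrix is meant to provide.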
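Compared with the conventional SE (squeeze-and-excitation) channel attention mechanism, FCAttention introduces a local channel interaction mechanism and optimizes the feature weight allocation strategy, allowing the model to capture inter-channel dependencies at a finer granularity. This improvement mitigates the unstable channel weight allocation and information loss that can occur in SE and increases the model's attention to key features. FCAttention combines global average pooling (GAP) with the local channel interaction strategy, so it not only captures global information but also models channels at fine granularity through a band matrix and a diagonal matrix, achieving more precise feature enhancement. Under extreme weather conditions, the mechanism distinguishes important features from noise more effectively, improving detection stability in harsh environments. The structure of FCAttention is shown in Fig. 3 (θ is a learnable factor; σ is the activation function).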
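One way the global and local branches might be combined into a single weight per channel is sketched below. The exact fusion used in BIRD-YOLOv8 is defined by Fig. 3, so this is only an assumption-laden illustration: it supposes that the two descriptors are correlated through an outer product, that the row sums of that correlation are passed through the activation σ (taken here to be a sigmoid), and that the learnable factor θ sets the mixing ratio. All function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fused_channel_weights(u_gc, u_lc, theta):
    """Mix global (u_gc) and local (u_lc) channel descriptors into one weight per channel."""
    sum_gc = sum(u_gc)
    sum_lc = sum(u_lc)
    # Row sums of the outer product u_gc * u_lc^T, then of its transpose.
    global_w = [sigmoid(g * sum_lc) for g in u_gc]
    local_w = [sigmoid(l * sum_gc) for l in u_lc]
    mix = sigmoid(theta)  # keeps the mixing ratio inside (0, 1)
    return [sigmoid(mix * g + (1.0 - mix) * l) for g, l in zip(global_w, local_w)]


def apply_channel_weights(feature, weights):
    """Rescale every value of a [C][H][W] nested list by its channel weight."""
    return [
        [[w * value for value in row] for row in channel]
        for channel, w in zip(feature, weights)
    ]


u_gc = [0.4, 0.2, 0.4, 0.6]      # assumed output of the global (diagonal) branch
u_lc = [0.25, 0.3, 0.4, 0.4]     # assumed output of the local (band) branch
weights = fused_channel_weights(u_gc, u_lc, theta=0.0)
feature = [[[1.0, 2.0]], [[1.0, 2.0]], [[1.0, 2.0]], [[1.0, 2.0]]]
enhanced = apply_channel_weights(feature, weights)
```

Unlike SE, which derives each weight from the global descriptor alone, this kind of fusion lets a channel's final weight depend on both its own pooled response and the responses of its neighbours.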