ViT-based Method for Occluded Pedestrian Re-identification

GAO Mengxing, XIAO Mansheng, XU Yating, LIU Zhenzhen

小型微型计算机系统 (Journal of Chinese Computer Systems), 2026, Vol. 47, Issue 5: 1219-1224. DOI: 10.20009/j.cnki.21-1106/TP.2025-0082

Section: Computer Graphics and Images

Abstract

Person Re-Identification (ReID) is the task of matching the same pedestrian across different scenes. When pedestrians are occluded, methods that rely on global information alone have limited ability to represent local details. To address this, the paper proposes a ReID method based on ViT feature enhancement. The main contributions are: 1) a novel Cross-scale Dilated Fusion Module (CDFM) that optimizes input features through multi-dimensional re-weighting and integrates multi-scale dilated convolutional branches to enhance feature discriminability; 2) a Global-Local Feature Collaboration Module that combines Transformer blocks with lightweight CNN layers, exploiting the complementary strengths of global dependency modeling by Transformers and local detail extraction by CNNs to improve the flow and expressiveness of feature information, and with it model performance and robustness; 3) a Dynamic Weighted Loss Function that uses a visibility-aware contrastive mechanism to enforce feature consistency in visible regions, adopts a dynamic hard example mining strategy to mitigate occlusion-induced noise, and incorporates channel attention weights to refine feature alignment, further improving discriminative power in occlusion scenarios. Experimental results show that the proposed method achieves stronger performance on multiple mainstream occluded ReID benchmarks, including Occluded-Duke and Occluded-REID, and outperforms existing state-of-the-art methods in both Rank-1 accuracy and mAP.
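The abstract reports results as Rank-1 accuracy and mAP. As a rough illustration of how these two retrieval metrics are commonly computed from a query-by-gallery distance matrix, here is a minimal sketch. The function name `rank1_and_map` and the toy data are assumptions made for illustration and are not the authors' evaluation code. The sketch also omits the same-camera filtering that standard ReID protocols usually apply.

```python
def rank1_and_map(dist, query_ids, gallery_ids):
    """Return (Rank-1 accuracy, mAP) for a query-by-gallery distance matrix.

    dist[i][j] is the distance between query i and gallery item j
    (smaller means more similar). Queries with no true match in the
    gallery are skipped. Hypothetical helper, for illustration only.
    """
    rank1_hits = 0
    ap_sum = 0.0
    valid = 0
    for q, row in enumerate(dist):
        # Rank gallery items from most to least similar.
        order = sorted(range(len(row)), key=lambda j: row[j])
        matches = [gallery_ids[j] == query_ids[q] for j in order]
        if not any(matches):
            continue
        valid += 1
        # Rank-1: is the top-ranked gallery item the right identity?
        rank1_hits += int(matches[0])
        # Average precision: mean of precision at each correct match.
        hits = 0
        precision_sum = 0.0
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precision_sum += hits / rank
        ap_sum += precision_sum / hits
    if valid == 0:
        raise ValueError("no query has a match in the gallery")
    return rank1_hits / valid, ap_sum / valid


# Toy example: two queries, three gallery images (identities "a" and "b").
dist = [[0.5, 0.1, 0.4],
        [0.8, 0.2, 0.3]]
query_ids = ["a", "b"]
gallery_ids = ["a", "b", "a"]
rank1, mean_ap = rank1_and_map(dist, query_ids, gallery_ids)
print(f"Rank-1 = {rank1:.3f}, mAP = {mean_ap:.3f}")
```

In this toy case the first query ranks a wrong identity first, so it misses at Rank-1 but still earns partial average precision from the correct matches at ranks 2 and 3. This is why mAP and Rank-1 are usually reported together.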

Key words

Person Re-Identification (ReID) / ViT / cross-scale dilated fusion / lightweight convolution / contrastive weighting loss

The Chinese keyword list names the last two terms differently: global-local feature collaboration / dynamic weighted loss.

Cite this article

GAO Mengxing, XIAO Mansheng, XU Yating, LIU Zhenzhen. ViT-based Method for Occluded Pedestrian Re-identification[J]. 小型微型计算机系统, 2026, 47(5): 1219-1224. DOI: 10.20009/j.cnki.21-1106/TP.2025-0082

