Few-shot Object Detection via Aggregation-reconstruction Data Augmentation and Multidimensional Feature Knowledge Transfer

WU Jiajun; WANG Yi; ZHU Songhao

doi:10.20009/j.cnki.21-1106/TP.2025-0127

Journal of Chinese Computer Systems (小型微型计算机系统), 2026, Vol. 47, Issue 5: 1190-1197.

Computer Graphics and Images

Few-shot Object Detection Via Aggregation-reconstruction Data Augmentation and Multidimensional Feature Knowledge Transfer

WU Jiajun, WANG Yi, ZHU Songhao

Abstract

Object detection has been studied extensively in recent years and has produced many results. Training a high-performing detector, however, requires a large number of labeled samples, whereas humans can learn new concepts quickly from only a few examples. Few-shot object detection has attracted growing attention as a way to narrow this gap. It aims to learn novel classes from a limited number of labeled samples without catastrophically forgetting the previously learned base classes, thereby improving detection performance on the novel classes. Existing few-shot object detection methods have two problems: 1) they focus on model accuracy and neglect model efficiency; 2) they focus on classification performance and neglect localization performance. To address these problems, this paper proposes a few-shot object detection method based on aggregation-reconstruction data augmentation and multidimensional feature knowledge transfer. First, an aggregation-reconstruction data augmentation strategy extracts target objects from generated images, scales them, and aggregates them into randomly selected base-class samples, which increases data diversity, alleviates data scarcity, and improves the model's generalization across datasets. Second, inter-class semantic feature knowledge transfer provides a good initialization of the classifier weights and speeds up convergence, while explicit modeling of inter-class localization feature knowledge improves the model's localization ability. Experimental results show that the proposed method performs well on few-shot object detection tasks and is competitive with existing methods.
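The two ideas described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so everything below is an assumption made for illustration. Images are plain 2-D lists, the object patch is assumed to be already cropped from a generated image, the paste position is chosen uniformly at random, and the novel-class classifier weight is assumed to be a softmax-weighted mix of base-class weights driven by cosine similarity between semantic embeddings. The names `resize_nearest`, `paste_object`, `init_novel_weight`, and `temperature` are hypothetical.

```python
import math
import random


def resize_nearest(patch, new_h, new_w):
    """Nearest-neighbour resize of a 2-D list-of-lists image patch."""
    old_h, old_w = len(patch), len(patch[0])
    return [[patch[r * old_h // new_h][c * old_w // new_w]
             for c in range(new_w)]
            for r in range(new_h)]


def paste_object(base_img, base_boxes, patch, label, scale, rng):
    """Scale an object patch and paste it at a random spot in a base image.

    Boxes are (x_min, y_min, x_max, y_max, label) with exclusive max edges.
    Returns a new image and a new box list; the inputs are left unchanged.
    """
    img_h, img_w = len(base_img), len(base_img[0])
    # Clamp the scaled size so the patch always fits inside the base image.
    new_h = max(1, min(img_h, round(len(patch) * scale)))
    new_w = max(1, min(img_w, round(len(patch[0]) * scale)))
    scaled = resize_nearest(patch, new_h, new_w)
    top = rng.randint(0, img_h - new_h)
    left = rng.randint(0, img_w - new_w)
    out = [row[:] for row in base_img]
    for r in range(new_h):
        out[top + r][left:left + new_w] = scaled[r]
    boxes = list(base_boxes) + [(left, top, left + new_w, top + new_h, label)]
    return out, boxes


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def init_novel_weight(novel_emb, base_embs, base_weights, temperature=0.1):
    """Assumed initialisation: softmax over semantic similarities, then a
    weighted sum of the base-class classifier weights."""
    sims = [cosine(novel_emb, e) / temperature for e in base_embs]
    peak = max(sims)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in sims]
    total = sum(exps)
    coeffs = [e / total for e in exps]
    dim = len(base_weights[0])
    return [sum(c * w[d] for c, w in zip(coeffs, base_weights))
            for d in range(dim)]
```

In this sketch, a novel class whose semantic embedding is closest to one base class starts from a classifier weight dominated by that base class, which is one plausible reading of how a similarity-based initialization could speed up fine-tuning. The localization branch mentioned in the abstract is not sketched, because the abstract does not describe its form.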

Keywords

few-shot object detection / fine-tuning / multidimensional feature knowledge transfer / aggregation-reconstruction data augmentation

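Cite this article

WU Jiajun, WANG Yi, ZHU Songhao. Few-shot Object Detection via Aggregation-reconstruction Data Augmentation and Multidimensional Feature Knowledge Transfer[J]. Journal of Chinese Computer Systems, 2026, 47(5): 1190-1197. DOI: 10.20009/j.cnki.21-1106/TP.2025-0127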
