Abstract
Few-shot semantic segmentation aims to segment target objects of a given class in query images using only a few annotated support images. In recent years, feature-matching-based methods have achieved notable success on this task by establishing dense feature correlations through 4D convolution and by introducing background correlation. However, the multiple matching relationships introduce additional inter-class and intra-class noise, and the weight sparsification used to ease the computational burden of high-dimensional convolution reduces the accuracy of fine-grained feature-correlation matching. To alleviate these problems, a dual-feature guided hypercorrelation few-shot segmentation network (DFGHNet) is proposed. DFGHNet uses a more efficient 4D convolution kernel to reduce the loss of fine-grained matching accuracy caused by weight sparsification. It also introduces a dual-feature mask strategy and a non-local mean feature mapping module without learnable parameters to guide the correlations within the learned matching patterns. On the standard few-shot segmentation benchmarks PASCAL-5^i and COCO-20^i, the proposed method achieves maximum accuracy improvements of 4.8% and 6.5%, respectively, which verifies its effectiveness.
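As a rough illustration of the dense correlation that matching-based methods operate on, the sketch below builds a 4D correlation tensor between every query position and every support position, with the support features multiplied by a binary mask first. It uses only the Python standard library. The cosine similarity, the clamping at zero, and all function and variable names are assumptions made for illustration; none of it is taken from DFGHNet, and the dual-feature mask strategy and the non-local mean module are not reproduced here.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def masked_correlation(query, support, mask):
    """Build corr[i][j][k][l] for query position (i, j) and support position (k, l).

    query   : Hq x Wq grid of feature vectors (nested lists)
    support : Hs x Ws grid of feature vectors
    mask    : Hs x Ws grid of 0/1 values marking the annotated object
    Each entry is the cosine similarity between the query vector and the
    masked support vector, clamped at zero (an illustrative choice).
    """
    corr = []
    for q_row in query:
        corr_row = []
        for q_vec in q_row:
            per_support = []
            for s_row, m_row in zip(support, mask):
                per_support.append([
                    max(0.0, cosine(q_vec, [m * s for s in s_vec]))
                    for s_vec, m in zip(s_row, m_row)
                ])
            corr_row.append(per_support)
        corr.append(corr_row)
    return corr


# Toy data: a 1 x 2 query grid and a 2 x 1 support grid with 2 channels each.
query = [[[1.0, 0.0], [0.0, 1.0]]]
support = [[[1.0, 0.0]], [[0.0, 1.0]]]
mask = [[1], [0]]  # only the first support position belongs to the object
corr = masked_correlation(query, support, mask)
```

In this toy case the second support position is masked out, so it contributes zero correlation even to the query vector it would otherwise match exactly. The tensor has one entry per pair of positions, so its size grows with the product of the two grid sizes, which is the source of the computational burden of high-dimensional convolution mentioned in the abstract.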
Kun HE, Ying WU, Yong YU*
School of Software, Yunnan University, Kunming 650504, China
* Corresponding author: yuy1219@163.com
Kun HE's main research interest is few-shot semantic segmentation.
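Kun HE, Ying WU, Yong YU.
Dual-feature guided hypercorrelation few-shot segmentation network[J].
Journal of University of Electronic Science and Technology of China, 2026, 55(3): 447-454. DOI: 10.12178/1001-0548.2024270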