In contrastive-learning-based semantic similarity models, information is exchanged insufficiently between sentences with different positive examples, and traditional negative sampling strategies yield few hard negatives. Both problems make it difficult to capture subtle feature differences between sentences and therefore to measure the similarity between texts accurately. This paper proposes a semantic similarity method based on augmented positives and interlayer negatives. A dynamic neighborhood mechanism fuses information across different positive examples, and a method for generating hard negatives is introduced; together they significantly improve the correlation of semantic similarity judgments. First, sentence embeddings whose semantic features are similar to those of a positive example are retrieved from the dynamic neighborhood and concatenated with it, and an augmented positive example is then obtained through self-attention aggregation, thereby fusing information from different positive examples. Second, hard negatives are generated by taking the intermediate-layer sentence representation of the model as the original positive example for the hard negatives, and a cross-entropy loss is introduced as a penalty to improve the negative sampling strategy. Experiments on the semantic similarity datasets STS 2012-2016, STS-B, and SICK-R show that the proposed method is effective: compared with advanced models, the Spearman correlation coefficient increases by an average of 1.09 and 0.34 percentage points with BERT-base and BERT-large, respectively.
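The first step (retrieve similar embeddings, concatenate them with the positive, aggregate with attention) can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `augmented_positive`, the parameter `k`, the use of cosine similarity for retrieval, and the single-head attention without learned projections (with the positive as the only query) are all assumptions made for illustration, and how the neighborhood is updated during training is left out.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def augmented_positive(positive, neighborhood, k=2):
    """Fuse a positive embedding with its k most similar neighbors.

    Hypothetical helper: `neighborhood` is a plain list of embeddings.
    """
    # Retrieve the k neighborhood embeddings closest to the positive.
    ranked = sorted(neighborhood, key=lambda e: cosine(positive, e), reverse=True)
    group = [positive] + ranked[:k]  # the "concatenation" step
    # Scaled dot-product attention: the positive is the query, the group
    # supplies keys and values (assumption: no learned projections).
    scale = math.sqrt(len(positive))
    weights = softmax([dot(positive, e) / scale for e in group])
    dim = len(positive)
    return [sum(w * e[i] for w, e in zip(weights, group)) for i in range(dim)]


if __name__ == "__main__":
    pos = [1.0, 0.0]
    bank = [[0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
    print(augmented_positive(pos, bank, k=1))
```

Because the attention weights sum to one, the result is a convex combination of the positive and its retrieved neighbors, which is one simple way for information from other positives to flow into the augmented example.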
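To address the above problems, and inspired by references [14-16], this paper proposes a semantic similarity model based on augmented positives and interlayer negatives (APINCSE). The model obtains augmented positives by constructing a dynamic neighborhood and generates hard negatives from interlayer encodings, yielding better sentence representations and significantly improving the accuracy of semantic relatedness judgments.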
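One possible reading of the interlayer hard negative is sketched below. It is a sketch under stated assumptions, not the APINCSE loss: the intermediate-layer representation is simply appended as one extra negative in an InfoNCE-style cross-entropy, and the function name `info_nce_with_hard_negative` and the temperature value `tau=0.05` are illustrative choices that do not come from the text.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def info_nce_with_hard_negative(anchor, positive, negatives, hard_negative, tau=0.05):
    """Cross-entropy over similarity logits, with the positive as the target.

    `hard_negative` stands for an intermediate-layer representation of the
    same sentence (assumption); `negatives` are other in-batch sentences.
    """
    candidates = [positive] + list(negatives) + [hard_negative]
    logits = [cosine(anchor, c) / tau for c in candidates]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    # Negative log-probability of the positive (index 0).
    return log_sum - logits[0]


if __name__ == "__main__":
    anchor = [1.0, 0.0]
    positive = [0.9, 0.1]
    in_batch = [[0.0, 1.0]]
    close = info_nce_with_hard_negative(anchor, positive, in_batch, [0.8, 0.2])
    far = info_nce_with_hard_negative(anchor, positive, in_batch, [-1.0, 0.0])
    print(close, far)
```

In this toy setup, a hard negative that lies close to the anchor produces a larger loss than a distant one, which is why such negatives push the encoder to separate fine-grained differences.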
[1]
Cer D, Diab M, Agirre E, et al. SemEval-2017 task 1: semantic textual similarity multilingual and crosslingual focused evaluation[C]∥Proceedings of the 11th International Workshop on Semantic Evaluation. Stroudsburg, PA: ACL, 2017: 1-14.
[2]
Radford A, Narasimhan K. Improving language understanding by generative pre-training[EB/OL]. (2018-06-11)[2023-12-11].
[3]
Devlin J, Chang M W, Lee K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[C]∥Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2019: 4171-4186.
[4]
Liu Y H, Ott M, Goyal N, et al. RoBERTa: a robustly optimized BERT pretraining approach[EB/OL]. (2019-07-26)[2023-12-11].
[5]
Yang Z L, Dai Z H, Yang Y M, et al. XLNet: generalized autoregressive pretraining for language understanding[C]∥Proceedings of the 33rd International Conference on Neural Information Processing Systems. Vancouver, Canada: NeurIPS, 2019: 5753-5763.
[6]
Li B H, Zhou H, et al. On the sentence embeddings from pre-trained language models[C]∥Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2020: 9119-9130.
[7]
Reimers N, Gurevych I. Sentence-BERT: sentence embeddings using siamese BERT-networks[C]∥Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2019: 3982-3992.
[8]
Gao T Y, Yao X C, Chen D Q. SimCSE: simple contrastive learning of sentence embeddings[C]∥Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2021: 6894-6910.
[9]
Wu X, Gao C C, Zang L J, et al. ESimCSE: enhanced sample building method for contrastive learning of unsupervised sentence embedding[C]∥Proceedings of the 29th International Conference on Computational Linguistics. New York: ACM Press, 2022: 3898-3907.
[10]
Zhang Y H, Zhu H J, Wang Y L, et al. A contrastive framework for learning sentence representations from pairwise and triple-wise perspective in angular space[C]∥Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2022: 4892-4903.
[11]
Chuang Y S, Dangovski R, Luo H Y, et al. DiffCSE: difference-based contrastive learning for sentence embeddings[C]∥Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg, PA: ACL, 2022: 4207-4218.
[12]
Liu J D, Liu J H, Wang Q F, et al. RankCSE: unsupervised sentence representations learning via learning to rank[C]∥Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2023: 13785-13802.
[13]
Wang H, Li Y G, Huang Z, et al. SNCSE: contrastive learning for unsupervised sentence embedding with soft negative samples[C]∥International Conference on Intelligent Computing. New York, USA: ICIC, 2023: 419-431.
[14]
He H L, Zhang J L, Lan Z Z, et al. Instance smoothed contrastive learning for unsupervised sentence embedding[C]∥Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence. Washington, DC: AAAI Press, 2023: 12863-12871.
[15]
Robinson J, Chuang C Y, Sra S, et al. Contrastive learning with hard negative samples[C]∥9th International Conference on Learning Representations. Virtual, 2021.
[16]
Wu X, Gao C C, Su Y P, et al. Smoothed contrastive learning for unsupervised sentence embedding[C]∥Proceedings of the 29th International Conference on Computational Linguistics. New York, USA: ICCL, 2022: 4902-4906.
[17]
Kim T, Yoo K M, Lee S G. Self-guided contrastive learning for BERT sentence representations[C]∥Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing. Stroudsburg, PA: ACL, 2021: 2528-2540.
[18]
Oh D, Kim Y J, Lee H D, et al. Don't judge a language model by its last layer: contrastive learning with layer-wise attention pooling[C]∥Proceedings of the 29th International Conference on Computational Linguistics. New York, USA: ICCL, 2022: 4585-4592.
[19]
Deng J H, Wan F Q, Yang T, et al. Clustering-aware negative sampling for unsupervised sentence representation[C]∥Findings of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2023: 8713-8729.