The key techniques for attribute extraction have evolved steadily. Early work relied mainly on rule-based attribute extraction, but such methods gradually declined as research shifted to neural-network-based deep learning. Compared with convolution-based semantic encoding, attribute extraction research favors sequential, recurrent encoding, most commonly recurrent neural networks equipped with Long Short-Term Memory (LSTM) [7] units. The reason is that attribute extraction operates on words in natural language, and judging whether any token belongs to an attribute depends strongly on contextual information. By fusing contextual semantics into its memory updates, LSTM alleviates the long-distance dependency problem to some extent and has therefore become an indispensable component of recent work.
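For illustration only, the gated memory update described above can be sketched as a single LSTM step. This is our own minimal NumPy sketch with hypothetical weight names and shapes, not the implementation of any cited model:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W: (4H, D), U: (4H, H), b: (4H,).

    The cell state c carries information across long distances;
    the gates decide what to write, what to keep, and what to expose.
    """
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b      # pre-activations for all four gates
    i = sigmoid(z[0*H:1*H])         # input gate: how much new info to write
    f = sigmoid(z[1*H:2*H])         # forget gate: how much old memory to keep
    o = sigmoid(z[2*H:3*H])         # output gate: how much memory to expose
    g = np.tanh(z[3*H:4*H])         # candidate memory content
    c = f * c_prev + i * g          # additive update eases long-range gradients
    h = o * np.tanh(c)              # contextual hidden representation
    return h, c

# Encode a toy token sequence (D = 3 input dims, H = 2 hidden units).
rng = np.random.default_rng(0)
D, H = 3, 2
W = rng.normal(size=(4 * H, D))
U = rng.normal(size=(4 * H, H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(5, D)):   # five toy "word vectors"
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape)                      # prints (2,)
```

Running the same recurrence left-to-right and right-to-left and concatenating the hidden states gives the bidirectional (BiLSTM) encoding commonly used in the models discussed below.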
The Conditional Random Field (CRF) [8], a probabilistic sequence model whose optimal label sequence is decoded by dynamic programming, is widely used because it is a natural fit for sequence labeling. In particular, during decoding the optimal decision at each step depends on the optimal solutions of the preceding steps, a property that makes CRF highly compatible with LSTM. Typical examples are two representative models proposed in 2019, Target-oriented Opinion Words Extraction (TOWE) [9] and Rule-Incorporated Neural Aspect and Opinion Term Extraction (RINANTE) [10], both of which combine CRF with LSTM.
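The dynamic-programming decoding that makes CRF attractive for sequence labeling can be sketched as a plain Viterbi decoder. This is a minimal NumPy illustration with made-up scores, not the implementation of TOWE or RINANTE:

```python
import numpy as np

def viterbi(emissions, transitions):
    """Decode the best tag sequence by dynamic programming.

    emissions: (T, K) per-token tag scores (e.g. from a BiLSTM);
    transitions: (K, K) score of moving from tag a to tag b.
    Each step extends the optimal partial paths of the previous step.
    """
    T, K = emissions.shape
    score = emissions[0].copy()            # best score ending in each tag
    back = np.zeros((T, K), dtype=int)     # backpointers for path recovery
    for t in range(1, T):
        cand = score[:, None] + transitions + emissions[t][None, :]
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):          # follow backpointers
        path.append(int(back[t][path[-1]]))
    return path[::-1]

# Toy example with tags {0: "O", 1: "ATTR"}: the transition matrix
# penalizes switching tags, so a weak per-token preference is overridden.
em = np.array([[2.0, 0.0],
               [0.9, 1.0],
               [2.0, 0.0]])
tr = np.array([[0.0, -2.0],
               [-2.0, 0.0]])
print(viterbi(em, tr))                     # prints [0, 0, 0]
```

Greedy per-token argmax would tag the middle token as ATTR (its emission slightly prefers tag 1), but the transition penalty lets the decoder choose the globally optimal path, which is exactly the behavior a CRF layer adds on top of an LSTM encoder.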
In addition, the attention mechanism [11] is another key technique frequently adopted for attribute extraction, playing an important role in identifying and weighting highly relevant information in the context. Notably, the Transformer [11] architecture, which stacks multi-head attention layers, underlies the successful application of pretrained language models to attribute extraction; the most prominent of these is the Bidirectional Encoder Representations from Transformers (BERT) [12] pretrained language model, which uses a bidirectional masking (MASK) objective so that learned representations remain consistent with both the left and the right context. Attribute extraction models currently follow two successful paradigms: sequence labeling models, represented by BiLSTM-CRF [13], and reading comprehension models, represented by Bidirectional Encoder Representations from Transformers-Machine Reading Comprehension (BERT-MRC) [14]. Most person attribute extraction to date adopts the sequence labeling paradigm [15].
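The core operation behind the attention mechanism and the Transformer is scaled dot-product attention, softmax(QK^T/√d)V. The following is our own minimal NumPy sketch with hand-picked toy matrices, not code from any cited model:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                 # query-key similarity
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)            # attention weights per query
    return w @ V, w

# Three context tokens with orthogonal keys; the query is aligned with key 0,
# so the output is dominated by token 0's value vector.
K = np.eye(3, 4)                                  # keys, d = 4
V = np.array([[1.0, 2.0, 3.0, 4.0],
              [5.0, 6.0, 7.0, 8.0],
              [9.0, 10.0, 11.0, 12.0]])           # values
Q = np.array([[4.0, 0.0, 0.0, 0.0]])              # one query
out, w = attention(Q, K, V)
print(w.round(2))                                 # prints [[0.79 0.11 0.11]]
```

The weight row sums to 1 and concentrates on the token whose key matches the query, which is how attention identifies and up-weights the most relevant context for each position.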
In recent years, owing to the strong learning capacity of deep models, researchers have increasingly used deep learning models such as LSTM [20] and the Gated Recurrent Unit (GRU) [21] to learn features automatically instead of designing feature templates by hand. These approaches still require large amounts of labeled data for training; in fact, most natural language processing (NLP) tasks are constrained by the size of the available annotations. Under this constraint, a transfer learning paradigm has become popular: a model is first trained on a simple NLP task over large external datasets, its parameters are then shared with the new task and fine-tuned during training, so the new task no longer starts from scratch. Models of this kind are called pretrained models; examples include the Generative Pretrained Transformer (GPT) [22] and ELMo (Embeddings from Language Models) [23], among which BERT-based [12] pretrained models perform best.
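As a companion to the LSTM sketch above, a GRU step uses only two gates. This is our own minimal NumPy sketch (biases omitted, weight names hypothetical); note that the update-gate convention varies across papers, with some placing z on the old state and others on the candidate:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU time step (biases omitted for brevity)."""
    z = sigmoid(Wz @ x + Uz @ h_prev)             # update gate
    r = sigmoid(Wr @ x + Ur @ h_prev)             # reset gate
    h_cand = np.tanh(Wh @ x + Uh @ (r * h_prev))  # candidate state
    return z * h_prev + (1 - z) * h_cand          # convex interpolation of
                                                  # old state and candidate

# Encode a toy token sequence (D = 3 input dims, H = 2 hidden units).
rng = np.random.default_rng(2)
D, H = 3, 2
Wz, Wr, Wh = [rng.normal(size=(H, D)) for _ in range(3)]
Uz, Ur, Uh = [rng.normal(size=(H, H)) for _ in range(3)]
h = np.zeros(H)
for x in rng.normal(size=(5, D)):
    h = gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh)
print(h.shape)                                    # prints (2,)
```

With one fewer gate and no separate cell state, the GRU trades some capacity for fewer parameters, which is one reason it appears alongside LSTM in the feature-learning work cited above.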
XU Q T, HONG Y, PAN Y C, et al. Survey on Aspect Term Extraction[J]. Journal of Software, 2023, 34: 690-711. DOI: 10.13328/j.cnki.jos.006709.
[3] EMBAR V, KAN A, SISMAN B, et al. DiffXtract: Joint Discriminative Product Attribute-value Extraction[C]//2021 IEEE International Conference on Big Knowledge (ICBK). New York: IEEE, 2021: 271-280. DOI: 10.1109/ICKG52313.2021.00044.
[4] LI H D. Research on Medical Domain Knowledge Extraction Methods[D]. Harbin: Harbin Institute of Technology, 2018. (in Chinese)
[6] FAN Z F, WU Z, DAI X Y, et al. Target-oriented Opinion Words Extraction with Target-fused Neural Sequence Labeling[C]//Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg, PA, USA: Association for Computational Linguistics, 2019: 2509-2518. DOI: 10.18653/v1/n19-1259.
[7] HU M Q, LIU B. Mining and Summarizing Customer Reviews[C]//Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2004: 168-177. DOI: 10.1145/1014052.1014073.
[8] LI H L. Research on Character Attributes Extraction Based on Rules from Baidu Encyclopedia[D]. Chengdu: Southwest Jiaotong University, 2013. (in Chinese)
LAFFERTY J, MCCALLUM A, PEREIRA F. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data[C]//Proceedings of the 18th International Conference on Machine Learning. San Francisco: Morgan Kaufmann Publishers Inc., 2001: 282-289.
[13] DAI H L, SONG Y Q. Neural Aspect and Opinion Term Extraction with Mined Rules as Weak Supervision[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA, USA: Association for Computational Linguistics, 2019: 5268-5277. DOI: 10.18653/v1/p19-1520.
[14] VASWANI A, SHAZEER N M, PARMAR N, et al. Attention is All You Need[EB/OL]. (2017-06-12) [2025-04-21].
[15] DEVLIN J, CHANG M W, LEE K, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding[C]//Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Stroudsburg, PA, USA: Association for Computational Linguistics, 2019: 4171-4186. DOI: 10.18653/v1/N19-1423.
ZHANG Q, XIONG J H, CHENG X Q. Person Attributes Extraction Based on a Weakly Supervised Learning Method[J]. Journal of Shanxi University (Natural Science Edition), 2015, 38(1): 8-15. DOI: 10.13451/j.cnki.shanxi.univ(nat.sci.).2015.01.002.
[22] ANGELI G, TIBSHIRANI J, WU J, et al. Combining Distant and Partial Supervision for Relation Extraction[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Stroudsburg, PA, USA: Association for Computational Linguistics, 2014: 1556-1567. DOI: 10.3115/v1/d14-1164.
SU F L, XIE Q H, QIU J Y, et al. Study on Word Clustering for Attribute Extraction Based on Deep Learning[J]. Microcomputer & Its Applications, 2016, 35(1): 53-55. DOI: 10.19358/j.issn.1674-7720.2016.01.017.
[25] XIANG X W. Chinese Named Entity Recognition Based on Conditional Random Fields[D]. Xiamen: Xiamen University, 2006. (in Chinese)
[27] KATIYAR A, CARDIE C. Investigating LSTMs for Joint Extraction of Opinion Entities and Relations[C]//Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA, USA: Association for Computational Linguistics, 2016: 919-929. DOI: 10.18653/v1/p16-1087.
[28] CHO K, VAN MERRIENBOER B, BAHDANAU D, et al. On the Properties of Neural Machine Translation: Encoder-Decoder Approaches[C]//Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation. Stroudsburg, PA, USA: Association for Computational Linguistics, 2014: 103-111. DOI: 10.3115/v1/w14-4012.
[29] RADFORD A, NARASIMHAN K, SALIMANS T, et al. Improving Language Understanding by Generative Pre-Training[J]. Open Access Library Journal, 2021, 8: 7.
[30] PETERS M E, NEUMANN M, IYYER M, et al. Deep Contextualized Word Representations[EB/OL]. (2018-02-14) [2025-04-21].