Human-Computer Intelligence Research Center, Shenyang Aerospace University, Shenyang 110136, China
Article History
Published: 2022-04-28
Issue Date: 2025-08-05
Abstract
Multi-Hop Reading Comprehension (MHRC) is a hot and difficult task in natural language processing, with important and wide-ranging applications in text understanding, automatic question answering, and dialogue systems. To address the current lack of research on Chinese-oriented MHRC, a Chinese MHRC dataset for complex questions (Complex Chinese Machine Reading Comprehension, Complex CMRC) was constructed and a Chinese MHRC method based on question decomposition was proposed. The method consists of two stages. First, a complex-question decomposition method combining the JointBERT model with rules was proposed: JointBERT jointly models question type identification and question fragment identification to obtain accurate question type and question fragment information, and specially designed decomposition rules then split the complex question into multiple simple sub-questions. Second, a BERT pre-trained model was used to iteratively solve all the sub-questions and finally obtain the answer to the complex question. Question decomposition and question solving experiments were conducted on the Complex CMRC dataset, and the good results obtained verify the effectiveness of the proposed method.
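To give a concrete picture of the two-stage pipeline described above, the following is a minimal Python sketch. The names decompose and solve are hypothetical placeholders standing in for the JointBERT-plus-rules decomposer and the BERT reader described in the paper, and the substitution of an earlier answer into the next sub-question (via an illustrative "[ANS]" marker) is one plausible reading of "iteratively solve" for Bridge-type questions, not a detail confirmed by this extract.

from typing import List

# Hypothetical placeholder: the paper's decomposer uses JointBERT to identify
# the question type and fragments, then applies hand-crafted rules.
def decompose(question: str) -> List[str]:
    return [question]  # a non-complex question decomposes to itself

# Hypothetical placeholder: the paper uses a BERT reader to extract an answer
# span from the context for each sub-question.
def solve(sub_question: str, context: str) -> str:
    return context.split("。")[0]  # stub: return the first sentence

def answer_complex_question(question: str, context: str) -> str:
    """Decompose the complex question, then solve the sub-questions in order."""
    answer = ""
    for sub_q in decompose(question):
        # Assumption: "[ANS]" marks where the previous answer is plugged in
        # (typical for Bridge-type sub-questions); this token is illustrative.
        if answer:
            sub_q = sub_q.replace("[ANS]", answer)
        answer = solve(sub_q, context)
    return answer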
The JointBERT model [11] was proposed for the Spoken Language Understanding (SLU) task, where it jointly models intent classification and slot filling on top of the pre-trained BERT model and achieves very good results. Since question type identification and question fragment identification are strongly correlated and therefore well suited to joint modeling, this paper adopts a JointBERT-based joint recognition method.
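As a rough illustration of this joint modeling idea in code, the sketch below builds a shared BERT encoder with two heads: one classifying from the [CLS] representation (the intent in SLU, the question type here) and one tagging tokens (slots in SLU, question fragments here), trained with the sum of both losses. It assumes the Hugging Face transformers package and the bert-base-chinese checkpoint; the class name and label sets are illustrative, not the paper's.

import torch
import torch.nn as nn
from transformers import BertModel

class JointQuestionModel(nn.Module):
    """Shared BERT encoder with a question-type head and a fragment-tagging head."""

    def __init__(self, num_types: int, num_tags: int,
                 pretrained: str = "bert-base-chinese"):
        super().__init__()
        self.bert = BertModel.from_pretrained(pretrained)
        hidden = self.bert.config.hidden_size
        self.type_head = nn.Linear(hidden, num_types)  # question type (from [CLS])
        self.tag_head = nn.Linear(hidden, num_tags)    # per-token fragment tags
        self.loss_fn = nn.CrossEntropyLoss()

    def forward(self, input_ids, attention_mask, type_labels=None, tag_labels=None):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        type_logits = self.type_head(out.pooler_output)    # (B, num_types)
        tag_logits = self.tag_head(out.last_hidden_state)  # (B, T, num_tags)
        loss = None
        if type_labels is not None and tag_labels is not None:
            # Joint objective: type-classification loss + fragment-tagging loss.
            loss = (self.loss_fn(type_logits, type_labels)
                    + self.loss_fn(tag_logits.view(-1, tag_logits.size(-1)),
                                   tag_labels.view(-1)))
        return loss, type_logits, tag_logits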
2.3.1 Recognition Method
This paper adopts an improved JointBERT-based model for the joint recognition of question types and question fragments, as shown in Figure 2. Structurally, JointBERT is simply the BERT model, but it is trained with joint modeling. The BERT architecture is a multi-layer bidirectional Transformer encoder built from the original Transformer blocks (Trm). To further refine the output sequence, a Conditional Random Field (CRF) [12] layer is added on top of the output. The figure uses the Bridge-type question "属于细小的雀形目鸟类的动物分布在哪些地区?" (In which regions are animals belonging to small passerine birds distributed?) as an example.
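To show how a CRF layer can sit on top of the tagging emissions, here is a short self-contained snippet using the third-party pytorch-crf package, which is one possible implementation and an assumption on our part; the paper does not say which CRF implementation it uses, and the tensor shapes below are dummy values for illustration.

import torch
from torchcrf import CRF  # pip install pytorch-crf (assumed implementation)

num_tags, batch_size, seq_len = 5, 2, 8
crf = CRF(num_tags, batch_first=True)

# Emissions would normally come from the fragment-tagging head of the encoder.
emissions = torch.randn(batch_size, seq_len, num_tags)
tags = torch.randint(0, num_tags, (batch_size, seq_len))
mask = torch.ones(batch_size, seq_len, dtype=torch.bool)

# Training: maximize the log-likelihood of the gold tag sequence.
loss = -crf(emissions, tags, mask=mask, reduction="mean")

# Inference: Viterbi decoding yields the most likely tag sequence per example.
best_paths = crf.decode(emissions, mask=mask)
print(loss.item(), best_paths)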
BERT serves as the baseline; in this experiment, BERT-Base-Chinese is selected as the baseline model, pre-trained with the Masked Language Model (MLM) and Next Sentence Prediction (NSP) tasks. The other models are all derived from it: wwm indicates that whole-word masking is used instead of character masking; ext indicates that the Chinese Wikipedia training corpus is extended with additional encyclopedia, news, and question-answering data; MacBERT and RoBERTa each make further changes to the training procedure.
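For reference, these baseline variants can be loaded through the Hugging Face hub. The checkpoint identifiers below are the commonly used public releases (Google's bert-base-chinese and the HFL releases for the wwm/ext/RoBERTa/MacBERT variants) and are our assumption, since the paper does not list the exact checkpoints used.

from transformers import AutoModel, AutoTokenizer

# Commonly used checkpoints for the variants discussed above (assumed IDs).
CHECKPOINTS = {
    "BERT-Base-Chinese": "bert-base-chinese",
    "BERT-wwm":          "hfl/chinese-bert-wwm",
    "BERT-wwm-ext":      "hfl/chinese-bert-wwm-ext",
    "RoBERTa-wwm-ext":   "hfl/chinese-roberta-wwm-ext",
    "MacBERT-base":      "hfl/chinese-macbert-base",
}

def load(name: str):
    """Load tokenizer and encoder for one of the baseline variants."""
    ckpt = CHECKPOINTS[name]
    return AutoTokenizer.from_pretrained(ckpt), AutoModel.from_pretrained(ckpt)

tokenizer, model = load("BERT-wwm-ext")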
The joint F1 of the i-th sub-question is the harmonic mean of its joint precision and joint recall. Denoting the answer precision, answer recall, supporting-fact precision, and supporting-fact recall of the i-th sub-question as ANS Precision_i, ANS Recall_i, Sup Precision_i, and Sup Recall_i, respectively, the joint precision and joint recall are computed as in Equations (15) and (16):
$$\mathrm{Joint\ Precision}_i = \mathrm{ANS\ Precision}_i \cdot \mathrm{Sup\ Precision}_i \tag{15}$$
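Equation (16), referenced above but not reproduced in this extract, presumably mirrors Equation (15) for recall; together with the harmonic-mean definition stated in words above, the joint metrics would read:

$$\mathrm{Joint\ Recall}_i = \mathrm{ANS\ Recall}_i \cdot \mathrm{Sup\ Recall}_i \tag{16}$$

$$\mathrm{Joint\ F1}_i = \frac{2 \cdot \mathrm{Joint\ Precision}_i \cdot \mathrm{Joint\ Recall}_i}{\mathrm{Joint\ Precision}_i + \mathrm{Joint\ Recall}_i}$$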
[1]
Yang Z, Qi P, Zhang S Z, et al. HotpotQA: a dataset for diverse, explainable multi-hop question answering[C]//Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, USA: Association for Computational Linguistics, 2018: 2369-2380.
[2]
Min S, Zhong V, Zettlemoyer L, et al. Multi-hop reading comprehension through question decomposition and rescoring[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, USA: Association for Computational Linguistics, 2019: 6097-6109.
[3]
Min S, Wallace E, Singh S, et al. Compositional questions do not necessitate multi-hop reasoning[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, USA: Association for Computational Linguistics, 2019: 4249-4257.
[4]
Perez E, Lewis P, Yih W T, et al. Unsupervised question decomposition for question answering[C]//Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). Stroudsburg, USA: Association for Computational Linguistics, 2020: 8864-8880.
[5]
Hasson M, Berant J. Question decomposition with dependency graphs[EB/OL]. (2021-04-17)[2021-10-23].
[6]
Zhang H Y, Cai J J, Xu J J, et al. Complex question decomposition for semantic parsing[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, USA: Association for Computational Linguistics, 2019: 4477-4486.
[7]
Gao Y J, Huang T H, Passonneau R J. ABCD: a graph framework to convert complex sentences to a covering set of simple sentences[C]//Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing. Stroudsburg, USA: Association for Computational Linguistics, 2021: 3919-3931.
[8]
Qi P, Lin X W, Mehr L, et al. Answering complex open-domain questions through iterative query generation[C]//Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Stroudsburg, USA: Association for Computational Linguistics, 2019: 2590-2602.
[9]
Cui Y M, Liu T, Che W X, et al. A span-extraction dataset for Chinese machine reading comprehension[C]//Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Stroudsburg, USA: Association for Computational Linguistics, 2019: 5882-5888.
[10]
Wolfson T, Geva M, Gupta A, et al. Break it down: a question understanding benchmark[J]. Transactions of the Association for Computational Linguistics, 2020, 8: 183-198.
[11]
Chen Q, Zhuo Z, Wang W. BERT for joint intent classification and slot filling[EB/OL]. (2019-02-28)[2021-05-28].
[12]
McCallum A. Efficiently inducing features of conditional random fields[C]//Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence. New York, USA: ACM, 2002: 403-410.
[13]
Cui Y M, Che W X, Liu T, et al. Pre-training with whole word masking for Chinese BERT[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021, 29: 3504-3514.
[14]
Cui Y M, Che W X, Liu T, et al. Revisiting pre-trained models for Chinese natural language processing[EB/OL]. (2020-04-29)[2021-10-20].