Abstract
Legal question answering systems built on pre-trained language models struggle to understand user intent flexibly and do not integrate external knowledge, so they often fall short of the expected performance. To address this, this paper proposes a fine-grained criminal law question answering dataset built on a statutory article library (Fine-grained Criminal Law Question Answering, FCL-QA). Based on this dataset, it further proposes a large language model-based Statutory Articles Retrieval Augmented Question Answering Framework (SaRAF). The core idea is to locate the topic of a question through multi-level classification, use that topic to narrow the range of candidate statutory articles for retrieval, and finally generate the answer with a large language model. Experimental results show that SaRAF outperforms both generation without statutory articles and conventional Retrieval-augmented Generation (RAG), achieving a ROUGE-L F1 score of 42.27%, a BLEU-4 score of 27.78%, and a BERTScore of 72.52% on the FCL-QA dataset.
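The three stages named in the abstract (multi-level classification, topic-restricted retrieval, generation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy statute library, the keyword rules standing in for a trained classifier, the token-overlap ranking, and the `generate` callable standing in for a large language model.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Article:
    article_id: str
    topic: tuple[str, str]  # (coarse level, fine level); hypothetical labels
    text: str


# Toy statute library; the entries are placeholders, not real statutory text.
LIBRARY = [
    Article("A1", ("property", "theft"), "placeholder text about theft of property"),
    Article("A2", ("property", "fraud"), "placeholder text about fraud and deception"),
    Article("A3", ("person", "injury"), "placeholder text about intentional injury"),
]

# Hypothetical keyword rules standing in for a trained multi-level classifier.
COARSE_RULES = {"property": {"theft", "stolen", "fraud"}, "person": {"injury", "hurt"}}
FINE_RULES = {
    "property": {"theft": {"theft", "stolen"}, "fraud": {"fraud"}},
    "person": {"injury": {"injury", "hurt"}},
}


def tokens(text: str) -> set[str]:
    """Lowercase whitespace tokenization (a deliberate simplification)."""
    return set(text.lower().split())


def best_label(words: set[str], rules: dict[str, set[str]]) -> str:
    """Return the label whose keyword set overlaps the question the most."""
    return max(rules, key=lambda label: len(words & rules[label]))


def classify(question: str) -> tuple[str, str]:
    """Two-level classification: coarse topic first, then a fine topic within it."""
    words = tokens(question)
    coarse = best_label(words, COARSE_RULES)
    fine = best_label(words, FINE_RULES[coarse])
    return coarse, fine


def retrieve(question: str, topic: tuple[str, str], k: int = 1) -> list[Article]:
    """Rank only the articles under the predicted topic by token overlap."""
    candidates = [a for a in LIBRARY if a.topic == topic]
    words = tokens(question)
    ranked = sorted(candidates, key=lambda a: len(words & tokens(a.text)), reverse=True)
    return ranked[:k]


def answer(question: str, generate: Callable[[str], str]) -> str:
    """Build a prompt from the retrieved articles and pass it to the generator."""
    topic = classify(question)
    articles = retrieve(question, topic)
    context = "\n".join(f"[{a.article_id}] {a.text}" for a in articles)
    prompt = f"Statutory articles:\n{context}\n\nQuestion: {question}\nAnswer:"
    return generate(prompt)
```

Passing an identity function as `generate` returns the assembled prompt, which is a convenient way to inspect what the classification and retrieval stages selected before any model is attached.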
Key words
large language model / question answering system / retrieval augmentation
Authors
Mingda LI (1), Hongbo DI (2), Yuanyuan SUN (1), Yanhua WANG (3), Zhihao YANG (1), Hongfei LIN (1)
1. School of Computer Science and Technology, Dalian University of Technology, Dalian 116000, China
2. Cybersecurity Protection Detachment, Dalian Municipal Public Security Bureau, Dalian 116000, China
3. Chinese People's Liberation Army Air Force Communications NCO Academy, Dalian 116000, China
Mingda LI (born 2000), male, from Xuchang, Henan; master's degree; research interests: natural language processing and question answering systems. E-mail: 2727782303@qq.com
Yuanyuan SUN, E-mail: syuan@dlut.edu.cn
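LI Mingda, DI Hongbo, SUN Yuanyuan, WANG Yanhua, YANG Zhihao, LIN Hongfei.
Research on Generative Legal Question Answering Based on Statutory Article Retrieval[J].
Journal of Shanxi University (Natural Science Edition), 2025, 48(04): 653-665. DOI: 10.13451/j.sxu.ns.2024159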
Funding
National Key Research and Development Program of China (2022YFC3301801)