Abstract
Mega projects generate vast numbers of safety hazard inspection records during construction. These records contain knowledge about the associations among multiple types of hazard elements and are an important reference for engineering safety management and control. However, manually extracting hazard source information and mining its internal associations is time-consuming and labor-intensive, which makes timely feedback to on-site management difficult. This paper proposes an intelligent extraction and knowledge mining method for hazard source entities based on the Universal Information Extraction (UIE) framework and an improved Apriori algorithm. First, a hazard source entity recognition model is built on the UIE framework, the prompt labels for entity extraction are defined, and the model is fine-tuned on a small number of samples to extract hazard source entities automatically, efficiently, and accurately. Then, an improved Apriori procedure that takes the constraints of hazard data types into account is proposed to mine and visualize multi-element association rules. A case study shows that the proposed hazard source entity extraction model achieves F1 scores of 0.892 and 0.886 on the validation and test sets, respectively, significantly higher than the base model's 0.253 and 0.307, and that the overall hazard source entity recognition rate improves by 36.66%. In addition, the strong multi-element association rules mined by the improved Apriori algorithm are visualized with Sankey diagrams and association network graphs, showing good interpretability. The method provides an efficient and intelligent technical means of mining knowledge from the massive safety hazard texts of mega projects and offers data support for formulating targeted safety control measures on construction sites.
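The type-constrained association rule mining mentioned in the abstract can be illustrated with a small Python sketch. It is not the authors' algorithm: the abstract does not state how the hazard data-type constraint is applied, so the code assumes that every item is a (type, value) pair and that a candidate itemset may hold at most one item of each type. The function names `typed_apriori` and `derive_rules`, the toy records, and the thresholds are all hypothetical.

```python
from itertools import combinations


def typed_apriori(transactions, min_support, max_len=3):
    """Return {itemset: support} for frequent itemsets.

    Assumption (not from the paper): items are (type, value) pairs and an
    itemset may contain at most one item per type.
    """
    tx = [frozenset(t) for t in transactions]
    n = len(tx)

    def support(itemset):
        return sum(1 for t in tx if itemset <= t) / n

    # Level 1: frequent single items.
    level = {}
    for item in {i for t in tx for i in t}:
        s = support(frozenset([item]))
        if s >= min_support:
            level[frozenset([item])] = s
    frequent = dict(level)

    k = 2
    while level and k <= max_len:
        candidates = set()
        for a, b in combinations(level, 2):
            union = a | b
            if len(union) != k:
                continue
            # Type constraint: skip candidates with two items of one type.
            types = [typ for typ, _ in union]
            if len(set(types)) != len(types):
                continue
            # Classic Apriori pruning: all (k-1)-subsets must be frequent.
            if all(frozenset(s) in level for s in combinations(union, k - 1)):
                candidates.add(union)
        level = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                level[c] = s
        frequent.update(level)
        k += 1
    return frequent


def derive_rules(frequent, min_confidence):
    """Return (lhs, rhs, support, confidence, lift) for each strong rule."""
    rules = []
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, r)):
                rhs = itemset - lhs
                conf = supp / frequent[lhs]
                if conf >= min_confidence:
                    rules.append((lhs, rhs, supp, conf, conf / frequent[rhs]))
    return rules


# Hypothetical toy records; real inputs would be the extracted entities.
records = [
    {("location", "tunnel"), ("source", "scaffold"), ("source", "cable"), ("category", "fall")},
    {("location", "tunnel"), ("source", "scaffold"), ("source", "cable"), ("category", "fall")},
    {("location", "tunnel"), ("source", "cable"), ("category", "electric")},
    {("location", "dam"), ("source", "scaffold"), ("category", "fall")},
]
frequent = typed_apriori(records, min_support=0.5)
strong_rules = derive_rules(frequent, min_confidence=0.9)
for lhs, rhs, supp, conf, lift in strong_rules:
    print(sorted(lhs), "->", sorted(rhs), round(supp, 2), round(conf, 2), round(lift, 2))
```

Under this assumption, the pair of two `source` items is never counted, even though it co-occurs in half of the toy records. The constraint therefore shrinks the candidate set and also keeps same-type pairings out of the rules. The resulting (antecedent, consequent, support) triples are the kind of data a Sankey diagram or association network would be drawn from.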
Key words
mega project / safety hazards / universal information extraction (UIE) / knowledge mining / natural language processing (NLP)
Author summary
Guoping LIU (刘国平) [1], Xin LI (李欣) [2], Donghai LIU (刘东海) [2], Shijie ZHOU (周诗杰) [2], Hongyan WU (吴红艳) [1]
1. Yangtze Three Gorges Technology and Economy Development Co., Ltd., Beijing 101100, China
2. State Key Laboratory of Hydraulic Engineering Intelligent Construction and Operation, Tianjin University, Tianjin 300350, China
Guoping LIU (born 1970), male, professor-level senior engineer, master's degree, mainly engaged in engineering construction management. E-mail: liu_guoping@ctg.com.cn
Donghai LIU. E-mail: liudh@tju.edu.cn
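LIU Guoping, LI Xin, LIU Donghai, ZHOU Shijie, WU Hongyan. Hazard source extraction and knowledge mining method for mega projects based on UIE and improved Apriori [J]. 水利水电技术(中英文) (Water Resources and Hydropower Engineering), 2025, 56(S1): 102-110. DOI:10.13928/j.cnki.wrahe.2025.S1.016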
Funding
Enterprise Scientific Research Project of China Three Gorges Corporation (中国长江三峡集团有限公司) (202103551)