Abstract
The construction of concrete dams generates a large amount of material information expressed as unstructured text, which is of great significance for engineering quality inspection and for further research and development of materials. Owing to the limitations of data management technology, much of this material text is stored as images, which are difficult to edit and use directly and cannot meet the needs of intelligent analysis and management of concrete material data. In addition, there is currently no intelligent information extraction mechanism for massive material text data, so key information is difficult to obtain from the text efficiently. An intelligent interpretation method for concrete material text based on image data is therefore proposed; it recognizes the text information in image data and improves the efficiency of detecting and recognizing inclined material text. Building on the interpreted image data and starting from multi-perspective text feature relationships, an intelligent analysis technique for concrete material text is established that uses the MMR algorithm as its framework, combines the BERT model with the TF-IDF algorithm, and accounts for both text semantics and the importance of professional terminology, in order to extract key information from concrete material text. On actual concrete material text, the method extracts keywords with an accuracy of 86.67%, outperforming other commonly used keyword extraction models. The research results provide a new method for processing non-editable text data on concrete materials and help to raise the level of intelligent management of concrete material data.
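The abstract does not state how the BERT-based semantics and the TF-IDF term weights are combined inside the MMR (maximal marginal relevance) framework. The Python sketch below shows one plausible arrangement and is not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the mixing parameter `alpha`, the trade-off parameter `lam`, the smoothed IDF form, and the toy 2-D vectors that stand in for BERT embeddings.

```python
import math
from collections import Counter


def tf_idf(candidates, doc_tokens, corpus):
    """Smoothed TF-IDF weight of each candidate term within one document."""
    counts = Counter(doc_tokens)
    n_docs = len(corpus)
    weights = {}
    for term in candidates:
        tf = counts[term] / len(doc_tokens)
        df = sum(1 for doc in corpus if term in doc)
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0  # assumed smoothing
        weights[term] = tf * idf
    return weights


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr_keywords(doc_vec, cand_vecs, weights, top_k=2, lam=0.5, alpha=0.5):
    """Greedy MMR selection of keywords.

    Relevance mixes semantic similarity to the document (weight alpha) with
    the max-normalized TF-IDF weight (weight 1 - alpha); this mix is a
    hypothetical choice. Redundancy is the highest similarity to any
    keyword already selected.
    """
    max_w = max(weights.values()) or 1.0
    relevance = {
        t: alpha * cosine(v, doc_vec) + (1 - alpha) * weights[t] / max_w
        for t, v in cand_vecs.items()
    }
    selected, remaining = [], sorted(cand_vecs)  # sorted for determinism
    while remaining and len(selected) < top_k:
        def score(t):
            redundancy = max(
                (cosine(cand_vecs[t], cand_vecs[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance[t] - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: three candidate terms with made-up 2-D "embeddings".
doc_tokens = ["cement", "cement", "binder", "aggregate", "the", "mix"]
corpus = [doc_tokens, ["the", "mix", "report"], ["the", "cement", "test"]]
cand_vecs = {"cement": [1.0, 0.9], "binder": [1.0, 0.8], "aggregate": [0.0, 1.0]}
doc_vec = [1.0, 1.0]

weights = tf_idf(cand_vecs, doc_tokens, corpus)
print(mmr_keywords(doc_vec, cand_vecs, weights))           # diversity-aware
print(mmr_keywords(doc_vec, cand_vecs, weights, lam=1.0))  # relevance only
```

On this toy data the diversity-aware call selects "cement" and then "aggregate", because "binder" is nearly identical to "cement" in the toy vector space, whereas with `lam=1.0` the redundancy penalty vanishes and "binder" is chosen second. In a real pipeline the vectors would come from a BERT encoder and the candidates from the recognized material text.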
Key words
concrete dam / material data / text detection / intelligent recognition / key information
Author summary
Xufang DENG (1), Leping LIU (2), Zhenghu CHEN (1), Heng ZHONG (1), Yuangeng LYU (2), Jingyi FENG (2)
1. China Yangtze Power Co., Ltd., Wuhan 430014, Hubei, China
2. State Key Laboratory of Hydraulic Engineering Intelligent Construction and Operation, Tianjin University, Tianjin 300350, China
Xufang DENG (born 1987), male, engineer, chief specialist in hydraulic metal structures, bachelor's degree; mainly engaged in technical research on the operation and maintenance of hydraulic structures and on dam safety management. Email: deng_xufang@ctg.com.cn
Leping LIU. Email: liuleping@tju.edu.cn
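Xufang DENG, Leping LIU, Zhenghu CHEN, Heng ZHONG, Yuangeng LYU, Jingyi FENG.
Intelligent recognition and analysis of concrete material text from image data[J].
Water Resources and Hydropower Engineering, 2025, 56(S1): 85-94 DOI:10.13928/j.cnki.wrahe.2025.S1.014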
Funding
Scientific research project of China Yangtze Power Co., Ltd. (Z212302036)