Unstructured text analysis and table database construction of hydraulic concrete materials

Yang Huamei, Liu Leping, Li Wenwei, Deng Xufang, Li Shuguang, Chen Zhenghu, Deng Lun

Water Resources and Hydropower Engineering, 2025, Vol. 56, Issue S2: 66-70. DOI: 10.13928/j.cnki.wrahe.2025.S2.016
Special column: Knowledge-driven smart EPC management and control technology for Yangtze River protection

Abstract

During the historical construction of water conservancy projects, limited text informatization left a large volume of hydraulic concrete material documents stored only as paper texts and scanned images. These non-editable documents make it difficult to use the material data directly and effectively, which greatly increases the difficulty of applying material knowledge. This paper proposes a document parsing method based on machine vision and deep learning that accurately and efficiently converts the text information and table data of hydraulic concrete material documents into editable form. Based on the parsed table information, a table database of hydraulic concrete materials is then constructed, enabling efficient querying and unified management of concrete material data. The feasibility of the method is verified on hydraulic concrete material documents from an actual project. The results show that the accuracy of every subtask of the document parsing method reaches 90% or higher, which supports the automated reuse of non-editable concrete material resources and improves data service capability in water conservancy engineering.
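
The abstract does not state which database engine or schema the authors used. As a rough illustration of what a table database for parsed documents might look like, the sketch below stores each recognized table as header-keyed cells in SQLite and queries one column across all tables. Everything in it is an assumption made for illustration: the schema, the function names `build_table_db` and `query_column`, the input dictionary keys, and the sample values, which are made up and are not data from the paper. It also assumes that column names within one table are unique.

```python
import sqlite3


def build_table_db(parsed_tables):
    """Load parsed tables into an in-memory SQLite database.

    parsed_tables is a list of dicts with hypothetical keys:
    'document', 'title', 'header' (list of str), 'rows' (list of list of str).
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE tables (
            table_id INTEGER PRIMARY KEY,
            document TEXT NOT NULL,
            title    TEXT NOT NULL
        );
        CREATE TABLE cells (
            table_id    INTEGER NOT NULL REFERENCES tables(table_id),
            row_idx     INTEGER NOT NULL,
            column_name TEXT NOT NULL,
            value       TEXT,
            PRIMARY KEY (table_id, row_idx, column_name)
        );
        """
    )
    for table in parsed_tables:
        cur = conn.execute(
            "INSERT INTO tables (document, title) VALUES (?, ?)",
            (table["document"], table["title"]),
        )
        table_id = cur.lastrowid
        for row_idx, row in enumerate(table["rows"]):
            # One database row per recognized cell, keyed by its column header.
            conn.executemany(
                "INSERT INTO cells VALUES (?, ?, ?, ?)",
                [
                    (table_id, row_idx, name, value)
                    for name, value in zip(table["header"], row)
                ],
            )
    conn.commit()
    return conn


def query_column(conn, column_name):
    """Return (document, title, row_idx, value) for one column in every table."""
    return conn.execute(
        """
        SELECT t.document, t.title, c.row_idx, c.value
        FROM cells AS c
        JOIN tables AS t ON c.table_id = t.table_id
        WHERE c.column_name = ?
        ORDER BY t.table_id, c.row_idx
        """,
        (column_name,),
    ).fetchall()


# Made-up sample standing in for the output of a document parser.
SAMPLE_TABLES = [
    {
        "document": "report_A.pdf",
        "title": "Mix proportions",
        "header": ["mix_id", "water_cement_ratio", "strength_28d_mpa"],
        "rows": [["M1", "0.45", "38.2"], ["M2", "0.50", "33.5"]],
    }
]

conn = build_table_db(SAMPLE_TABLES)
for record in query_column(conn, "water_cement_ratio"):
    print(record)
```

Storing cells in a long, header-keyed form lets tables with different column sets share one schema, at the cost of keeping every value as text. A production design would likely add typed columns and units once the table layouts are known.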

Key words

hydraulic concrete materials / layout structure division / text detection and recognition / table database

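Cite this article

Yang Huamei, Liu Leping, Li Wenwei, Deng Xufang, Li Shuguang, Chen Zhenghu, Deng Lun. Unstructured text analysis and table database construction of hydraulic concrete materials[J]. Water Resources and Hydropower Engineering, 2025, 56(S2): 66-70. DOI: 10.13928/j.cnki.wrahe.2025.S2.016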

Funding

Supported by a scientific research project of China Yangtze Power Co., Ltd. (Z212302036)
