Geological Knowledge Extraction and Information Mining via the Fusion of Knowledge Graphs and LLMs: A Case Study of Carlin-Type Gold Deposits
Massive volumes of unstructured data in geological exploration remain difficult to use effectively, and general-purpose Large Language Models (LLMs) suffer from factual hallucination and a lack of specialized domain logic. To address these problems, we propose an intelligent knowledge mining framework for vertical domains that integrates Knowledge Graphs (KG) with Retrieval-Augmented Generation (RAG). The framework is validated through a case study that summarizes and compares the metallogenic regularities of Carlin-type gold deposits in southwestern Guizhou, China, and in Nevada, USA. First, a RAG-based intelligent question-answering system was built on a locally deployed DeepSeek-32B model. Through vector retrieval and generative reading comprehension, the system achieves precise source tracing of professional knowledge and highly reliable question answering (Q&A). Second, using Supervised Fine-Tuning (SFT) of the LLM, we efficiently constructed a cross-regional metallogenic knowledge graph from hundreds of multi-source, heterogeneous geological documents; the graph systematically covers stratigraphy, structure, alteration minerals, and ore-controlling factors. The experimental results show that the proposed system significantly outperforms GPT-4o in objective accuracy. In subjective content generation, it exhibits high faithfulness and full traceability, and it effectively mitigates hallucination. Analyses based on graph topology not only quantitatively reveal the macroscopic similarities and differences in gold mineralization between the two regions, but also quantify the cascading indicator pathways that run from ore entities through alteration assemblages to geochemical element anomalies, confirming the system's ability to discover implicit clues for mineral exploration. This study achieves the intelligent transformation of unstructured text into structured knowledge and its in-depth mining, offering a new technical pathway for the "data-rich yet knowledge-poor" dilemma in the geosciences.
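The graph-topology analysis mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the triples, the region names `REGION_A` and `REGION_B`, the relation labels, and the helper functions `entity_set`, `jaccard`, and `cascade_paths` are all hypothetical placeholders. The sketch assumes that the similarity between two regional subgraphs can be approximated by the Jaccard overlap of their entity sets, and that a cascading indicator pathway is a directed path from an ore entity to a node with no outgoing edges.

```python
"""Toy sketch: compare two regional knowledge graphs and trace indicator paths."""

# Hypothetical placeholder triples (head, relation, tail). They are NOT taken
# from the paper's knowledge graph and make no claim about any real region.
REGION_A = [
    ("arsenian pyrite", "associated_with", "silicification"),
    ("arsenian pyrite", "associated_with", "decarbonatization"),
    ("silicification", "indicated_by", "As anomaly"),
    ("decarbonatization", "indicated_by", "Sb anomaly"),
]
REGION_B = [
    ("arsenian pyrite", "associated_with", "decarbonatization"),
    ("arsenian pyrite", "associated_with", "argillization"),
    ("decarbonatization", "indicated_by", "As anomaly"),
    ("argillization", "indicated_by", "Hg anomaly"),
]


def entity_set(triples):
    """Collect every head and tail entity that appears in the triples."""
    return {entity for head, _, tail in triples for entity in (head, tail)}


def jaccard(a, b):
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cascade_paths(triples, start, max_hops=3):
    """Enumerate directed paths from `start` that end at a node without
    unvisited successors, or that reach `max_hops` edges."""
    adjacency = {}
    for head, _, tail in triples:
        adjacency.setdefault(head, []).append(tail)

    paths = []
    stack = [[start]]
    while stack:
        path = stack.pop()
        # Skip nodes already on the path so cycles cannot loop forever.
        successors = [t for t in adjacency.get(path[-1], []) if t not in path]
        if successors and len(path) - 1 < max_hops:
            stack.extend(path + [t] for t in successors)
        elif len(path) > 1:
            paths.append(path)
    return sorted(paths)


if __name__ == "__main__":
    overlap = jaccard(entity_set(REGION_A), entity_set(REGION_B))
    print(f"Entity overlap between the two toy graphs: {overlap:.3f}")
    for path in cascade_paths(REGION_A, "arsenian pyrite"):
        print(" -> ".join(path))
```

Entity-set overlap is only one of many possible topological measures; the paper may rely on different metrics such as degree or centrality statistics, so the sketch should be read as an illustration of the idea rather than a reproduction of the reported analysis.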
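Young Scientist Project of the National Deep Earth Major Project (2024ZD10019007)
Geological Research Project of the Guizhou Bureau of Geology and Mineral Resources (Qian Dizhi Kehe [2025] No. 01)
Fundamental Research Funds for the Central Universities (GUG-DMX2025-01)
National Undergraduate Innovation Training Program (202510491034)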