Construction of a Knowledge Graph for Rare Metal Deposits in South China Based on Large Language Models
A new generation of artificial intelligence technologies, represented by large language models (LLMs), offers new opportunities for the structured representation of, and intelligent reasoning over, geoscience knowledge. Geological knowledge systems are complex, and the semantics of unstructured geological texts are scattered, hard to reuse, and hard to visualize. To address these problems, this study takes the rare metal deposits of South China as its research object and proposes a unified knowledge graph construction strategy that integrates deposit genesis with prospecting indicators. Using the DeepSeek R1-32B large language model and prompt engineering, a knowledge graph covering key rare metals such as Li, Be, Nb, and Ta was automatically extracted and constructed from a large body of geological literature. The knowledge graph and its extended analysis indicate that rare metal mineralization in South China is closely associated with Indosinian and Yanshanian magmatism and shows pronounced signatures of high-degree fractionation and magmatic-hydrothermal processes; the rare metal elements exhibit a Li-Be-Nb-Ta-W-Sn association anomaly. In summary, the LLM-based knowledge graph reveals the multi-stage metallogenic mechanism of rare metals in South China, clarifies the intrinsic links among geochemical anomalies, structural controls, and alteration zoning in rare metal deposits, and provides an intelligent research approach for rare metal exploration in South China and adjacent regions.
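The extraction-to-graph step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the prompt wording, the JSON triple format, and the helper names `build_prompt`, `parse_triples`, and `build_graph` are all assumptions, and the call to the language model itself is deliberately left out.

```python
from __future__ import annotations

import json
from collections import defaultdict

# Hypothetical prompt template; the paper's actual prompts are not given here.
PROMPT_TEMPLATE = (
    "Extract (head, relation, tail) triples about deposit genesis and "
    "prospecting indicators from the text below. Answer with a JSON list "
    "of objects that have the keys head, relation, and tail.\n\n"
    "Text: {passage}"
)


def build_prompt(passage: str) -> str:
    """Fill the (assumed) prompt template with one literature passage."""
    return PROMPT_TEMPLATE.format(passage=passage)


def parse_triples(llm_output: str) -> list[tuple[str, str, str]]:
    """Parse a JSON list of triples, silently skipping malformed items."""
    try:
        items = json.loads(llm_output)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []
    triples = []
    for item in items:
        if not isinstance(item, dict):
            continue
        values = [item.get(key) for key in ("head", "relation", "tail")]
        if all(isinstance(v, str) and v.strip() for v in values):
            triples.append(tuple(v.strip() for v in values))
    return triples


def build_graph(triples):
    """Store triples as an adjacency map: head -> set of (relation, tail)."""
    graph = defaultdict(set)
    for head, relation, tail in triples:
        graph[head].add((relation, tail))
    return graph


if __name__ == "__main__":
    # Illustrative model output only; these triples are not data from the paper.
    mock_output = json.dumps([
        {"head": "rare metal mineralization", "relation": "associated_with",
         "tail": "Yanshanian magmatism"},
        {"head": "rare metal mineralization", "relation": "associated_with",
         "tail": "Indosinian magmatism"},
    ])
    graph = build_graph(parse_triples(mock_output))
    for head, edges in graph.items():
        for relation, tail in sorted(edges):
            print(f"{head} -[{relation}]-> {tail}")
```

Storing edges in a set means a statement repeated across several papers collapses into one edge, which is one simple way to merge evidence; a real pipeline would likely also normalize entity names before merging.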
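Project for Prospecting Target Area Optimization and Prospecting Target Positioning for Key Minerals in Guangdong Province (2024-47)
Natural Science Foundation of Guangdong Province (2024A1515030216)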