1 Key Laboratory of Marine Reservoir Evolution and Hydrocarbon Enrichment Mechanism (Ministry of Education), School of Energy Resources, China University of Geosciences (Beijing), Beijing 100083, China
2 Research Institute of Petroleum Exploration and Development, PetroChina, Beijing 100083, China
LI Shengli, born in 1971, Ph.D., is a professor and doctoral supervisor at the School of Energy Resources, China University of Geosciences (Beijing). He is mainly engaged in research on sedimentary reservoirs and development geology. E-mail: slli@cugb.edu.cn.
YAN Jiafei, born in 1999, is a graduate student at China University of Geosciences (Beijing), specializing in geological resources and geological engineering. E-mail: 1275585596@qq.com.
The identification and prediction of shale lithofacies are crucial for identifying favorable intervals ("sweet spots") in shale oil and gas reservoirs. In the absence of core data, logging data play a key role in lithofacies analysis at the single-well level. By applying the XGBoost algorithm, useful lithofacies information can be extracted from multidimensional logging data, enabling effective prediction of shale lithofacies in individual wells. In this study, XGBoost, a supervised machine learning algorithm, is used to build a predictive model based on conventional logging datasets. First, a lithofacies classification scheme tailored to the study area is established, which captures the variability of shale lithofacies. The boundaries of mineral composition and TOC content for the different lithofacies types are then determined using statistical proportion analysis. During model construction, redundant variables should be eliminated, because highly correlated features provide overlapping information and may cause overfitting. A grid search allows comprehensive tuning of the XGBoost parameters: multiple rounds of optimization are conducted, with the search range gradually narrowed to determine the optimal parameter set. Using the Zanzijing block in the Songnan area as a case study, five major shale lithofacies types are defined based on mineral composition, sedimentary structures, and TOC content. During variable selection, for instance, only one of the highly correlated LLD and LLS logs is retained, which improves model accuracy by approximately 4%. After feature selection and parameter tuning, the final model achieves a lithofacies prediction accuracy of up to 90.03%.
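The XGBoost algorithm is a boosted-tree machine learning system that grew out of extensive research on gradient boosting (Chen and Guestrin, 2016), and it is a supervised machine learning algorithm with well-demonstrated performance. It is generally not well suited to data that are overly sparse or one-dimensional, because the model may then struggle to learn useful information (Zhu et al., 2021). For multidimensional data, however, it is widely used because of its high accuracy, flexibility, good scalability, and fast computation (Zheng et al., 2022). XGBoost is suited to complex nonlinear problems, especially data with many features, and it is reasonably robust to noisy data (Li and Liu, 2019). Multiple logging curves can therefore make full use of the strengths of XGBoost: the useful information in the curves is captured and linked to shale lithofacies, so that shale lithofacies can be identified and predicted fairly accurately. When building the model, feature selection and parameter tuning are effective ways to improve its accuracy.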
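The parameter tuning mentioned above can be illustrated with a coarse-to-fine grid search. The following is a minimal sketch, not the authors' procedure: `toy_score` is a hypothetical stand-in for the cross-validated accuracy of an XGBoost model, and the candidate values for `max_depth` and `learning_rate` are assumptions chosen only for illustration.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Params = Dict[str, float]


def grid_search(score: Callable[[Params], float],
                grid: Dict[str, List[float]]) -> Tuple[Params, float]:
    """Evaluate every parameter combination and return the best one."""
    names = list(grid)
    best_params: Params = {}
    best_score = float("-inf")
    for combo in product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        current = score(params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score


def toy_score(p: Params) -> float:
    # Hypothetical accuracy surface (an assumption, not measured data).
    # In practice this would train a model and return a validation score.
    return 0.9 - 0.01 * (p["max_depth"] - 5) ** 2 - 2.0 * (p["learning_rate"] - 0.08) ** 2


# Round 1: a wide, coarse grid.
coarse = {"max_depth": [2, 6, 10], "learning_rate": [0.01, 0.1, 0.5]}
best_coarse, coarse_score = grid_search(toy_score, coarse)

# Round 2: a narrower grid centred on the best coarse result.
d, lr = best_coarse["max_depth"], best_coarse["learning_rate"]
fine = {"max_depth": [d - 1, d, d + 1], "learning_rate": [lr / 2, lr, lr * 2]}
best_fine, fine_score = grid_search(toy_score, fine)

print(best_coarse, best_fine)
```

Each round costs as many model fits as there are grid points, so narrowing the range between rounds keeps the total cost far below that of a single dense grid.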
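Shale is strongly heterogeneous: it varies considerably in space and time, its organic matter is unevenly distributed, and its lamina structures and their combination types differ markedly. Predicting the distribution of organic-rich shale within a basin and evaluating shale oil and gas "sweet spot" intervals and areas therefore requires a clear understanding of how shale lithofacies vary (Zhang, 2013). Using mineral composition and TOC content data, together with sedimentary structures such as laminae picked from image logs, lithofacies can be discriminated from logs (Lai et al., 2023). Mineral content directly reflects the mineral constituents of a rock and their proportions, and plays an important role in naming shale lithofacies (Shen et al., 2021). The shale lithofacies in this paper are therefore named by combining sedimentary structure with mineral type. For ease of use, the names are simplified to avoid excessive length, while traditional names and their meanings are retained (Peng and Guo, 2023).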
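True lithofacies are first identified from core thin-section observation, image logs, TOC content, and mineral content. A machine learning model is then built to link these lithofacies to conventional logs, so that lithofacies can be identified in other wells in the area that lack cores. In terms of sedimentary structure, the lithofacies classification in this paper distinguishes bedded from laminated structures (Table 1). Bedded structure appears mainly as siltstone (or fine sandstone) intercalated in mud shale intervals, with a single-layer thickness greater than 1 cm. The boundaries between layers are clear, and the layers are roughly parallel (Table 1). Shale laminae, by contrast, reflect the sedimentary structure within a bed (Tian et al., 2023). Laminae, also called shale lamination, are layered structures formed by changes in the depositional environment during sedimentation, and they are usually less than 1 cm thick (Pang et al., 2024). In shale they commonly appear as thin, parallel layers (Che, 2018; Pang et al., 2023). Laminae are a distinctive fabric of shale, and their development directly causes marked variations in shale heterogeneity, which in turn affect hydrocarbon generation, reservoir capacity, and gas-bearing properties (Xu et al., 2021). In shale reservoir evaluation, the characteristics and degree of development of laminae are also important indicators (Sun et al., 2022). Laminated structure not only reflects the microstructure and storage performance of a shale reservoir but also directly affects fracture propagation and fracturing performance during volume fracturing of horizontal wells (He et al., 2021), so this sedimentary structure is particularly important when classifying shale lithofacies types.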
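XGBoost is a supervised ensemble learning algorithm that introduces a series of improvements and optimizations to gradient boosting. Its core idea is to fit residuals repeatedly, building a sequence of weak learners and combining them into a strong learner. Compared with traditional gradient boosting, XGBoost has notable advantages: it uses second-order derivative information (Chen and Guestrin, 2016), which speeds up convergence, and it can handle sparse data. XGBoost has performed well in many data mining and machine learning tasks (Li et al., 2022), such as classification, regression, and ranking (Luo et al., 2022; Shi et al., 2022).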
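The residual-fitting idea can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the XGBoost implementation: it uses a single feature, depth-1 trees (stumps), squared-error loss, and a split score that omits XGBoost's 1/2 factor and complexity penalty. The names `fit_stump`, `boost`, `eta`, and `lam`, and the toy data, are hypothetical. For squared error the gradient `g` is the negative residual and the second derivative `h` is 1, and each leaf weight is `-G / (H + lam)`.

```python
from typing import List, Optional, Tuple

# (threshold, left leaf weight, right leaf weight); threshold None means "no split"
Stump = Tuple[Optional[float], float, float]


def fit_stump(x: List[float], g: List[float], h: List[float], lam: float = 1.0) -> Stump:
    """Pick the threshold with the largest gain computed from gradient/hessian sums."""
    def weight(G: float, H: float) -> float:
        return -G / (H + lam)          # optimal leaf weight from first and second derivatives

    def score(G: float, H: float) -> float:
        return G * G / (H + lam)       # proportional to the loss reduction of a leaf

    G_all, H_all = sum(g), sum(h)
    w_all = weight(G_all, H_all)
    best_gain, best = 0.0, (None, w_all, w_all)
    order = sorted(range(len(x)), key=lambda i: x[i])
    G_left = H_left = 0.0
    for a, b in zip(order, order[1:]):
        G_left += g[a]
        H_left += h[a]
        if x[a] == x[b]:
            continue                   # cannot split between identical values
        G_right, H_right = G_all - G_left, H_all - H_left
        gain = score(G_left, H_left) + score(G_right, H_right) - score(G_all, H_all)
        if gain > best_gain:
            best_gain = gain
            best = ((x[a] + x[b]) / 2, weight(G_left, H_left), weight(G_right, H_right))
    return best


def stump_output(stump: Stump, xi: float) -> float:
    threshold, w_left, w_right = stump
    return w_left if threshold is None or xi < threshold else w_right


def boost(x: List[float], y: List[float], rounds: int = 50,
          eta: float = 0.3, lam: float = 1.0) -> Tuple[List[Stump], List[float]]:
    """Add one stump per round, each fitted to the current residuals."""
    pred = [0.0] * len(y)
    stumps: List[Stump] = []
    for _ in range(rounds):
        g = [p - t for p, t in zip(pred, y)]   # gradient of 0.5 * (p - t)^2
        h = [1.0] * len(y)                     # second derivative of the same loss
        stump = fit_stump(x, g, h, lam)
        stumps.append(stump)
        pred = [p + eta * stump_output(stump, xi) for p, xi in zip(pred, x)]
    return stumps, pred


def mse(pred: List[float], y: List[float]) -> float:
    return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)


# Toy data (an assumption for illustration): a step from 0 to 1 at x = 3.5.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
stumps, pred = boost(x, y)
print(round(mse(pred, y), 6))
```

Each round shrinks the remaining residual by a fixed fraction controlled by `eta`, which is why many weak learners together approximate the target closely.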
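Boosting is an ensemble learning framework, and XGBoost is an efficient implementation of it (Liu et al., 2019). In boosting, a series of models is usually generated iteratively and serially, and these models are then combined by a linear weighted sum to give the final ensemble learner. Suppose that m-1 iterations have been completed and the resulting ensemble model is F_{m-1}. The next iteration trains F_m, which should be the model that minimizes the loss of the new ensemble on the training set (He et al., 2016) (Fig. 2).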
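The iteration described above can be written compactly as follows. The notation is an assumption introduced here for illustration: f_m is the learner added at step m, L is the loss, Ω is the regularization term, n is the number of training samples, and g_i and h_i are the first and second derivatives of the loss with respect to the previous prediction. The second line is the second-order approximation that XGBoost optimizes at each step (Chen and Guestrin, 2016).

```latex
F_m(x) = F_{m-1}(x) + f_m(x), \qquad
f_m = \arg\min_{f} \sum_{i=1}^{n} L\bigl(y_i,\; F_{m-1}(x_i) + f(x_i)\bigr) + \Omega(f)

\sum_{i=1}^{n} L\bigl(y_i,\; F_{m-1}(x_i) + f(x_i)\bigr)
\approx \sum_{i=1}^{n} \Bigl[ L\bigl(y_i, F_{m-1}(x_i)\bigr) + g_i f(x_i) + \tfrac{1}{2} h_i f(x_i)^2 \Bigr],
\quad
g_i = \frac{\partial L}{\partial F_{m-1}(x_i)}, \;
h_i = \frac{\partial^2 L}{\partial F_{m-1}(x_i)^2}
```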
2.2 XGBoost variable importance measures
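Based on the variable importance measures, feature selection can be performed to remove unimportant or redundant variables, which reduces the computational complexity of the model and improves its generalization ability (Xue et al., 2024).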
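One simple way to flag redundant variables is to drop any log that is highly correlated with a log already kept. The sketch below is illustrative only: the Pearson criterion, the 0.95 threshold, the function name `drop_correlated`, and the log values are assumptions, not the criterion or data used in the study. It mirrors the case in which only one of the LLD and LLS logs is retained.

```python
from math import sqrt
from typing import Dict, List


def pearson(a: List[float], b: List[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def drop_correlated(logs: Dict[str, List[float]], threshold: float = 0.95) -> List[str]:
    """Keep a log only if its |r| with every previously kept log is below the threshold."""
    kept: List[str] = []
    for name, values in logs.items():
        if all(abs(pearson(values, logs[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Synthetic values for illustration only (not measurements from the study area).
logs = {
    "GR":  [80.0, 95.0, 70.0, 110.0, 60.0],
    "AC":  [250.0, 265.0, 240.0, 245.0, 270.0],
    "LLD": [10.0, 20.0, 15.0, 40.0, 30.0],
    "LLS": [9.0, 19.0, 16.0, 38.0, 29.0],
}
kept = drop_correlated(logs)
print(kept)
```

Because the first log of each correlated pair is the one retained, the order of the dictionary decides which of two redundant curves survives. Ranking the logs by model importance before filtering is one way to make that choice deliberate.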
[Che S Q. 2018. Shale lithofacies identification and classification by using logging data: a case of Wufeng-Longmaxi Formation in Fuling Gas Field, Sichuan Basin. Lithologic Reservoirs, 30(1): 121-132]
[Chen D H, Sui Q L, Zhao X J, Jing D L, Teng J X, Gao Y B. 2019. Geology, geochemical characteristics, and sedimentary environment of Mn-bearing carbonate from the Late Carboniferous Muhu manganese deposit in West Kunlun. Acta Sedimentologica Sinica, 37(3): 477-490]
[Fu X L, Meng Q A, Zheng Q, Wang Z J, Jin M Y, Bai Y, Cui K N. 2022. Cyclicity of organic matter abundance and lithofacies paleogeography of Gulong shale in Songliao Basin. Petroleum Geology & Oilfield Development in Daqing, 41(3): 38-52]
[He W, Chen Y, Lei Y X, Qian C, Chen K, Lin T, Song T. 2021. Analyses of the relationship between lithology and shale gas accumulation for the Wufeng Formation to Longmaxi Formation in the west of Hubei Province: a case study of the Erhongdi 1 well. Journal of China Coal Society, 46(3): 1014-1023]
[Lai J, Li H B, Zhang M, Bai M M, Zhao Y D, Fan Q X, Pang X J, Wang G W. 2023. Advances in well logging geology in the era of unconventional hydrocarbon resources. Journal of Palaeogeography (Chinese Edition), 25(5): 1118-1138]
[Li N, Feng Z, Wu H L, Tian H, Liu P, Liu Y M, Liu Z H, Wang K W, Xu B S. 2023. New advances in methods and technologies for well logging evaluation of continental shale oil in China. Acta Petrolei Sinica, 44(1): 28-44]
[Liu Z B, Liu G X, Hu Z Q, Feng D J, Zhu T, Bian R K, Jiang T, Jin Z G. 2019. Lithofacies types and assemblage features of continental shale strata and their significance for shale gas exploration: a case study of the Middle and Lower Jurassic strata in the Sichuan Basin. Natural Gas Industry, 39(12): 10-21]
[Liu B, Sun J H, Zhang Y Q, He J L, Fu X F, Yang L, Xing J L, Zhao X Q. 2021. Reservoir space and enrichment model of shale oil in the first member of Cretaceous Qingshankou Formation in the Changling Sag, southern Songliao Basin, NE China. Petroleum Exploration and Development, 48(3): 521-535]
[Luo Y H, Ge Z J, Shen T S, Hong Y F, Lin B, Liu Z B. 2022. The identification method and system of continental shale facies based on convolutional neural network are introduced in this paper. Chinese Patent: CN114881171A. 2024-11-29]
[Mao Y D. 2023. Identification method of shale lithofacies by logging. Petroleum Knowledge, (3): 54-55]
[Pang X J, Wang G W, Kuang L C, Zhao F, Li H B, Han Z Y, Bai T Y, Lai J. 2023. Logging evaluation of lithofacies and their assemblage under control of sedimentary environment: a case study of the Qingshankou Formation in Gulong sag, Songliao Basin. Journal of Palaeogeography (Chinese Edition), 25(5): 1156-1175]
[Peng J, Zeng Y, Yang Y M, Yu L D, Xu T Y. 2022. Discussion on classification and naming scheme of fine-grained sedimentary rocks. Petroleum Exploration and Development, 49(1): 106-115]
[Peng L, Wu Y M, Lian Z G, Peng P, Wang J, Su Z, Yi Z L. 2019. Features and sedimentary evolution of high-frequency sequence in continental lacustrine rift basin: example of the lower Shahejie member 3 in Jiyang Depression, Bohai Bay Basin. Oil & Gas Geology, 40(4): 789-798]
[Shen C, Ren L, Zhao J Z, Chen M P. 2021. Division of shale lithofacies associations and their impact on fracture network formation in the Silurian Longmaxi Formation, Sichuan Basin. Oil & Gas Geology, 42(1): 98-106, 123]
[Shen L, Wang C Z, Ning C Q, Liu Y M, Wang H. 2023. Well-log lithofacies classification based on machine learning for Chang-7 member in Longdong area of Ordos Basin. Petroleum Reservoir Evaluation and Development, 13(4): 525-536]
[Wang M, Yang J L, Wang X, Li J B, Xu L, Yan Y. 2023. Identification of shale lithofacies by well logs based on random forest algorithm. Earth Science, 48(1): 130-142]
[Xue C Q, Wu J G, Zhang J, Zhang S R, Wu X, Cheng L, Zhong J H. 2021. The application of machine learning in shale lithofacies identification is taken as an example of Taiyuan Formation shale in Linxing area of Ordos. Annual CBM Academic Symposium in 2021, 2021-10-10]
[Zhang Y L, Wang G W, Song L T, Bao M, Huang Y Y, Lai J, Wang S, Huang L L. 2023. Logging identification method of shale lithofacies: a study of Fengcheng Formation in Mahu Sag, Junggar Basin. Progress in Geophysics, 38(1): 393-408]
[Zhao X Z, Zhou L H, Pu X G, Jin F M, Shi Z N, Xiao D Q, Han W Z, Jiang W Y, Zhang W, Wang H. 2019. Favorable formation conditions and enrichment characteristics of lacustrine facies shale oil in faulted lake basin: a case study of Member 2 of Kongdian Formation in Cangdong sag, Bohai Bay Basin. Acta Petrolei Sinica, 40(9): 1013-1029]
Chen T Q, Guestrin C. 2016. XGBoost: a scalable tree boosting system. The ACM SIGKDD International Conference. DOI: 10.1145/2939672.2939785.
He J H, Ding W L, Jiang Z X, Li A, Wang R Y, Sun Y X. 2016. Logging identification and characteristic analysis of the lacustrine organic-rich shale lithofacies: a case study from the Es3L shale in the Jiyang Depression, Bohai Bay Basin, Eastern China. Journal of Petroleum Science & Engineering, 145(1): 238-255.
Pang Q, Hu G, Hu C W, Meng F S, Wang B Z, Zhang J Y. 2024. The lithofacies of sandstones interbedded with shales: implication for organic matter accumulation of Triassic deep lacustrine setting, Southern Ordos Basin. ACS Omega, 9(22): 23266-23282.
Peng Y X, Guo S B. 2023. Lithofacies analysis and paleosedimentary evolution of Taiyuan Formation in Southern North China Basin. Journal of Petroleum Science & Engineering, 220: 111127.
Su K, Yuan X, Huang Y K, Yuan Q, Yang M H, Sun J W, Li S Y, Long X Y, Liu L, Li T W, Yuan Z Q. 2023. Improved prediction of knee osteoarthritis by the machine learning model XGBoost. Indian Journal of Orthopaedics, 57(10): 1667-1677.
Sun B, Liu X P, Liu J, Wang G C, Shu H L, Luo Y F, Liu T, Hua Z X. 2022. The heterogeneity of lithofacies types, combination modes, and sedimentary model of lacustrine shale restricted by high-frequency sequence. Geological Journal, 57(10): 1.
Wang D, Zhang Y N. 2024. Coupling of SME innovation and innovation in regional economic prosperity with machine learning and IoT technologies using XGBoost algorithm. Soft Computing, 28(4): 2919-2939.
Xue C Q, McBeck J A, Lu H J, Yan C H, Zhong J H, Wu J G, Renard F. 2024. Classification of shale lithofacies with minimal data: application to the early Permian shales in the Ordos Basin, China. Journal of Asian Earth Sciences, 259: 105901.
Zheng D Y, Hou M C, Chen A Q, Zhong H T, Qi Z, Ren Q, You J C, Wang H Y, Ma C. 2022. Application of machine learning in the identification of fluvial-lacustrine lithofacies from well logs: a case study from Sichuan Basin, China. Journal of Petroleum Science and Engineering, 215: 110610.
Zhu X, Chu J, Wang K D, Wu S F, Yan W, Chiam K. 2021. Prediction of rockhead using a hybrid N-XGBoost machine learning framework. Journal of Rock Mechanics and Geotechnical Engineering, 13(6): 1231-1245.