Pre-trained Spatio-temporal Attention Model for Traffic Flow Prediction

TONG Xudong, ZHOU Qiang, GU Jingjing, SHI Guoliang, CUI Hongfei

小型微型计算机系统 (Journal of Chinese Computer Systems), 2026, Vol. 47, Issue (5): 1099-1107. DOI: 10.20009/j.cnki.21-1106/TP.2025-0155

Algorithm Theory and Artificial Intelligence

Abstract

Accurate traffic flow prediction is key to intelligent transportation systems and smart city development. Traffic flow data exhibits complex spatio-temporal dependencies, with multi-scale feature variations and dynamically evolving patterns. Existing methods are limited in capturing these multi-scale spatio-temporal features and therefore fail to fully exploit the rich spatio-temporal correlations in traffic data. Pre-trained Large Language Models (PLMs) have shown considerable potential for feature representation learning, but their pre-training data comes mainly from natural language, and this domain gap with traffic flow data limits the effectiveness of applying them directly. To address these problems, this paper proposes a Pre-trained Spatio-Temporal Attention Model for Traffic Flow Prediction (PSTAM). First, a dual-pathway activation mechanism bridges the domain gap and achieves feature alignment and fusion. Second, a pre-training strategy models multi-scale spatio-temporal features in depth, strengthening the model's ability to capture complex spatio-temporal dependencies. Experimental results show that PSTAM outperforms existing methods on multiple standard traffic datasets, providing reliable support for real-time decision-making in intelligent transportation systems.

Key words

traffic flow prediction / pre-trained large language models / dual activation mechanism / multi-scale / domain adaptation

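Cite this article

TONG Xudong, ZHOU Qiang, GU Jingjing, SHI Guoliang, CUI Hongfei. Pre-trained Spatio-temporal Attention Model for Traffic Flow Prediction[J]. 小型微型计算机系统 (Journal of Chinese Computer Systems), 2026, 47(5): 1099-1107. DOI: 10.20009/j.cnki.21-1106/TP.2025-0155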

