Pre-trained Spatio-temporal Attention Model for Traffic Flow Prediction

Tong Xudong (童旭东), Zhou Qiang (周强), Gu Jingjing (顾晶晶), Shi Guoliang (史国梁), Cui Hongfei (崔鸿飞)

Journal of Chinese Computer Systems (小型微型计算机系统), 2026, Vol. 47, Issue 5: 1099-1107. DOI: 10.20009/j.cnki.21-1106/TP.2025-0155

Algorithm Theory and Artificial Intelligence

Pre-trained Spatio-temporal Attention Model for Traffic Flow Prediction

Abstract

Accurate traffic flow prediction is key to intelligent transportation systems and smart city development. Traffic flow data exhibit complex spatio-temporal dependencies, with multi-scale feature variations and dynamic evolutionary patterns. Existing methods are limited in capturing these multi-scale spatio-temporal features and therefore fail to fully exploit the rich spatio-temporal correlations in traffic data. Pre-trained Large Language Models (PLMs) have shown considerable potential in feature representation learning; however, because their pre-training data come predominantly from natural language, a significant domain gap separates them from traffic flow data, which limits how well they work when applied directly. To address these problems, this paper proposes PSTAM, a Pre-trained Spatio-Temporal Attention Model for Traffic Flow Prediction. First, a dual-pathway activation mechanism resolves the domain gap and enables feature alignment and fusion. Second, a pre-training strategy models multi-scale spatio-temporal features in depth, strengthening the model's ability to capture complex spatio-temporal dependencies. Experimental results show that PSTAM outperforms existing methods on multiple benchmark traffic datasets, providing reliable support for real-time decision-making in intelligent transportation systems.

Key words

traffic flow prediction / pre-trained large language models / dual activation mechanism / multi-scale / domain adaptation

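Cite this article

Tong Xudong, Zhou Qiang, Gu Jingjing, Shi Guoliang, Cui Hongfei. Pre-trained Spatio-temporal Attention Model for Traffic Flow Prediction[J]. Journal of Chinese Computer Systems, 2026, 47(5): 1099-1107. DOI: 10.20009/j.cnki.21-1106/TP.2025-0155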

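Funding

National Natural Science Foundation of China, General Program (62072235)

Natural Science Foundation of Jiangsu Province, Youth Program (BK20241402)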
