To address the alert delay of existing provenance-graph-based advanced persistent threat (APT) detection methods, a real-time detection scheme named StreamTGN is proposed, which dynamically tracks the states of system entities. First, an operation-level provenance graph carrying higher-level information is constructed. Next, the reasonableness of system activities is analyzed dynamically through the state evolution of entities during system operation. Finally, abnormal behaviors potentially related to APT attacks are detected using dynamically set anomaly thresholds. Experimental results show that StreamTGN copes effectively with the “low-and-slow” behavior of APT attacks and exhibits greater detection stability and robustness than existing approaches.
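3) Node state update. To obtain the state vector of a node after it has undergone an operation, this step is modeled as a sequence prediction task, with a gated recurrent unit (GRU) [23] as the core model. Compared with other popular recurrent neural network models such as LSTM and BiLSTM, GRU needs fewer parameters and infers more efficiently at comparable inference capability, which better fits the real-time detection requirement of StreamTGN. The state update process is given in Eq. (4):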
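For orientation, one step of the standard GRU update described in [23] can be sketched as below. This is a minimal NumPy sketch, not StreamTGN's implementation: the class name `GRUCell`, the input called `message` (assumed here to be a vector describing the operation a node has just undergone), the dimensions, and the random weight initialization are all illustrative assumptions.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class GRUCell:
    """Standard GRU cell in the formulation of Chung et al. [23].

    Hypothetical helper for illustration; weights are random, not trained.
    """

    def __init__(self, input_dim, state_dim, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(state_dim)

        def init(rows, cols):
            return rng.uniform(-scale, scale, size=(rows, cols))

        # Update gate (z), reset gate (r), and candidate state (h) parameters.
        self.W_z, self.U_z = init(state_dim, input_dim), init(state_dim, state_dim)
        self.W_r, self.U_r = init(state_dim, input_dim), init(state_dim, state_dim)
        self.W_h, self.U_h = init(state_dim, input_dim), init(state_dim, state_dim)
        self.b_z = np.zeros(state_dim)
        self.b_r = np.zeros(state_dim)
        self.b_h = np.zeros(state_dim)

    def step(self, message, prev_state):
        """Return the node state after absorbing one operation message."""
        z = sigmoid(self.W_z @ message + self.U_z @ prev_state + self.b_z)
        r = sigmoid(self.W_r @ message + self.U_r @ prev_state + self.b_r)
        candidate = np.tanh(
            self.W_h @ message + self.U_h @ (r * prev_state) + self.b_h
        )
        # Interpolate between the previous state and the candidate state.
        return (1.0 - z) * prev_state + z * candidate


# Usage: feed three operation messages to one node, in time order.
cell = GRUCell(input_dim=8, state_dim=4)
state = np.zeros(4)
for message in np.random.default_rng(1).normal(size=(3, 8)):
    state = cell.step(message, state)
```

A GRU keeps three weight groups (update gate, reset gate, candidate) where an LSTM keeps four, which is the source of the smaller parameter count mentioned above.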
PASQUIER T, HAN X Y, MOYER T, et al. Runtime analysis of whole-system provenance[C]∥Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. New York, USA: ACM, 2018:1601-1616.
[3] HOSSAIN M N, MILAJERDI S M, WANG J, et al. SLEUTH: real-time attack scenario reconstruction from COTS audit data[C]∥Proceedings of the 26th USENIX Security Symposium. Berkeley, USA: USENIX Association, 2017:487-504.
[4] MILAJERDI S M, GJOMEMO R, ESHETE B, et al. HOLMES: real-time APT detection through correlation of suspicious information flows[C]∥Proceedings of the 2019 IEEE Symposium on Security and Privacy. Piscataway, USA: IEEE, 2019:1137-1152.
[5] MILAJERDI S M, ESHETE B, GJOMEMO R, et al. POIROT: aligning attack behavior with kernel audit records for cyber threat hunting[C]∥Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security. New York, USA: ACM, 2019:1795-1812.
[6] LIU F C, WEN Y, ZHANG D X, et al. Log2vec: a heterogeneous graph embedding based approach for detecting cyber threats within enterprise[C]∥Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security. New York, USA: ACM, 2019:1777-1794.
[7] MANZOOR E, MILAJERDI S M, AKOGLU L. Fast memory-efficient anomaly detection in streaming heterogeneous graphs[C]∥Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, USA: ACM, 2016:1035-1044.
[8] HAN X Y, PASQUIER T, BATES A, et al. UNICORN: runtime provenance-based detector for advanced persistent threats[DB/OL]. (2020-01-06)[2025-03-22].
[9] JIA Z A, XIONG Y, NAN Y H, et al. MAGIC: detecting advanced persistent threats via masked graph representation learning[DB/OL]. (2023-10-15)[2025-03-22].
[10] BORDES A, USUNIER N, GARCIA-DURAN A, et al. Translating embeddings for modeling multi-relational data[C]∥Proceedings of the Advances in Neural Information Processing Systems. Cambridge, USA: MIT Press, 2013:1-9.
[11] REHMAN M U, AHMADI H, HASSAN W U. FLASH: a comprehensive approach to intrusion detection via provenance graph representation learning[C]∥Proceedings of the 2024 IEEE Symposium on Security and Privacy. Piscataway, USA: IEEE, 2024:3552-3570.
[12] DONG F, WANG L, NIE X, et al. DISTDET: a cost-effective distributed cyber threat detection system[C]∥Proceedings of the 32nd USENIX Security Symposium. Berkeley, USA: USENIX Association, 2023:6575-6592.
[13] WANG Q, HASSAN W U, LI D, et al. You are what you do: hunting stealthy malware via data provenance analysis[C]∥Proceedings of the 2020 Network and Distributed System Security Symposium. Reston, USA: Internet Society, 2020. DOI: 10.14722/ndss.2020.24167.
[14] ALSAHEEL A, NAN Y, MA S, et al. ATLAS: a sequence-based learning approach for attack investigation[C]∥Proceedings of the 30th USENIX Security Symposium. Berkeley, USA: USENIX Association, 2021:3005-3022.
[15] ZENG Y J, WANG X, LIU J H, et al. SHADEWATCHER: recommendation-guided cyber threat analysis using system audit records[C]∥Proceedings of the 2022 IEEE Symposium on Security and Privacy. Piscataway, USA: IEEE, 2022:489-506.
[16] HAN X Y, YU X, PASQUIER T, et al. SIGL: securing software installations through deep graph learning[DB/OL]. (2020-08-26)[2025-03-22].
[17] WANG S, WANG Z L, ZHOU T, et al. THREATRACE: detecting and tracing host-based threats in node level through provenance graph learning[J]. IEEE Transactions on Information Forensics and Security, 2022, 17:3972-3987.
[18] GOYAL A, HAN X Y, WANG G, et al. Sometimes, you aren’t what you do: mimicry attacks against provenance graph host intrusion detection systems[C]∥Proceedings of the 2023 Network and Distributed System Security Symposium. San Diego, USA: Internet Society, 2023. DOI: 10.14722/ndss.2023.24207.
[19] LE Q, MIKOLOV T. Distributed representations of sentences and documents[C]∥Proceedings of the International Conference on Machine Learning. New York, USA: PMLR, 2014:1188-1196.
[20] HAMILTON W L, YING R, LESKOVEC J. Inductive representation learning on large graphs[DB/OL]. (2017-06-07)[2025-03-22].
[21] ROSSI E, CHAMBERLAIN B, FRASCA F, et al. Temporal graph networks for deep learning on dynamic graphs[DB/OL]. (2020-06-18)[2025-03-22].
[22] XU D, RUAN C W, KORPEOGLU E, et al. Inductive representation learning on temporal graphs[DB/OL]. (2020-02-19)[2025-03-22].
[23] CHUNG J, GULCEHRE C, CHO K, et al. Empirical evaluation of gated recurrent neural networks on sequence modeling[DB/OL]. (2014-12-11)[2025-03-22].
[24] CHENG Z J, LV Q J, LIANG J Y, et al. Kairos: practical intrusion detection and investigation using whole-system provenance[C]∥Proceedings of the 2024 IEEE Symposium on Security and Privacy. Piscataway, USA: IEEE, 2024:3533-3551.