Most existing advanced persistent threat (APT) detection studies focus on data from only some phases of an APT attack and lack contextual correlation analysis across all attack phases. To address this issue, a multivariate time-series dataset covering all APT phases is constructed by combining host-side and network-side data, and an APT attack sequence detection method based on feature selection and a two-tower Transformer model is proposed. First, a feature optimization module selects a critical feature subset as the model input. Second, a two-tower structure captures, along the time dimension, the associations between the states at adjacent time points of an APT attack sequence and, along the feature dimension, the implicit relationships between feature variables. Finally, a gate structure connects and merges the weights of the two towers, adaptively fusing the implicit information of the APT attack sequence in the time and feature dimensions to improve detection performance. Experimental results show that the proposed method outperforms recurrent neural network (RNN), long short-term memory (LSTM), and Transformer models, achieving a detection accuracy of 95.42%.
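The gated fusion of the two towers can be illustrated with a minimal sketch. It assumes a softmax gate computed from the concatenated tower outputs, which is one common design; the abstract does not specify the gate, so the exact form used by the proposed method may differ. The function names, parameter names, and toy values below are hypothetical and serve only as an illustration.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(time_repr, feature_repr, gate_weights, gate_bias):
    """Fuse two tower outputs with a two-way softmax gate (assumed design).

    time_repr, feature_repr: output vectors of the time and feature towers.
    gate_weights: two rows, each as long as the concatenated tower outputs.
    gate_bias: two bias terms, one per tower.
    """
    concat = time_repr + feature_repr
    # One logit per tower, computed from the concatenated representation.
    logits = [
        sum(w * x for w, x in zip(row, concat)) + b
        for row, b in zip(gate_weights, gate_bias)
    ]
    g_time, g_feat = softmax(logits)
    # Scale each tower by its gate value, then concatenate the results.
    fused = [g_time * x for x in time_repr] + [g_feat * x for x in feature_repr]
    return fused, (g_time, g_feat)


# Toy example: with all-zero gate parameters both towers get equal weight.
fused, gates = gated_fusion(
    [1.0, 2.0], [3.0, 4.0],
    gate_weights=[[0.0] * 4, [0.0] * 4],
    gate_bias=[0.0, 0.0],
)
```

In a trained model the gate parameters would be learned together with the towers, so the relative weight of the time and feature dimensions adapts to the input sequence.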
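Features are ranked by importance according to the impurity decrease at each node of the multiple classification and regression trees (CART) in the random forest algorithm. The base learner of the random forest algorithm is the CART decision tree. The core of a decision tree algorithm is the use of impurity as the criterion for selecting the splitting attribute, and the CART decision tree adopts the Gini index as its measure of impurity. The Gini index is a measure of uncertainty and can be expressed as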
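In its standard form, for a node whose samples fall into $K$ classes with proportions $p_k$, the Gini index is $\mathrm{Gini}(p) = 1 - \sum_{k=1}^{K} p_k^2$; this notation is an assumption, since the paper's own symbols are not shown here. The following minimal sketch computes the Gini index of a node and the impurity decrease of a single binary split, which is the quantity accumulated per feature when ranking feature importance. The function names and the toy labels are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini index of a node: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def impurity_decrease(parent, left, right):
    """Gini decrease achieved by splitting `parent` into `left` and `right`.

    The child impurities are weighted by the share of parent samples
    that each child receives.
    """
    n = len(parent)
    weighted_children = (
        (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    )
    return gini(parent) - weighted_children


# Toy example: a split that separates the two classes perfectly.
parent = ["attack", "attack", "benign", "benign"]
decrease = impurity_decrease(parent, ["attack", "attack"], ["benign", "benign"])
```

A feature's importance in a random forest is then typically obtained by summing such decreases over all nodes that split on that feature and averaging over the trees; the weighting and normalization details vary between implementations.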