In network traffic detection, differences between the data distributions of the source and target domains limit a model's generalization ability and degrade detection performance. To address this problem, a non-parametric traffic detection method with enhanced domain adaptation ability is proposed. Specifically, during model fine-tuning, a dictionary is constructed that pairs the deep representation of each training sample with its label. During detection, the dictionary is queried for the k samples most similar to the sample under test, and the probability distribution derived from these neighbors is used to correct the prediction of the parameterized detection model. The approach yields stable and robust detection results without enlarging the training set or adding model training overhead, and it is highly interpretable. Extensive experiments in cross-domain scenarios with imbalanced distributions (e.g., on the USTC-TFC2016 dataset) show that the method significantly improves the model's adaptation ability and detection performance; its effectiveness and robustness are further verified under different new-domain data distributions.
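The dictionary construction and neighbor-based correction described above can be sketched as follows. This is a minimal illustration, not the method's actual implementation: the squared Euclidean distance, the softmax weighting with `temperature`, and the linear interpolation weight `lam` are all assumptions made for the example, as are the function names.

```python
import numpy as np

def build_datastore(features, labels):
    """Pair each deep representation (row of `features`) with its label."""
    return np.asarray(features, dtype=float), np.asarray(labels, dtype=int)

def knn_distribution(query, keys, values, num_classes, k=3, temperature=1.0):
    """Class distribution induced by the k stored samples nearest to `query`.

    Assumption: neighbors are weighted by a softmax over negative squared
    Euclidean distance; the source does not specify the weighting.
    """
    d2 = np.sum((keys - query) ** 2, axis=1)
    nearest = np.argsort(d2)[:k]
    logits = -d2[nearest] / temperature
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    p_knn = np.zeros(num_classes)
    np.add.at(p_knn, values[nearest], weights)  # accumulate weight per label
    return p_knn

def corrected_prediction(p_model, p_knn, lam=0.5):
    """Assumed correction rule: linear interpolation of the two distributions."""
    return lam * p_knn + (1.0 - lam) * np.asarray(p_model, dtype=float)

# Toy example: two benign (label 0) and two malicious (label 1) representations.
keys, values = build_datastore([[0, 0], [0, 1], [5, 5], [5, 6]], [0, 0, 1, 1])
p_knn = knn_distribution(np.array([5.0, 5.2]), keys, values, num_classes=2, k=2)
p_final = corrected_prediction([0.6, 0.4], p_knn, lam=0.5)
print(p_knn, p_final, int(np.argmax(p_final)))
```

In this toy case the parametric model leans toward class 0, while both nearest neighbors carry label 1, so the interpolated distribution flips the decision to class 1. The neighbors that caused the flip can be inspected directly, which is one way to read the interpretability claim.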
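One line of existing domain adaptation methods for network traffic detection is based on distribution alignment: by optimizing the model structure or the training strategy, these methods extract domain-invariant feature representations and narrow the distribution gap between the source and target domains. Yang et al. proposed a training strategy that combines cross-domain bootstrapping with co-supervision among multiple learners, using source-domain knowledge to improve the model's adaptability and generalization in the target domain [16], but it faces challenges in computational resources and efficiency on large-scale data. Ning et al. first pre-trained a model on the source domain and then transferred the feature extractor to the target domain for fine-tuning; during fine-tuning, a Maximum Mean Discrepancy (MMD) loss minimizes the inter-domain distribution difference so that the model learns domain-invariant feature representations, after which a new classifier is trained on labeled target-domain data to classify traffic [17]. Model performance may nevertheless degrade on extremely imbalanced datasets. Taghiyarrenani et al. used the MMD metric to minimize the distribution difference of the applications shared by the source and target domains, while simultaneously minimizing the distance between samples of the same application and maximizing the distance between samples of different applications, thereby achieving cross-domain traffic classification when non-shared applications exist [18]. However, MMD aligns only the mean difference between distributions; when the source and target distributions differ markedly, its corrective ability is limited, which weakens domain adaptation performance.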
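The limitation noted above can be illustrated with a small sketch. It assumes a linear kernel, under which the squared MMD reduces to the squared distance between the sample means of the two domains; the cited works may use other kernels, and the toy data and the name `linear_mmd2` are hypothetical.

```python
import numpy as np

def linear_mmd2(x, y):
    """Squared MMD under a linear kernel: squared distance between sample means."""
    diff = x.mean(axis=0) - y.mean(axis=0)
    return float(diff @ diff)

# Same mean, very different spread: the linear-kernel MMD cannot tell them apart.
source = np.array([[-1.0, 0.0], [1.0, 0.0]])
target = np.array([[-10.0, 0.0], [10.0, 0.0]])

# Shifting the target by (3, 4) changes the mean, which the MMD does detect.
shifted = target + np.array([3.0, 4.0])

print(linear_mmd2(source, target))   # 0.0
print(linear_mmd2(source, shifted))  # 25.0
```

Under this assumption, a loss that drives the value to zero matches the domain means but leaves differences in spread uncorrected.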
SREY P, ZHANG Y H, KANAMORI T. Open-world learning under dataset shift[C]∥Proceedings of the 2024 IEEE Conference on Artificial Intelligence. Piscataway, USA: IEEE, 2024:1040-1042.
[3]
LIANG J, HU D P, FENG J S. Domain adaptation with auxiliary target domain-oriented classifier[C]∥Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway, USA: IEEE, 2021:16632-16642.
[4]
LIANG J, HE R, TAN T N. A comprehensive survey on test-time adaptation under distribution shifts[J]. International Journal of Computer Vision, 2025,133(1):31-64.
[5]
NATH KUNDU J, VENKAT N, RAHUL M V, et al. Universal source-free domain adaptation[C]∥Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway, USA: IEEE, 2020:4543-4552.
[6]
LI R, JIAO Q F, CAO W M, et al. Model adaptation: unsupervised domain adaptation without source data[C]∥Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway, USA: IEEE, 2020:9638-9647.
[7]
ZHOU K Y, LIU Z W, QIAO Y, et al. Domain generalization: a survey[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023,45(4):4396-4415.
[8]
WANG W, ZHONG Z, WANG W J, et al. Dynamically instance-guided adaptation: a backward-free approach for test-time domain adaptive semantic segmentation[C]∥Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway, USA: IEEE, 2023:24090-24099.
[9]
RUAN X Q, TANG W. Fully test-time adaptation for object detection[C]∥Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway, USA: IEEE, 2024:1038-1047.
[10]
KOUW W M, LOOG M. A review of domain adaptation without target labels[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021,43(3):766-785.
[11]
YOSINSKI J, CLUNE J, BENGIO Y, et al. How transferable are features in deep neural networks?[EB/OL]. (2014-11-06)[2025-01-22].
[12]
ZHANG M, MARKLUND H, DHAWAN N, et al. Adaptive risk minimization: learning to adapt to domain shift[C]∥Proceedings of the Neural Information Processing Systems. Cambridge, USA: NeurIPS, 2021. DOI: 10.48550/arXiv.2007.02931.
[13]
KHANDELWAL U, LEVY O, JURAFSKY D, et al. Generalization through memorization: nearest neighbor language models[EB/OL]. (2019-11-01)[2025-01-22]. https://doi.org/10.48550/arXiv.1911.00172.
[14]
HALDER R K, UDDIN M N, UDDIN M A, et al. Enhancing k-nearest neighbor algorithm: a comprehensive review and performance analysis of modifications[J]. Journal of Big Data, 2024,11(1):No.113.
[16]
YANG L Y, GAO M F, CHEN Z Y, et al. Burn after reading: online adaptation for cross-domain streaming data[C]∥Proceedings of the Computer Vision-ECCV 2022. Cham, Switzerland: Springer, 2022:404-422.
[17]
NING J H, GUI G, WANG Y, et al. Malware traffic classification using domain adaptation and ladder network for secure industrial Internet of Things[J]. IEEE Internet of Things Journal, 2022,9(18):17058-17069.
[18]
TAGHIYARRENANI Z, FARSI H. Domain adaptation with maximum margin criterion with application to network traffic classification[C]∥Proceedings of the Machine Learning and Principles and Practice of Knowledge Discovery in Databases. Cham, Switzerland: Springer, 2023:159-169.
[19]
WEI N, YIN L H, ZHOU X M, et al. A feature enhancement-based model for the malicious traffic detection with small-scale imbalanced dataset[J]. Information Sciences, 2023,647:No.119512.
[20]
TONG V, DAO C, TRAN H A, et al. Encrypted traffic classification through deep domain adaptation network with smooth characteristic function[J]. IEEE Transactions on Network and Service Management, 2025,22(1):331-343.
[21]
HUANG Y S B, LIU D G, ZHONG Z X, et al. kNN-adapter: efficient domain adaptation for black-box language models[EB/OL]. (2023-02-23)[2025-01-22]. https://doi.org/10.48550/arXiv.2302.10879.
[22]
WANG D X, FAN K, CHEN B X, et al. Efficient cluster-based k-nearest-neighbor machine translation[C]∥Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, USA: ACL, 2022:2175-2187.
[23]
SHARAFALDIN I, HABIBI LASHKARI A, GHORBANI A A. Toward generating a new intrusion detection dataset and intrusion traffic characterization[C]∥Proceedings of the 4th International Conference on Information Systems Security and Privacy. Funchal, Portugal: SCITEPRESS-Science and Technology Publications, 2018:108-116.
[24]
WANG W, ZHU M, ZENG X W, et al. Malware traffic classification using convolutional neural network for representation learning[C]∥Proceedings of the 2017 International Conference on Information Networking. Piscataway, USA: IEEE, 2017:712-717.
[25]
DRAPER-GIL G, LASHKARI A H, MAMUN M S I, et al. Characterization of encrypted and VPN traffic using time-related features[C]∥Proceedings of the 2nd International Conference on Information Systems Security and Privacy. Rome, Italy: SCITEPRESS-Science and Technology Publications, 2016:407-414.
[26]
SUN J W, ZHANG B, LI H Y, et al. T-Sanitation: contrastive masked auto-encoder-based few-shot learning for malicious traffic detection[J]. The Journal of Supercomputing, 2025,81(5):No.727.