Network equipment updates and communication protocol upgrades cause the distribution, categories, and attributes of network data to drift unpredictably, which degrades the classification accuracy of machine learning-based network data classifiers. To address this problem, a prototype learning-based algorithm for detecting and classifying network data under class concept drift is proposed. Network data is treated as a time series, and a network with an attention mechanism extracts spatiotemporal features from it. Following the principle of prototype learning, samples are classified by their distances to the class prototypes. When class concept drift occurs, a suitable threshold is set to identify novel classes, and mean values are used to update the prototype matrix. Experimental results show that classification by prototype matching achieves higher accuracy than a traditional softmax classifier, effectively detects class concept drift when it is present, and performs better on the drifted data.
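The prototype matching, threshold-based novel class detection, and mean-based prototype update described above can be illustrated with a minimal sketch. It assumes that features have already been extracted (the attention-based network is not shown), that distance is Euclidean, and that a novel class gets its prototype from the mean of the flagged samples. The abstract does not specify these details, so the function names, the `NOVEL` label, and the threshold value are hypothetical.

```python
import numpy as np

NOVEL = -1  # hypothetical label for samples flagged as a novel class


def classify(features, prototypes, threshold):
    """Assign each feature vector to its nearest prototype, or NOVEL if too far."""
    # Pairwise Euclidean distances, shape (n_samples, n_classes).
    diffs = features[:, None, :] - prototypes[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    labels = dists.argmin(axis=1)
    # A sample farther than `threshold` from every prototype is treated as drift.
    labels[dists.min(axis=1) > threshold] = NOVEL
    return labels


def add_novel_prototype(features, labels, prototypes):
    """Append the mean of the flagged samples as a new row of the prototype matrix."""
    novel = features[labels == NOVEL]
    if len(novel) == 0:
        return prototypes
    return np.vstack([prototypes, novel.mean(axis=0)])


# Toy data: two known classes and two samples from an unseen class.
prototypes = np.array([[0.0, 0.0], [5.0, 5.0]])
features = np.array([[0.1, -0.1], [4.9, 5.2], [10.0, 0.0], [10.2, 0.2]])

labels = classify(features, prototypes, threshold=2.0)
updated = add_novel_prototype(features, labels, prototypes)
```

In this toy run the last two samples lie farther than the threshold from both prototypes, so they are flagged as novel and their mean becomes a third prototype. In practice the threshold would have to be tuned on held-out data.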