A fast cleaning method for time-series data based on dynamic fusion of local outlier factors

Hao Fuzhong, Yang Yufang, Ji Zhe, Zhang Jing, Wang Junyi

自动化技术与应用 (Techniques of Automation and Applications), 2026, Vol. 45, Issue 6: 135-139.

DOI: 10.20033/j.1003-7241.(2026)06-0135-05
Section: Computer Communication Technology

Abstract

To address the incomplete cleaning and low efficiency of existing data cleaning methods, a fast cleaning method for time-series data based on dynamically fused local outlier factors (LOF) is proposed. The LOF of each data point is computed under several different values of k. The resulting scores are normalized to remove the effect of differing scales and combined by ratio weighting into a comprehensive LOF, and a dynamic fusion strategy is used to evaluate each point's degree of abnormality so that anomalous data can be identified accurately. The cuckoo search algorithm is used to optimize the initial cluster centers of the K-means clustering algorithm, which avoids the tendency of conventional K-means to fall into local optima. Deleting the anomalous data yields a dataset with missing values. The optimized K-means algorithm then clusters the data, the cluster containing each missing position is located, and the mean of the data in that cluster is taken as an estimate of the original true value and substituted for the anomalous value, completing data repair and cleaning. Test results show that the method achieves a higher cleaning comprehensiveness index, a smaller mean squared error of the cleaned data, and a lower cleaning time cost, indicating that it can clean anomalous data in time series quickly, comprehensively, and accurately.
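
The detection step described in the abstract can be illustrated with a minimal sketch. It computes the standard local outlier factor for several k values, min-max normalizes each set of scores, and fuses them with weights. The abstract does not give the paper's ratio-weighting formula, its dynamic fusion rule, or its decision threshold, so the weights used here (each k's share of the total raw score range), the default `ks`, the `threshold` of 0.5, and all function names are assumptions made only for illustration. The series is treated as a plain list of univariate values.

```python
def lof_scores(values, k):
    """Standard local outlier factor of each value in a 1-D list, for one k."""
    n = len(values)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(values)")
    neighbours, k_dist = [], []
    for i in range(n):
        dists = sorted((abs(values[i] - values[j]), j) for j in range(n) if j != i)
        kd = dists[k - 1][0]  # k-distance of point i
        k_dist.append(kd)
        # k-neighbourhood of i; points tied at the k-distance are included
        neighbours.append([j for d, j in dists if d <= kd])
    eps = 1e-12  # guards against division by zero when values repeat
    lrd = []  # local reachability density
    for i in range(n):
        reach = [max(k_dist[j], abs(values[i] - values[j])) for j in neighbours[i]]
        lrd.append(1.0 / (sum(reach) / len(reach) + eps))
    # LOF = mean ratio of the neighbours' density to the point's own density
    return [sum(lrd[j] for j in neighbours[i]) / (len(neighbours[i]) * lrd[i])
            for i in range(n)]


def min_max(scores):
    """Rescale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def fused_lof(values, ks=(2, 3, 4)):
    """Weighted sum of normalized LOF scores over several k values."""
    raw = [lof_scores(values, k) for k in ks]
    # ASSUMPTION: weight of each k = its raw score range / total range.
    # This stands in for the paper's ratio weighting, which is not specified.
    spreads = [max(r) - min(r) for r in raw]
    total = sum(spreads)
    if total > 0:
        weights = [s / total for s in spreads]
    else:
        weights = [1.0 / len(ks)] * len(ks)
    normed = [min_max(r) for r in raw]
    return [sum(w * col[i] for w, col in zip(weights, normed))
            for i in range(len(values))]


def flag_anomalies(values, ks=(2, 3, 4), threshold=0.5):
    """Indices whose fused score exceeds the (assumed) threshold."""
    fused = fused_lof(values, ks)
    return [i for i, s in enumerate(fused) if s > threshold]


series = [10.0, 10.2, 9.9, 10.1, 25.0, 10.3, 9.8, 10.0]
print(flag_anomalies(series))
```

Because every k's scores are rescaled to the same range before fusion, a point has to score high under several neighbourhood sizes to receive a high fused score, which is the motivation the abstract gives for using more than one k. The pairwise distance computation is quadratic in the series length, so this sketch is only suitable for short series.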

Keywords

dynamically fused local outlier factor / time-series data / improved K-means clustering algorithm / fast cleaning method / anomaly detection / parameter optimization

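Cite this article

Hao Fuzhong, Yang Yufang, Ji Zhe, Zhang Jing, Wang Junyi. A fast cleaning method for time-series data based on dynamic fusion of local outlier factors[J]. 自动化技术与应用 (Techniques of Automation and Applications), 2026, 45(6): 135-139. DOI: 10.20033/j.1003-7241.(2026)06-0135-05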
