Key Laboratory for Advanced Process Control of Light Industry of Ministry of Education, Institute of Automation, Jiangnan University, Wuxi 214122, China
Objective: Kalman filtering (KF) is a widely used state estimation algorithm that plays a crucial role in estimating system state variables. High-precision Kalman state estimation requires accurate knowledge of the model parameters and the noise statistics; otherwise, estimation performance degrades significantly and the filter can diverge. In practical applications, however, many system model parameters and noise statistics are unknown or inaccurate. Therefore, a Q-learning-based Kalman filtering (QL‒KF) algorithm is proposed that learns and estimates simultaneously when the model parameters and noise statistics are unknown.

Methods: The Q-learning policy iteration algorithm, which consists of policy improvement and policy evaluation, was employed to address the issue of unknown model information. In the policy improvement stage, a state-action value function (Q function) that evaluates the estimated state value was defined. A formula transformation was then applied so that the estimated value depends only on observed values, eliminating the need for model parameters. In addition, two adjustable weight matrices were introduced to calculate the Kalman gain, avoiding reliance on the statistics of the system noise. An estimation policy for obtaining system state estimates was then derived from the Q function. In the policy evaluation stage, the estimation of the Q function was transformed into the estimation of its information matrix, and the recursive least squares algorithm was applied to identify that matrix. Based on the identified information matrix, the estimation policy was followed to execute the corresponding actions and update the estimated values of the state variables. Finally, the proposed algorithm was applied to estimate the state of a two-state polynomial system and the water level of a quadruple water tank system to verify its effectiveness and feasibility. The proposed algorithm was also compared to a joint state and parameter estimation algorithm.

Results and Discussions: The estimation performance of the QL‒KF algorithm was analyzed under unknown model parameters and noise statistics. A Monte Carlo experiment with 50 runs was conducted to enhance the credibility of the simulation, and uncertainty was introduced into each parameter to verify the robustness of the proposed algorithm. The root mean square error (RMSE) and the average RMSE (ARMSE) were used as performance metrics. For the two-state polynomial system with Gaussian process and measurement noise, the RMSE of the QL‒KF algorithm exhibited a strong convergence trend, demonstrating the effectiveness of the algorithm. Because the initial estimates were randomly assigned and the Q-learning algorithm required some data accumulation, the initial RMSE was slightly larger and fluctuated, but it decreased as the number of iterations increased and gradually stabilized. When the model parameters were known, the RMSE of the standard KF algorithm was low and very stable. When the model parameters were unknown to both algorithms, however, the proposed QL‒KF algorithm achieved significantly better estimation accuracy than the standard KF algorithm, demonstrating stronger robustness. Compared to the EVIU algorithm (a joint state and parameter estimation algorithm), the RMSE of the QL‒KF algorithm was smaller and more stable after convergence, with an average ARMSE reduction of 34.66%, indicating higher estimation accuracy and stronger robustness under parameter uncertainties. The algorithm also required less computation, reducing the average running time by 44.38%, and thus exhibited high real-time performance. With non-Gaussian process and measurement noise, the RMSE of the QL‒KF algorithm still exhibited a convergence trend, confirming the algorithm's effectiveness. When the system parameters were unknown, the estimation error of the proposed algorithm was lower than that of the KF algorithm and similar to that of the EVIU algorithm, while its running time was 45.03% shorter than that of the EVIU algorithm. Compared to the Gaussian noise case, however, the estimation error of the QL‒KF algorithm increased, indicating that the type of noise affects the estimation accuracy of the proposed algorithm. For the quadruple water tank system, the RMSE of the QL‒KF algorithm for both state components showed favorable trends, demonstrating the effectiveness of the algorithm. Compared to the EVIU algorithm, the proposed algorithm exhibited stronger robustness under parameter uncertainties, with smaller estimation errors, an average ARMSE reduction of 79.93%, and a decrease in running time of 47.78%, indicating good real-time performance.

Conclusions: The findings indicate that the proposed QL‒KF algorithm can estimate the internal state of a system from observations alone, without identifying system parameters, when the model parameters and noise statistics are unknown. The estimation accuracy of the algorithm is influenced by the type of system noise. For Gaussian noise systems, the algorithm demonstrates high estimation accuracy, robust performance, and strong real-time capability. For non-Gaussian noise systems, however, the estimation accuracy decreases. Future work will focus on further improving estimation accuracy.
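The two performance metrics named in the abstract can be sketched as follows. The abstract does not give their formulas, so the definitions here are assumptions: the RMSE at each time step is taken over the Monte Carlo runs using the Euclidean norm of the state error, and the ARMSE is the time average of that RMSE. The function names `rmse_per_step` and `armse` and the array layout are hypothetical.

```python
import numpy as np

def rmse_per_step(x_true, x_est):
    """RMSE at each time step, taken across Monte Carlo runs.

    x_true, x_est: arrays of shape (runs, steps, state_dim).
    Assumed definition: sqrt(mean over runs of ||x_est - x_true||^2).
    """
    err = np.asarray(x_est, dtype=float) - np.asarray(x_true, dtype=float)
    sq_norm = np.sum(err ** 2, axis=2)        # squared error norm per run and step
    return np.sqrt(np.mean(sq_norm, axis=0))  # average over runs, then take the root

def armse(x_true, x_est):
    """Average RMSE: the per-step RMSE averaged over all time steps (assumed)."""
    return float(np.mean(rmse_per_step(x_true, x_est)))

# Synthetic data only: 50 runs, matching the number of runs quoted in the abstract.
rng = np.random.default_rng(0)
runs, steps, dim = 50, 200, 2
x_true = rng.standard_normal((runs, steps, dim))
x_est = x_true + 0.1 * rng.standard_normal((runs, steps, dim))
print(rmse_per_step(x_true, x_est).shape, armse(x_true, x_est))
```

A per-component RMSE, as reported for the water tank system, would follow from the same code by slicing one state component before calling the functions.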
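There are two main approaches to state estimation when model information is unknown. The first identifies the model parameters and then estimates the internal state [15‒16]. Recursive identification [17‒18] and iterative identification [19‒20] are two important parameter identification methods. In recent work, negative gradient search and the key term separation technique were applied to the parameter identification of Hammerstein output-error systems, yielding a key-term-separation-based auxiliary model recursive gradient algorithm [21]. Ding et al. [22] proposed a filtered auxiliary model hierarchical generalized extended stochastic gradient identification algorithm and a filtered auxiliary model hierarchical multi-innovation generalized extended stochastic gradient identification algorithm to identify the parameters of Box‒Jenkins systems. Yang et al. [23] proposed a hierarchical gradient-based iterative algorithm for the parameter identification of nonlinear feedback systems to improve identification accuracy. Once accurate parameters have been obtained by an identification algorithm, they can be combined with a state estimation algorithm to estimate the internal state of the system. The second approach is the joint estimation of model parameters and states [24‒25]. In recent work, Aslan et al. [26] proposed a maximum-likelihood-based particle smoother expectation maximization algorithm to jointly estimate the states and parameters of the hemodynamic model. Abolhasani et al. [27] combined the system states and unknown parameters into an augmented state and used robust regularized least squares to handle uncertainty, proposing an augmented-state robust regularized least-squares filter; however, if the model and measurement uncertainties are large, the algorithm can be highly conservative. To address this, Marcos et al. [28] adopted a stochastic viewpoint to reduce conservatism and proposed a Kalman-like filter based on the estimation variation increases uncertainty (EVIU) criterion, called the EVIU filter, which treats the unknown parameters as state increments and performs state estimation when the model parameters are unknown. However, none of the above studies considers the case in which the statistics of the process noise and measurement noise are unknown, and estimating parameters and states simultaneously imposes a heavy computational burden. Time-varying, unknown noise has a significant impact on estimation accuracy. For the case of unknown noise statistics, Wang et al. [29] recently proposed an adaptive sliding-window outlier-robust KF algorithm based on the Pearson type Ⅶ distribution to achieve joint estimation. However, that algorithm only considers the case in which some of the parameters are uncertain, and it does not resolve the problems of high computational complexity and long running time.
To address the problems of existing algorithms, a Q-learning-based Kalman filtering (QL‒KF) state estimation algorithm is proposed. The model-free nature of Q-learning is used to handle unknown model parameters: first, a Q function is defined to evaluate the state estimate; next, an estimation policy for obtaining the state estimate is derived from the Q function; then, the recursive least squares method is used to identify the information matrix of the Q function; finally, based on the identification result, the estimation policy is followed to update the state estimate, achieving state estimation when the model parameters are unknown.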
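For context, the dependence on model parameters and noise statistics discussed above is visible in the textbook linear Kalman filter recursion, sketched below. This is the standard filter, not any of the cited algorithms. The symbols `A`, `C`, `Q`, and `R` (state transition matrix, observation matrix, process noise covariance, and measurement noise covariance) and the function name `kf_step` are notation assumed for illustration.

```python
import numpy as np

def kf_step(x, P, y, A, C, Q, R):
    """One predict/update cycle of the standard linear Kalman filter.

    Every argument other than the measurement y must be known in advance,
    which is the requirement that fails when the model is uncertain.
    """
    # Time update: propagate the estimate and its covariance through the model.
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q
    # Measurement update: the gain depends on A, C, Q, and R through P_pred and S.
    S = C @ P_pred @ C.T + R
    K = P_pred @ C.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (y - C @ x_pred)
    P_new = (np.eye(len(x)) - K @ C) @ P_pred
    return x_new, P_new, K

# Scalar illustration with made-up numbers.
A = np.array([[1.0]]); C = np.array([[1.0]])
Q = np.array([[0.0]]); R = np.array([[1.0]])
x, P, K = kf_step(np.array([0.0]), np.array([[1.0]]), np.array([2.0]), A, C, Q, R)
print(x, P, K)
```

If `A`, `C`, `Q`, or `R` is wrong, the gain `K` is wrong as well, which is the failure mode that identification-based and joint-estimation methods try to avoid.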
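The identification step can be illustrated with a minimal sketch, assuming the Q function is a quadratic form `z @ H @ z` in some stacked data vector `z` with a symmetric information matrix `H`. The excerpt does not specify how `z` or the regression targets are constructed, so the vector `z`, the synthetic targets, and the helper names `quad_features`, `theta_to_matrix`, and `rls_update` are all hypothetical. In an actual Q-learning scheme the targets would come from the policy evaluation equation, not from a known `H`.

```python
import numpy as np

def quad_features(z):
    """Regressor phi(z) with z @ H @ z == phi(z) @ theta for symmetric H,
    where theta stacks the upper-triangular entries of H."""
    rows, cols = np.triu_indices(len(z))
    scale = np.where(rows == cols, 1.0, 2.0)  # off-diagonal products appear twice
    return scale * np.outer(z, z)[rows, cols]

def theta_to_matrix(theta, n):
    """Rebuild the symmetric matrix H from its upper-triangular entries."""
    H = np.zeros((n, n))
    H[np.triu_indices(n)] = theta
    return H + np.triu(H, 1).T

def rls_update(theta, P, phi, target, lam=1.0):
    """One recursive least squares step with forgetting factor lam."""
    gain = P @ phi / (lam + phi @ P @ phi)
    theta = theta + gain * (target - phi @ theta)
    P = (P - np.outer(gain, phi @ P)) / lam
    return theta, P

# Synthetic demonstration: recover a known symmetric matrix from quadratic data.
rng = np.random.default_rng(0)
H_true = np.array([[2.0, 0.5, 0.0],
                   [0.5, 1.0, -0.3],
                   [0.0, -0.3, 1.5]])
n = H_true.shape[0]
m = n * (n + 1) // 2                 # number of free entries in a symmetric matrix
theta, P = np.zeros(m), 1e6 * np.eye(m)
for _ in range(200):
    z = rng.standard_normal(n)       # hypothetical stacked data vector
    theta, P = rls_update(theta, P, quad_features(z), z @ H_true @ z)
H_hat = theta_to_matrix(theta, n)
print(np.round(H_hat, 3))
```

Parametrizing only the upper triangle keeps the regression at n(n+1)/2 unknowns and makes the identified matrix symmetric by construction.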
Kordestani M, Safavi A A, Saif M. Recent survey of large-scale systems: Architectures, controller strategies, and industrial applications[J]. IEEE Systems Journal, 2021, 15(4): 5440‒5453. doi:10.1109/jsyst.2020.3048951
[2]
Tabacek J, Havlena V. Reduction of prediction error sensitivity to parameters in Kalman filter[J]. Journal of the Franklin Institute, 2022, 359(3): 1303‒1326. doi:10.1016/j.jfranklin.2021.12.019
[3]
Chen Yifan, Han Haiqian, Zhang Yi, et al. Dynamic inversion of hydrodynamic parameters of plain river network[J]. Advanced Engineering Sciences, 2019, 51(2): 13‒20.
Li Qinwen, Wang Zhiqian, Wang Wenrui, et al. A model predictive obstacle avoidance method based on dynamic motion primitives and a Kalman filter[J]. Asian Journal of Control, 2023, 25(2): 1510‒1525. doi:10.1002/asjc.2946
[6]
Miao Kelei, Zhang Wenan, Qiu Xiang. An adaptive unscented Kalman filter approach to secure state estimation for wireless sensor networks[J]. Asian Journal of Control, 2023, 25(1): 629‒636. doi:10.1002/asjc.2783
[7]
Park G. Optimal vehicle position estimation using adaptive unscented Kalman filter based on sensor fusion[J]. Mechatronics, 2024, 99: 103144. doi:10.1016/j.mechatronics.2024.103144
Zeng Jingcong, Shi Yuanfeng, Dai Kaoshan, et al. Real-time structural displacement estimation by fusing acceleration and displacement data with adaptive Kalman filter[J]. Advanced Engineering Sciences, 2023, 55(4): 188‒196.
Tripathi R P, Singh A K, Gangwar P. Innovation-based fractional order adaptive Kalman filter[J]. Journal of Electrical Engineering, 2020, 71(1): 60‒64. doi:10.2478/jee-2020-0009
[16]
Jiang Liuyang, Zhang Hai. Redundant measurement-based second order mutual difference adaptive Kalman filter[J]. Automatica, 2019, 100: 396‒402. doi:10.1016/j.automatica.2018.11.037
[17]
Liu Tong, Zhang Zengjie, Liu Fangzhou, et al. Adaptive observer for a class of systems with switched unknown parameters using DREM[J]. IEEE Transactions on Automatic Control, 2024, 69(4): 2445‒2452. doi:10.1109/tac.2023.3309228
[18]
Zhang Xianku, Zhao Baigang, Zhang Guoqing. Improved parameter identification algorithm for ship model based on nonlinear innovation decorated by sigmoid function[J]. Transportation Safety and Environment, 2021, 3(2): 114‒122. doi:10.1093/tse/tdab006
[19]
Zhao Baigang, Zhang Xianku. An improved nonlinear innovation-based parameter identification algorithm for ship models[J]. Journal of Navigation, 2021, 74(3): 549‒557. doi:10.1017/s0373463321000102
[20]
Shi Zhenwei, Yang Haodong, Dai Mei. The data-filtering based bias compensation recursive least squares identification for multi-input single-output systems with colored noises[J]. Journal of the Franklin Institute, 2023, 360(7): 4753‒4783. doi:10.1016/j.jfranklin.2023.01.040
[21]
Xu Huan, Ding Feng, Yang Erfu. Modeling a nonlinear process using the exponential autoregressive time series model[J]. Nonlinear Dynamics, 2019, 95(3): 2079‒2092. doi:10.1007/s11071-018-4677-0
[22]
Ge Zhengwei, Ding Feng, Xu Ling, et al. Gradient-based iterative identification method for multivariate equation-error autoregressive moving average systems using the decomposition technique[J]. Journal of the Franklin Institute, 2019, 356(3): 1658‒1676. doi:10.1016/j.jfranklin.2018.12.002
[23]
You Junyao, Liu Yanjun, Chen Jing, et al. Iterative identification for multiple-input systems with time-delays based on greedy pursuit and auxiliary model[J]. Journal of the Franklin Institute, 2019, 356(11): 5819‒5833. doi:10.1016/j.jfranklin.2019.03.018
[24]
Lv Lei, Sun Wei, Pan Jian. Two-stage and three-stage recursive gradient identification of Hammerstein nonlinear systems based on the key term separation[J]. International Journal of Robust and Nonlinear Control, 2024, 34(2): 829‒848. doi:10.1002/rnc.7007
[25]
Ding Feng, Xu Ling, Zhang Xiao, et al. Recursive identification methods for general stochastic systems with colored noises by using the hierarchical identification principle and the filtering identification idea[J]. Annual Reviews in Control, 2024, 57: 100942. doi:10.1016/j.arcontrol.2024.100942
[26]
Yang Dan, Liu Yanjun, Ding Feng, et al. Hierarchical gradient-based iterative parameter estimation algorithms for a nonlinear feedback system based on the hierarchical identification principle[J]. Circuits, Systems, and Signal Processing, 2024, 43(1): 124‒151. doi:10.1007/s00034-023-02477-1
[27]
Patil P V, Vachhani L, Ravitharan S, et al. Sequential state and unknown parameter estimation strategy and its application to a sensor fusion problem[J]. IEEE Sensors Journal, 2022, 22(21): 20665‒20675. doi:10.1109/jsen.2022.3199214
[28]
Molaei A, Nikoofard A, Sedigh A K, et al. Parameter and state estimation of managed pressure drilling system using the optimization-based supervisory framework[J]. IEEE Transactions on Control Systems Technology, 2023, 31(6): 2937‒2944. doi:10.1109/tcst.2023.3273192
[29]
Aslan S, Cemgil A T, Akın A. Joint state and parameter estimation of the hemodynamic model by particle smoother expectation maximization method[J]. Journal of Neural Engineering, 2016, 13(4): 046010. doi:10.1088/1741-2560/13/4/046010
[30]
Abolhasani M, Rahmani M. Robust deterministic least-squares filtering for uncertain time-varying nonlinear systems with unknown inputs[J]. Systems & Control Letters, 2018, 122: 1‒11. doi:10.1016/j.sysconle.2018.09.005
[31]
Fernandes M R, do Val J B R, Souto R F. Robust estimation and filtering for poorly known models[J]. IEEE Control Systems Letters, 2020, 4(2): 474‒479. doi:10.1109/lcsys.2019.2951611
[32]
Wang Ke, Wu Panlong, Li Xingxiu, et al. An adaptive outlier-robust Kalman filter based on sliding window and Pearson type Ⅶ distribution modeling[J]. Signal Processing, 2024, 216: 109306. doi:10.1016/j.sigpro.2023.109306
[33]
Hua Jinxing, Liu Ruirui, Hao Fei. Two-channel false data injection attacks on multi-sensor remote state estimation[J]. Asian Journal of Control, 2023, 25(5): 3776‒3791. doi:10.1002/asjc.3067
[34]
Ge Quanbo, Ma Zhongcheng, Li Jinglan, et al. Adaptive cubature Kalman filter with the estimation of correlation between multiplicative noise and additive measurement noise[J]. Chinese Journal of Aeronautics, 2022, 35(5): 40‒52. doi:10.1016/j.cja.2021.05.004
[35]
Xue Wei, Luan Xiaoli, Zhao Shunyi, et al. An online performance index for the Kalman filter[J]. IEEE Transactions on Instrumentation and Measurement, 2022, 71: 1007912. doi:10.1109/tim.2022.3212114
[36]
Zhang Tengfei, Jia Yingmin. Input-constrained optimal output synchronization of heterogeneous multiagent systems via observer-based model-free reinforcement learning[J]. Asian Journal of Control, 2024, 26(1): 98‒113. doi:10.1002/asjc.3183
[37]
Zhao Shunyi, Shmaliy Y S, Ahn C K, et al. Self-tuning unbiased finite impulse response filtering algorithm for processes with unknown measurement noise covariance[J]. IEEE Transactions on Control Systems Technology, 2021, 29(3): 1372‒1379. doi:10.1109/tcst.2020.2991609