1. School of Mechanical Engineering, Shijiazhuang Tiedao University, Shijiazhuang 050043, China
2. State Key Laboratory of Mechanical Behavior in Traffic Engineering Structure and System Safety, Shijiazhuang Tiedao University, Shijiazhuang 050043, China
Article history
Received
Accepted
Published
2024-02-17
Issue Date
2026-05-13
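Summary
Fault data and labeled samples for axlebox bearings are scarce in the actual operation of high-speed trains, which severely limits progress in axlebox bearing fault diagnosis. To address this, this paper proposes ITR-Net (inception transformer and ResNet), a multi-source domain deep transfer learning method that integrates IFormer (inception transformer) with a residual network (ResNet), for fault diagnosis of high-speed train axlebox bearings. The method takes supervised labeled data from several operating conditions as the multiple source domains. First, the continuous wavelet transform is used to obtain time-frequency spectrograms of the bearings' one-dimensional vibration signals as the model input. Within ITR-Net, an IFormer network and a ResNet are built as the common feature extractor and the specific feature extractor, respectively, to fully learn the feature information of the multi-source-domain and target-domain data. At the same time, multi-kernel maximum mean discrepancy (MK-MMD), local maximum mean discrepancy (LMMD), and mean squared error (MSE) loss functions are embedded at different nodes of the transfer model, forming a new multi-source domain adaptive transfer strategy that reduces the feature distribution discrepancies among the source domains and between the source and target domains and strengthens multi-domain alignment. Finally, the method is validated experimentally on six types of bearing fault transfer learning tasks under different loads and different rotational speeds. The results show that the method can be applied effectively to transfer-learning-based bearing fault diagnosis under different operating conditions, that the accuracy of multi-source domain transfer diagnosis is significantly higher than that of single-source domain transfer, and that the method gives better transfer diagnosis results than the existing deep adaptation network (DAN), joint adaptation network (JAN), correlation alignment (CORAL) network, domain-adversarial neural network (DANN), and multiple feature spaces adaptation network (MFSAN). The results offer a new route for applying transfer learning to axlebox bearing fault diagnosis.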
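The first step of the method, turning a one-dimensional vibration signal into a time-frequency image with the continuous wavelet transform, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `morlet_cwt`, the Morlet parameter `w0 = 6`, the truncation of the wavelet at four standard deviations, and the unit-area normalisation are all assumptions made for illustration.

```python
import numpy as np

def morlet_cwt(signal, fs, freqs, w0=6.0):
    """Magnitude of a Morlet CWT, one row per centre frequency in `freqs` (Hz).

    Hypothetical helper: w0, the 4-sigma truncation and the normalisation
    are illustrative choices, not values taken from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    tfr = np.empty((len(freqs), n))
    for i, f in enumerate(freqs):
        s = w0 / (2.0 * np.pi * f)                # scale (seconds) centred on f
        half = int(np.ceil(4.0 * s * fs))         # keep +/- 4 standard deviations
        t = np.arange(-half, half + 1) / fs
        wavelet = np.exp(1j * w0 * t / s - 0.5 * (t / s) ** 2)
        wavelet /= s * fs * np.sqrt(2.0 * np.pi)  # unit-area Gaussian envelope
        full = np.convolve(signal, wavelet, mode="full")
        tfr[i] = np.abs(full[half:half + n])      # crop back to the signal's time axis
    return tfr

# Toy usage: a pure 500 Hz tone sampled at 12 kHz (made-up numbers).
fs = 12000
x = np.sin(2.0 * np.pi * 500.0 * np.arange(2048) / fs)
tfr = morlet_cwt(x, fs, freqs=np.array([250.0, 500.0, 1000.0]))
```

Each row of `tfr` is the wavelet magnitude at one centre frequency; stacking many rows and rendering the array as an image gives the kind of spectrogram the model takes as input.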
Abstract
Objective Efficiently assessing the health status of axlebox bearings in high-speed trains is crucial for reliable train operation. Current deep learning-based bearing fault diagnosis faces two significant challenges: it requires many labeled samples of actual faults, and the training and test sets must be independent and identically distributed. Transfer learning relaxes both constraints for intelligent bearing fault diagnosis by using transferable knowledge learned from existing labeled datasets to accomplish tasks on different but related unlabeled datasets. However, current transfer learning models based on a single source domain underutilize the available labeled data, lose transfer diagnosis accuracy, and risk negative transfer when the dataset distributions differ significantly. This study proposes ITR-Net (Inception Transformer and ResNet), a multi-source domain deep transfer learning method that integrates IFormer (Inception Transformer) and ResNet for fault diagnosis of high-speed train axlebox bearings.
Methods The method took supervised labeled data under various operating conditions as the multiple source domains and first obtained time-frequency spectrograms of the bearings' one-dimensional vibration signals, using the continuous wavelet transform with a Morlet wavelet basis, as the model input. The proposed network framework consisted of three main parts: the common feature extractor, the specific feature extractor, and the specific classifier. The common feature extractor adopted the IFormer network, which used the classical convolutional neural network (CNN) structure with depth-wise separable convolution (DWConv) and max pooling to capture the local information of the input data, and the multi-head self-attention (MSA) mechanism of the Transformer network to capture its global information, so the IFormer network mined more comprehensive feature information. The common feature extractor was used to extract domain-invariant features across the different source domains and the target domain. The specific feature extractor adopted the classical convolutional neural network ResNet, which efficiently extracted the feature information of the input patch while avoiding the vanishing or exploding gradients that can occur as network depth increases. The specific classifier output the classification results for the different source domains and the target domain, which allowed subsequent metrics to measure the distance between the labels predicted by different classifiers. For the transfer strategy, the study optimized the multi-kernel maximum mean discrepancy (MK-MMD) after the common feature extractor to align the overall distributions of the source and target domains; the local maximum mean discrepancy (LMMD) after the specific feature extractor to enable the model to extract fine-grained information from the input features; the cross-entropy loss (CELoss) after the specific classifier to improve classification accuracy on the source domains; and the mean squared error (MSE) loss after the specific classifier to reduce the differences between the target-domain labels predicted by different classifiers.
Results and Discussions Six multi-source domain transfer tasks were set up using datasets from the Integrated High-Speed Train Bearing Experiment Station and the Integrated Power Transmission Fault Diagnosis Experiment Station to demonstrate the effectiveness of the proposed method. A comparison of multi-source and single-source domain transfer results showed that multi-source domain transfer performed significantly better. Compared with other popular transfer learning methods, namely deep adaptation network (DAN), joint adaptation network (JAN), correlation alignment (CORAL), domain-adversarial neural network (DANN), and multiple feature spaces adaptation network (MFSAN), ITR-Net achieved an average transfer accuracy of 96.66% over the six transfer tasks, while the comparison methods achieved 87.24%, 88.30%, 92.45%, 94.11%, and 93.35%, respectively, which demonstrated the superiority of the proposed method. t-distributed stochastic neighbor embedding (t-SNE) was used to visualize the clustering of the target-domain features extracted in the six transfer tasks. The target-domain features in the proposed method clustered more distinctly by bearing fault type, and the overall clustering of unsupervised target-domain features of the same fault type improved, which supported the method's effectiveness. In the ablation experiments, the average transfer accuracies when using MK-MMD, LMMD, and MSE alone were 92.30%, 93.19%, and 93.18%, respectively; when MK-MMD and LMMD were used together, the average transfer accuracy reached 95.26%; and when the complete loss function was applied, the average accuracy reached its maximum of 98.63%. The ablation results showed that the adaptive transfer strategy built from MK-MMD, LMMD, and MSE further enhanced feature alignment among the source domains, as well as between individual source domains and the target domain, and gave the best transfer learning performance.
Conclusions The results showed that the proposed method can make full use of the data from multiple source domains, and that transfer from multiple source domains effectively improves fault diagnosis performance in the target domain. Applying the MK-MMD, LMMD, CELoss, and MSE loss functions at different stages of the network to construct the transfer strategy aligned the distributions of the source and target domains, and the ablation experiments confirmed the effect of each loss function on the transfer performance of the network model. The results provide a new approach for applying transfer learning to axlebox bearing fault diagnosis.
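Two of the loss terms named in the abstract can be sketched in a few lines of NumPy to make their roles concrete. This is only an illustration under stated assumptions: the function names `mk_mmd2` and `classifier_mse` are hypothetical, the kernels are Gaussian with fixed, equally weighted bandwidths (MK-MMD as usually formulated learns the kernel weights), and the biased estimator is used. The abstract gives neither the kernel settings nor the weighting between loss terms, so none are assumed here beyond these defaults.

```python
import numpy as np

def _mean_kernel(a, b, bandwidths):
    """Average of Gaussian kernels with several bandwidths, for all sample pairs."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.mean([np.exp(-sq_dist / (2.0 * bw ** 2)) for bw in bandwidths], axis=0)

def mk_mmd2(source, target, bandwidths=(0.5, 1.0, 2.0, 4.0)):
    """Biased estimate of the squared MMD under an equal-weight kernel mixture."""
    k_ss = _mean_kernel(source, source, bandwidths).mean()
    k_tt = _mean_kernel(target, target, bandwidths).mean()
    k_st = _mean_kernel(source, target, bandwidths).mean()
    return k_ss + k_tt - 2.0 * k_st

def classifier_mse(probabilities):
    """Mean squared difference between each pair of classifiers' target predictions.

    `probabilities` is a list of (n_samples, n_classes) arrays, one per
    source-specific classifier; at least two classifiers are required.
    """
    pairs = [(p, q) for i, p in enumerate(probabilities) for q in probabilities[i + 1:]]
    return float(np.mean([np.mean((p - q) ** 2) for p, q in pairs]))

# Toy usage with random features and hand-written predictions (made-up data).
rng = np.random.default_rng(0)
src = rng.normal(size=(64, 8))
same = rng.normal(size=(64, 8))
shifted = rng.normal(loc=2.0, size=(64, 8))
p = np.array([[0.9, 0.1], [0.2, 0.8]])
q = np.array([[0.7, 0.3], [0.2, 0.8]])
```

The discrepancy is small when the two feature sets come from the same distribution and grows when one is shifted, which is the quantity a domain-alignment loss drives down; `classifier_mse` is zero only when all classifiers agree on the target samples.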
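Based on the above analysis, this paper proposes ITR-Net (inception transformer and ResNet), a multi-source domain deep transfer learning method that integrates IFormer (inception transformer) with a residual network (ResNet), for fault diagnosis of high-speed railway axlebox bearings, and verifies the effectiveness of the method through experimental analysis. The innovations of the method are as follows.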
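The IFormer block consists of two residual connection structures in series. The ITM (inception token mixer) in the first residual structure is the core of the block; it deeply fuses Transformer and CNN operations[25], and its structure is shown in Fig. 1. ITM combines the strengths of CNNs and Transformers and can capture low-frequency and high-frequency information at the same time. First, the input features are split along the channel dimension; then, the split features are fed into the high-frequency mixer and the low-frequency mixer, respectively. The input feature map is $X \in \mathbb{R}^{s \times s \times u}$ (superscript $s$ is the side length of the feature map and superscript $u$ is its number of channels). Along the channel dimension it is split into a high-frequency feature $X_h \in \mathbb{R}^{s \times s \times u_h}$ and a low-frequency feature $X_l \in \mathbb{R}^{s \times s \times u_l}$, where $u_h$ is the number of high-frequency channels, $u_l$ is the number of low-frequency channels, and $u_h + u_l = u$; $X_h$ and $X_l$ are sent to the high-frequency mixer and the low-frequency mixer, respectively.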
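The channel split described above can be sketched in NumPy as follows. This is a simplified illustration, not the IFormer architecture itself: the parameter `u_h` is a hypothetical name for the number of high-frequency channels, the high-frequency mixer is reduced to a 3x3 max pooling, the low-frequency mixer to a single-head self-attention without learned projections, and concatenating the two outputs along the channel axis is an assumption about how the branches are rejoined.

```python
import numpy as np

def split_channels(x, u_h):
    """Split an (s, s, u) feature map into high- and low-frequency channel groups."""
    return x[..., :u_h], x[..., u_h:]

def max_pool3x3(x):
    """3x3 max pooling, stride 1, edge padding: stand-in for the high-frequency mixer."""
    h, w = x.shape[:2]
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)), mode="edge")
    windows = [padded[i:i + h, j:j + w] for i in range(3) for j in range(3)]
    return np.max(windows, axis=0)

def self_attention(x):
    """Single-head attention over all positions: stand-in for the low-frequency mixer."""
    h, w, c = x.shape
    tokens = x.reshape(h * w, c)
    scores = tokens @ tokens.T / np.sqrt(c)
    scores -= scores.max(axis=1, keepdims=True)   # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return (weights @ tokens).reshape(h, w, c)

def token_mixer(x, u_h):
    """Route each channel group through its mixer, then rejoin along channels."""
    x_h, x_l = split_channels(x, u_h)
    return np.concatenate([max_pool3x3(x_h), self_attention(x_l)], axis=-1)

# Toy usage: an 8x8 feature map with 6 channels, 2 of them high-frequency.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8, 6))
y = token_mixer(x, u_h=2)
```

The output keeps the input shape, so the mixer can sit inside a residual connection; the local pooling branch only looks at a 3x3 neighbourhood while the attention branch mixes information across the whole map.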
Liu Zechao, Yang Shaopu, Liu Yongqiang, et al. Adaptive correlated Kurtogram and its applications in wheelset-bearing system fault diagnosis[J]. Mechanical Systems and Signal Processing, 2021, 154: 107511. doi:10.1016/j.ymssp.2020.107511
[2]
Deng Feiyue, Wang Hongli, Gao Ruiyang, et al. Vibration characteristics analysis of the inner race fault of axlebox bearing under wheel-rail excitation[J]. Journal of Hebei University (Natural Science Edition), 2023, 43(6): 561‒570.
Qian Quan, Qin Yi, Wang Yi, et al. A new deep transfer learning network based on convolutional auto-encoder for mechanical fault diagnosis[J]. Measurement, 2021, 178: 109352. doi:10.1016/j.measurement.2021.109352
[5]
Li Weihua, Huang Ruyi, Li Jipu, et al. A perspective survey on deep transfer learning for fault diagnosis in industrial scenarios: Theories, applications and challenges[J]. Mechanical Systems and Signal Processing, 2022, 167: 108487. doi:10.1016/j.ymssp.2021.108487
[6]
Hakim M, Omran A A B, Ahmed A N, et al. A systematic review of rolling bearing fault diagnoses based on deep learning and transfer learning: Taxonomy, overview, application, open challenges, weaknesses and recommendations[J]. Ain Shams Engineering Journal, 2023, 14(4): 101945. doi:10.1016/j.asej.2022.101945
[7]
Lv Mingzhu, Liu Shixun, Su Xiaoming, et al. Deep transfer network with multi-kernel dynamic distribution adaptation for cross-machine fault diagnosis[J]. IEEE Access, 2021, 9: 16392‒16409. doi:10.1109/access.2021.3053075
[8]
Wang Xiaoxia, He Haibo, Li Lusi. A hierarchical deep domain adaptation approach for fault diagnosis of power plant thermal system[J]. IEEE Transactions on Industrial Informatics, 2019, 15(9): 5139‒5148. doi:10.1109/tii.2019.2899118
[9]
Chen Zhuyun, Zhong Qi, Huang Ruyi, et al. Intelligent fault diagnosis for machinery based on enhanced transfer convolutional neural network[J]. Journal of Mechanical Engineering, 2021, 57(21): 96‒105.
Wang Qin, Taal C, Fink O. Integrating expert knowledge with domain adaptation for unsupervised fault diagnosis[J]. IEEE Transactions on Instrumentation and Measurement, 2021, 71: 3500312. doi:10.1109/tim.2021.3127654
[12]
Xiang Gang, Tian Kun. Spacecraft intelligent fault diagnosis under variable working conditions via Wasserstein distance-based deep adversarial transfer learning[J]. International Journal of Aerospace Engineering, 2021, 2021: 6099818. doi:10.1155/2021/6099818
[13]
Tzeng E, Hoffman J, Zhang Ning, et al. Deep domain confusion: Maximizing for domain invariance[EB/OL]. (2014‒12‒10)[2024‒02‒10].
[14]
Borgwardt K M, Gretton A, Rasch M J, et al. Integrating structured biological data by Kernel Maximum Mean Discrepancy[J]. Bioinformatics, 2006, 22(14): e49‒e57. doi:10.1093/bioinformatics/btl242
[15]
Long Mingsheng, Zhu Han, Wang Jianmin, et al. Deep transfer learning with joint adaptation networks[C]//Proceedings of the 34th International Conference on Machine Learning. Sydney: Journal of Machine Learning Research, 2017: 2208‒2217.
[16]
Sicilia A, Zhao Xingchen, Hwang S J. Domain adversarial neural networks for domain generalization: When it works and how to improve[J]. Machine Learning, 2023, 112(7): 2685‒2721. doi:10.1007/s10994-023-06324-x
[17]
Mao Wentao, Liu Yamin, Ding Ling, et al. A new structured domain adversarial neural network for transfer fault diagnosis of rolling bearings under different working conditions[J]. IEEE Transactions on Instrumentation and Measurement, 2020, 70: 3509013. doi:10.1109/tim.2020.3038596
[18]
Zhao Jing, Yang Shaopu, Li Qiang, et al. Reply to Comment on 'A novel transfer learning bearing fault diagnosis method based on multiple-source domain adaptation'[J]. Measurement Science and Technology, 2022, 33(9): 098001. doi:10.1088/1361-6501/ac6d48
[19]
Rezaeianjouybari B, Shang Yi. A novel deep multi-source domain adaptation framework for bearing fault diagnosis based on feature-level and task-specific distribution alignment[J]. Measurement, 2021, 178: 109359. doi:10.1016/j.measurement.2021.109359
[20]
Chenghui Lyu, Cheng Jinjun, Hu Yangguang, et al. Online fault diagnosing of Rudders based on multi-source domain deep transfer learning[J]. Journal of Ordnance Equipment Engineering, 2022, 43(9): 60‒67.
Liu Xiaofeng, Yoo C, Xing Fangxu, et al. Deep unsupervised domain adaptation: A review of recent advances and perspectives[J]. APSIPA Transactions on Signal and Information Processing, 2022, 11(1): e25. doi:10.1561/116.00000192
[23]
Xu Youzhong, Han Tianyu, Shi Xi, et al. Unsupervised domain adaptation fault diagnosis method using weight-based mask network[C]//Proceedings of the 2023 Global Reliability and Prognostics and Health Management Conference (PHM‒Hangzhou). Hangzhou: IEEE, 2023: 1‒7. doi:10.1109/phm-hangzhou58797.2023.10482568
[24]
Zhu Yongchun, Zhuang Fuzhen, Wang Jindong, et al. Deep subdomain adaptation network for image classification[J]. IEEE Transactions on Neural Networks and Learning Systems, 2021, 32(4): 1713‒1722. doi:10.1109/tnnls.2020.2988928
[25]
Li Yibin, Song Yan, Jia Lei, et al. Intelligent fault diagnosis by fusing domain adversarial training and maximum mean discrepancy via ensemble learning[J]. IEEE Transactions on Industrial Informatics, 2020, 17(4): 2833‒2841. doi:10.1109/tii.2020.3008010
[26]
Gretton A, Sejdinovic D, Strathmann H, et al. Optimal kernel choice for large-scale two-sample tests[C]//Proceedings of the 26th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates, 2012: 1205‒1213.
[27]
Che Changchang, Wang Huawei, Ni Xiaomei, et al. Domain adaptive deep belief network for rolling bearing fault diagnosis[J]. Computers & Industrial Engineering, 2020, 143: 106427. doi:10.1016/j.cie.2020.106427
[28]
Si Chenyang, Yu Weihao, Zhou Pan, et al. Inception transformer[J]. Advances in Neural Information Processing Systems, 2022, 35: 23495‒23509.
[29]
Liang Pengfei, Wang Wenhui, Yuan Xiaoming, et al. Intelligent fault diagnosis of rolling bearing based on wavelet transform and improved ResNet under noisy labels and environment[J]. Engineering Applications of Artificial Intelligence, 2022, 115: 105269. doi:10.1016/j.engappai.2022.105269
[30]
Long Mingsheng, Cao Yue, Wang Jianmin, et al. Learning transferable features with deep adaptation networks[C]//Proceedings of the 32nd International Conference on Machine Learning. Lille: Journal of Machine Learning Research, 2015: 97‒105.
[31]
Ganin Y, Ustinova E, Ajakan H, et al. Domain-adversarial training of neural networks[M]//Domain Adaptation in Computer Vision Applications. Cham: Springer International Publishing, 2017: 189‒209. doi:10.1007/978-3-319-58347-1_10
[32]
Zhu Yongchun, Zhuang Fuzhen, Wang Deqing. Aligning domain-specific distribution and classifier for cross-domain classification from multiple sources[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2019, 33(1): 5989‒5996. doi:10.1609/aaai.v33i01.33015989
[33]
Van der Maaten L, Hinton G. Visualizing data using t-SNE[J]. Journal of Machine Learning Research, 2008, 9(86): 2579‒2605.