Hierarchical feature selection algorithm based on instance correlations

SHI Chunyu, MAO Yu, LIU Haoyang, LIN Yaojin

Journal of Shandong University (Natural Science), 2024, Vol. 59, Issue 3: 61-70. DOI: 10.6040/j.issn.1671-9352.7.2023.1073


Abstract

A hierarchical feature selection algorithm based on instance correlations (HFSIC) is proposed to further improve the performance of feature selection for hierarchical classification. After a sparse regularization term removes irrelevant features, the parent-child relationships in the class hierarchy are combined with the reconstruction relationships among samples in the feature space to learn the correlations among samples of the classes under the same subtree, and recursive regularization is used to optimize the output feature weight matrix. To measure sample correlation, the reconstruction coefficient matrix is integrated into the training model, while the ℓ2,1-norm is used to remove irrelevant and redundant features. The resulting optimization problem is solved with the accelerated proximal gradient method, and the proposed algorithm is evaluated under multiple evaluation metrics. The experimental results show that the proposed method outperforms the compared algorithms on five datasets, which verifies its effectiveness.
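The abstract names two standard building blocks: ℓ2,1-norm regularization and the accelerated proximal gradient method. As a rough illustration only, the sketch below combines them on a plain least-squares loss. It is not the HFSIC model, whose hierarchy-based recursive regularization and reconstruction coefficient matrix are not specified on this page. The function names, the loss, and the parameter values are all assumptions made for the example.

```python
import numpy as np


def l21_norm(W):
    """l2,1-norm: sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))


def prox_l21(W, t):
    """Proximal operator of t * ||W||_{2,1}: shrink every row toward zero."""
    row_norms = np.linalg.norm(W, axis=1, keepdims=True)
    # Rows whose norm is at most t are set exactly to zero.
    scale = np.maximum(0.0, 1.0 - t / np.maximum(row_norms, 1e-12))
    return scale * W


def apg_l21(X, Y, lam=1.0, n_iter=200):
    """Accelerated proximal gradient for the assumed toy objective
    0.5 * ||X W - Y||_F^2 + lam * ||W||_{2,1}."""
    d, c = X.shape[1], Y.shape[1]
    # Lipschitz constant of the gradient of the smooth (least-squares) part.
    lipschitz = np.linalg.norm(X, 2) ** 2
    W = np.zeros((d, c))
    V = W.copy()  # extrapolated point
    t = 1.0       # momentum parameter
    for _ in range(n_iter):
        grad = X.T @ (X @ V - Y)
        W_next = prox_l21(V - grad / lipschitz, lam / lipschitz)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        V = W_next + ((t - 1.0) / t_next) * (W_next - W)
        W, t = W_next, t_next
    return W


def rank_features(W):
    """Order feature indices by decreasing row norm of the weight matrix."""
    return np.argsort(-np.linalg.norm(W, axis=1))
```

Because the ℓ2,1 penalty acts on whole rows of the weight matrix, a feature is either kept for all outputs or driven to zero for all of them, which is why the row norms can serve as feature scores in this kind of model.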

Key words

feature selection / hierarchical structure / instance correlation / recursive regularization

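Cite this article

SHI Chunyu, MAO Yu, LIU Haoyang, LIN Yaojin. Hierarchical feature selection algorithm based on instance correlations[J]. Journal of Shandong University (Natural Science), 2024, 59(03): 61-70. DOI: 10.6040/j.issn.1671-9352.7.2023.1073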



Funding

National Natural Science Foundation of China (62076116)

Natural Science Foundation of Fujian Province (2022J01914)
