The ML-KNN method based on attribute weighting

Xin WEN; Deyu LI

doi:10.6040/j.issn.1671-9352.2.2023.027

Journal of Shandong University (Natural Science), 2024, Vol. 59, Issue (03): 107-117.


Abstract

An ML-KNN method based on attribute weighting is proposed. First, for each label, the variable precision neighborhood rough set model is used to identify the samples that lie in the non-positive regions of the decision classes, and heterogeneous sample pairs are constructed from them. Next, the significance of each attribute for classification is evaluated by its ability to discern the heterogeneous sample pairs. Finally, weighted distances between samples are computed to obtain each sample's nearest-neighbor distribution, and multi-label classification is carried out by maximizing the posterior probability. Experimental results on ten public multi-label datasets verify the effectiveness of the proposed method.
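The final step of the abstract (weighted distances, neighbor distribution, maximum a posteriori decision) can be illustrated with a minimal sketch. It is not the authors' implementation: the attribute weights `w` are assumed to be given (the paper derives them from heterogeneous sample pairs, and that procedure is not reproduced here), the distance is assumed to be a weighted Euclidean distance, and the posterior estimate follows the standard ML-KNN scheme with a Laplace smoothing parameter `s`. The function names `mlknn_fit` and `mlknn_predict` are hypothetical.

```python
import numpy as np

def weighted_dist(A, x, w):
    """Weighted Euclidean distance from each row of A to x (w: assumed attribute weights)."""
    return np.sqrt((w * (A - x) ** 2).sum(axis=1))

def mlknn_fit(X, Y, w, k=2, s=1.0):
    """Estimate ML-KNN priors and likelihoods using the weighted distance."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=int)
    n, q = Y.shape
    counts = np.zeros((n, q), dtype=int)
    for i in range(n):
        d = weighted_dist(X, X[i], w)
        d[i] = np.inf                         # a sample is not its own neighbor
        nn = np.argsort(d)[:k]                # k nearest neighbors of sample i
        counts[i] = Y[nn].sum(axis=0)         # neighbors carrying each label
    prior = (s + Y.sum(axis=0)) / (2 * s + n)  # smoothed P(label present)
    c1 = np.zeros((q, k + 1))                 # count histogram, label present
    c0 = np.zeros((q, k + 1))                 # count histogram, label absent
    for i in range(n):
        for j in range(q):
            if Y[i, j] == 1:
                c1[j, counts[i, j]] += 1
            else:
                c0[j, counts[i, j]] += 1
    like1 = (s + c1) / (s * (k + 1) + c1.sum(axis=1, keepdims=True))
    like0 = (s + c0) / (s * (k + 1) + c0.sum(axis=1, keepdims=True))
    return prior, like1, like0

def mlknn_predict(x, X, Y, w, model, k=2):
    """Assign each label by comparing the two unnormalized posteriors."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=int)
    prior, like1, like0 = model
    nn = np.argsort(weighted_dist(X, np.asarray(x, dtype=float), w))[:k]
    c = Y[nn].sum(axis=0)                     # neighbor label counts for x
    idx = np.arange(Y.shape[1])
    p1 = prior * like1[idx, c]
    p0 = (1 - prior) * like0[idx, c]
    return (p1 > p0).astype(int)
```

In this sketch, setting an attribute's weight to zero removes it from the neighbor search entirely, so a noisy attribute with a large numeric range no longer dominates the distance. The same `k` must be passed to both functions.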

Key words

multi-label classification / attribute significance / neighborhood rough set / uncertainty of classification / heterogeneous sample pair

Cite this article

WEN Xin, LI Deyu. The ML-KNN method based on attribute weighting[J]. Journal of Shandong University (Natural Science), 2024, 59(03): 107-117. DOI:10.6040/j.issn.1671-9352.2.2023.027

