Traditional graph neural network (GNN) based social bot detection methods suffer from class imbalance in real-world scenarios. To address this, a multi-modal social bot detection framework designed for imbalanced data is proposed. First, user semantic features and behavioral metadata are fused into a comprehensive user representation. Second, a heterogeneous graph neural network (HGNN) with a local-global dual aggregation mechanism is designed to capture structural and semantic information, characterizing user behavior patterns in depth. Finally, an adaptive cost-sensitive learning module dynamically adjusts the loss weights of samples from each class during training, which mitigates the model bias caused by data imbalance and improves detection performance on minority classes. Experimental results on multiple real-world datasets show that the proposed method outperforms existing approaches, providing an effective solution to the class imbalance problem in bot detection.