To address the problems that existing sentiment classification methods generally fail to fully consider users' personalized characteristics and ignore the influence of the time factor on classification results, a new text sentiment classification method based on knowledge distillation and comment time is proposed. First, to alleviate the scarcity of high-quality labeled data, the RoFormer-Sim generative model is used to augment the training texts. Then, a sentiment score prediction model for comment texts based on multi-feature fusion is proposed, which introduces the comment time attribute to extract users' personalized information from their historical comments. Finally, to improve generalization for cold-start users, knowledge distillation is introduced, and the SKEP model is used to enhance the versatility of the multi-feature-fusion sentiment classification model. Experimental results on a real dataset crawled from Chinese stock forums show that, compared with typical methods such as SKEP and ELECTRA, the proposed method improves accuracy by 3.1% and 0.9% and F1 score by 2.7% and 1.0%, respectively, verifying its effectiveness in improving sentiment classification performance.
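The distillation step described above can be illustrated with the classic response-based formulation of Hinton et al.[9], in which the student model fits the teacher's temperature-softened output distribution alongside the hard labels. The sketch below is illustrative only: the temperature `t`, the weight `alpha`, and the three-class logits are hypothetical choices, not the settings used in the paper.

```python
import math

def softmax(logits, t=1.0):
    # Temperature-scaled softmax: a higher t yields a softer distribution.
    exps = [math.exp(z / t) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, hard_label, t=2.0, alpha=0.5):
    """Weighted sum of a soft-target term and the usual hard-label cross-entropy."""
    p_teacher = softmax(teacher_logits, t)
    p_student = softmax(student_logits, t)
    # KL(teacher || student) on temperature-softened distributions, scaled by
    # t^2 to keep gradient magnitudes comparable across temperatures.
    soft = (t * t) * sum(pt * math.log(pt / ps)
                         for pt, ps in zip(p_teacher, p_student))
    # Cross-entropy against the ground-truth label at temperature 1.
    hard = -math.log(softmax(student_logits)[hard_label])
    return alpha * soft + (1.0 - alpha) * hard
```

When the student reproduces the teacher's logits exactly, the soft term vanishes and only the hard-label loss remains, which is why the teacher's "dark knowledge" mainly shapes training early on.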
With the rapid development of Internet technology, stock forum platforms have become an important channel through which a growing number of investors exchange and obtain information. Sentiment analysis of stock forum comments can help investors understand market trends and provide them with corresponding investment advice; it can also, to a certain extent, predict market fluctuations and help enterprises prepare countermeasures in advance to mitigate risk. In recent years, deep learning has been increasingly applied to sentiment classification because of its excellent classification performance. Umer et al.[1] trained an extremely randomized trees (Extra Trees) model and a convolutional neural network (CNN) separately and then combined them through a soft-voting ensemble to improve accuracy and robustness. Lan et al.[2] proposed the ALBERT (A Lite BERT) pre-trained model, which reduces the overall number of parameters of BERT and speeds up training. However, its downstream fine-tuning still requires large amounts of text data, so it cannot achieve good sentiment classification performance in the data-scarce Chinese financial domain. Liu et al.[3] proposed the FinBERT (BERT for financial text mining) pre-trained model, which is built on BERT and trained on a large-scale English financial corpus, addressing the lack of pre-trained models in the English financial domain; owing to the limitation of its training corpus, however, it cannot be applied to Chinese financial comments. Zhao et al.[4] proposed a hybrid model based on ELMo (Embeddings from Language Models) and Transformer, and introduced long short-term memory (LSTM) networks and a multi-head attention mechanism to handle bidirectional semantics and polysemy in comment texts. However, this method only extracts sentiment features from the comment text itself and fails to accurately capture the influence of users' personal characteristics and social relations on sentiment analysis. To this end, Yang et al.[5] built a graph from user-user and user-post relations and used the large-scale information network embedding (LINE) algorithm to obtain node vectors encoding social relations. However, the above methods fail to mine users' rich historical comment information in depth. For this reason, Jiang et al.[6] used a hierarchical multi-head attention mechanism to mine user and product information from multiple perspectives, capturing the influence of user and product information on sentiment classification more comprehensively and achieving good experimental results.
Stock forum comments are highly irregular, full of Internet slang and colloquial expressions, while at the same time somewhat specialized, containing many investment terms. As a result, the texts are noisy and high-quality labeled data are scarce. This paper therefore uses the RoFormer-Sim model[11] to augment the comment texts. RoFormer-Sim is a generative language model with both similar-sentence generation and similar-sentence retrieval capabilities. Building on SimBERT[12], it adds BART-style (Bidirectional and Auto-Regressive Transformers)[13] training, model distillation, and training on general-style corpora, further integrating and optimizing the techniques behind SimBERT.
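As a rough illustration of this augmentation step, the loop below expands a labeled corpus with generated paraphrases, assuming a paraphraser such as RoFormer-Sim sits behind a hypothetical `generate_similar` interface (stubbed out here so the control flow is runnable); the duplicate and minimum-length filters are illustrative quality gates, not the paper's actual criteria.

```python
def generate_similar(text, n):
    # Placeholder: a real RoFormer-Sim deployment would return n paraphrases
    # of `text`; this stub only appends punctuation so the pipeline runs.
    return [text + suffix for suffix in ("!", "。", "~")][:n]

def augment(dataset, n_per_sample=2, min_len=4):
    """Expand (text, label) pairs with generated paraphrases that keep the label."""
    augmented = list(dataset)
    for text, label in dataset:
        for candidate in generate_similar(text, n_per_sample):
            # Simple quality gate: drop exact duplicates and overly short outputs.
            if candidate != text and len(candidate) >= min_len:
                augmented.append((candidate, label))
    return augmented

corpus = [("这只股票要涨停了", 1), ("今天又被套牢了", 0)]
bigger = augment(corpus)
```

Because the paraphraser is label-preserving by assumption, each generated sentence inherits the sentiment label of its source comment, which is what makes this kind of augmentation attractive when annotation is expensive.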
[1] Umer M, Sadiq S, Nappi M, et al. ETCNN: extra tree and convolutional neural network-based ensemble model for COVID-19 tweets sentiment classification[J]. Pattern Recognition Letters, 2022, 164: 224-231.
[2] Lan Z, Chen M, Goodman S, et al. ALBERT: a lite BERT for self-supervised learning of language representations[C]∥International Conference on Learning Representations, Addis Ababa, Ethiopia, 2020: 1-17.
[3] Liu Z, Huang D, Huang K, et al. FinBERT: a pre-trained financial language representation model for financial text mining[C]∥Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Yokohama, Japan, 2021: 4513-4519.
[4] Zhao Ya-ou, Zhang Jia-chong, Li Yi-bin, et al. Sentiment analysis based on hybrid model of ELMo and Transformer[J]. Journal of Chinese Information Processing, 2021, 35(3): 115-124.
[5] Yang J, Zou X, Zhang W, et al. Microblog sentiment analysis via embedding social contexts into an attentive LSTM[J]. Engineering Applications of Artificial Intelligence, 2021, 97: 104048.
[6] Jiang Zong-li, Zhang Jing. Multi-head attention model with user and product information for sentiment classification[J]. Computer Systems & Applications, 2020, 29(7): 131-138.
[9] Hinton G, Vinyals O, Dean J. Distilling the knowledge in a neural network[J]. Computer Science, 2015, 14(7): 38-39.
[10] Shao Ren-rong, Liu Yu-ang, Zhang Wei, et al. A survey of knowledge distillation in deep learning[J]. Chinese Journal of Computers, 2022, 45(8): 1638-1673.
[12] Tian H, Gao C, Xiao X, et al. SKEP: sentiment knowledge enhanced pre-training for sentiment analysis[J/OL]. [2023-07-25]. arXiv preprint arXiv:2005.05635v2.
[13] Turney P D. Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews[C]∥Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, USA, 2002: 417-424.
[14] Xie W. Entity linking based on RoFormer-Sim for Chinese short texts[J]. Frontiers in Computing and Intelligent Systems, 2023, 4(1): 46-50.
[15] Zhao Y, Liu S, Zhang Q, et al. Test case classification via few-shot learning[J]. Information and Software Technology, 2023, 160: 107228.
[16] Lewis M, Liu Y, Goyal N, et al. BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension[C]∥Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020: 7871-7880.
[17] Ebbinghaus H. Memory: a contribution to experimental psychology[J]. Annals of Neurosciences, 2013, 20(4): 155-156.
[18] Grover A, Leskovec J. node2vec: scalable feature learning for networks[C]∥Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, USA, 2016: 855-864.
Qiao Bai-you, Wu Tong, Yang Lu, et al. A text sentiment analysis method based on BiGRU and capsule network[J]. Journal of Jilin University (Engineering and Technology Edition), 2024, 54(7): 2026-2037.
Wang You-wei, Tong Shuang, Feng Li-zhou, et al. New inductive microblog rumor detection method based on graph convolutional network[J]. Journal of Zhejiang University (Engineering Science), 2022, 56(5): 956-996.
Chen Jie, Wang Si-yu, Zhao Shu, et al. Multi-granular user preferences for document-level sentiment analysis[J]. Journal of Chinese Information Processing, 2023, 37(7): 122-130.
[25] Li Y, Ni P, Li G, et al. Inter-personal relation extraction model based on bidirectional GRU and attention mechanism[C]∥IEEE 5th International Conference on Computer and Communications (ICCC), Harbin, China, 2019: 1867-1871.
[26] Guo B, Zhang C, Liu J, et al. Improving text classification with weighted word embeddings via a multi-channel TextCNN model[J]. Neurocomputing, 2019, 363: 366-374.
[27] Clark K, Luong M T, Le Q V, et al. ELECTRA: pre-training text encoders as discriminators rather than generators[J/OL]. [2023-07-26]. arXiv preprint arXiv:2003.10555.
[28] Kamal A, Abulaish M. CAT-BiGRU: convolution and attention with bi-directional gated recurrent unit for self-deprecating sarcasm detection[J]. Cognitive Computation, 2022, 14: 91-109.
[29] Ahmad W, Wang B, Martin P, et al. Enhanced sentiment analysis regarding COVID-19 news from global channels[J]. Journal of Computational Social Science, 2023, 6: 19-57.
[30] Gao Z, Li Z, Luo J, et al. Short text aspect-based sentiment analysis based on CNN + BiGRU[J]. Applied Sciences, 2022, 12(5): 2707.
[31] Aslam N, Rustam F, Lee E, et al. Sentiment analysis and emotion detection on cryptocurrency related tweets using ensemble LSTM-GRU model[J]. IEEE Access, 2022, 10: 39313-39324.