Advances in Internet technology have driven the rapid rise of e-commerce platforms. To analyze the sentiment orientation of customer reviews on these platforms accurately, a text sentiment analysis method based on the BERT-MABL model is proposed. Traditional sentiment analysis models cannot incorporate contextual semantic information, have difficulty resolving word ambiguity, and do not assign sentiment weights to words. The proposed method first uses BERT to extract word embeddings that capture deep semantic information. To let the model learn contextual semantics, a Bi-LSTM is employed as the backbone. The MABL model is then constructed by adding a multi-head attention mechanism, which handles the sentiment weight allocation that Bi-LSTM alone cannot and lets the model attend to the text from multiple perspectives. Experimental results show that the customer review sentiment analysis method based on BERT-MABL achieves significant improvements over traditional models. With 8 attention heads and a dropout rate of 0.3, the accuracy of the BERT-MABL model on the three datasets reaches 89.28%, 90.20%, and 90.85%, respectively. The corresponding F1 scores are 87.16%, 89.04%, and 88.24%.
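Sentiment analysis aims to identify and analyze the sentiment orientation of text. It is highly valuable in many fields, including business, government, and education, and a large body of research has demonstrated its practical utility. Reference [4] applied data mining algorithms to Twitter discussions about airlines and found that emoticons play an important role in conveying sentiment. Reference [5] combined long short-term memory (LSTM) with Word2Vec to improve sentiment analysis performance. Reference [6] proposed a bidirectional long short-term memory (Bi-LSTM) model for opinion mining of reviews, using word segmentation and data cleaning for preprocessing and a pretrained Word2Vec model to generate word vectors. Reference [7] proposed a text sentiment analysis model that fuses the bidirectional encoder representations from transformers (BERT) pretrained model with a convolutional neural network, and it outperformed traditional models. Reference [8] evaluated various machine learning models for sentiment classification, among which the BERT model reached an accuracy of 85.4%. Reference [9] combined convolutional neural networks (CNN) with Bi-LSTM for sentiment analysis: the CNN layer takes feature embeddings as input and outputs low-level features, while the Bi-LSTM performs the classification. Reference [10] improved generalization by fusing a multi-head self-attention mechanism into a CNN-BiLSTM model and achieved an accuracy of 91.481% on an e-commerce review dataset. Reference [11] used a bidirectional gated recurrent unit to capture contextual information in the text and combined it with an attention mechanism to emphasize the importance of keywords, which allows text sentiment to be identified more accurately.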
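BERT is a pretrained language representation model proposed by Google in 2018 and built on the encoder of the Transformer model [12]. Through the self-attention mechanism, the Transformer lets the model consider all other elements of a sequence while processing each element, which effectively captures long-range dependencies. Unlike embeddings from language models (ELMo), which uses two LSTMs running in opposite directions, BERT adopts the Transformer encoder and thus obtains a truly bidirectional context representation that captures deep bidirectional contextual information.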
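The way self-attention lets every position take all other positions into account can be illustrated with a minimal sketch of scaled dot-product self-attention. It is not the implementation used in BERT: the learned query, key, and value projections and the multiple heads are omitted (queries, keys, and values are all set to the input vectors), and the function names and toy embeddings are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x.

    x is a list of token vectors (lists of floats) of equal length.
    Real Transformer layers first apply learned projections to obtain
    Q, K, and V; they are left out here to keep the sketch short.
    """
    d = len(x[0])
    outputs, weights = [], []
    for query in x:
        # Score this position against every position, itself included.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
            for key in x
        ]
        w = softmax(scores)
        # The output is a weighted average over all token vectors.
        out = [sum(w[j] * x[j][i] for j in range(len(x))) for i in range(d)]
        weights.append(w)
        outputs.append(out)
    return outputs, weights


# Hypothetical toy embeddings for a three-token sequence.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
```

Each row of `weights` sums to 1 and has a nonzero entry for every position, so the distance between two tokens does not limit how much one can influence the other.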
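The two core pretraining tasks of BERT are the masked language model and next sentence prediction.
1) Masked language model (MLM): In this task, BERT randomly masks a portion of the tokens in the input text, and the model must predict the masked tokens. This lets the model learn a contextual representation of each word, because it has to use the context on both the left and the right to predict the missing word.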
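The masking step described above can be sketched as follows. This is a simplified illustration rather than the BERT preprocessing code: the function name `mask_tokens`, the whitespace tokenization, and the example sentence are assumptions. The default masking rate of 0.15 follows the original BERT paper; BERT additionally replaces some of the selected tokens with random tokens or leaves them unchanged, which is omitted here.

```python
import random

MASK = "[MASK]"


def mask_tokens(tokens, mask_prob=0.15, seed=None):
    """Randomly replace tokens with [MASK] and record the prediction targets.

    Returns (masked_tokens, labels). labels[i] holds the original token at
    masked positions and None elsewhere, so a model would be trained to
    predict only the masked positions.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


# Hypothetical review sentence, split on whitespace for simplicity.
sentence = "the battery life of this camera is great".split()
masked, labels = mask_tokens(sentence, mask_prob=0.3, seed=0)
print(masked)
print(labels)
```

Because the unmasked tokens on both sides of each `[MASK]` remain visible, the prediction target depends on the left and right context at the same time.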
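To test the performance of the BERT-MABL model, three benchmark datasets from different domains are used for evaluation: Products, Hotels, and Movies. All three are binary classification datasets with positive and negative sentiment labels. The Products review dataset contains reviews of different products: Canon, Nikon, a voice recorder, and the Apex AD 2600 [13]. The Hotels review dataset comes from tripadvisor.com [14]. The Movies review dataset was collected from the Large Movie Review Dataset used at Stanford University [15]. The datasets are summarized in Table 1.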
[1]
KAUSAR M A, FAGEERI S O, SOOSAIMANICKAM A. Sentiment classification based on machine learning approaches in amazon product reviews[J]. Engineering, Technology & Applied Science Research, 2023, 13(3): 10849-10855.
[2]
RANA M R R, REHMAN S U, NAWAZ A, et al. A conceptual model for decision support systems using aspect based sentiment analysis[J]. Proceedings of the Romanian Academy Series A - Mathematics Physics Technical Sciences Information Science, 2021, 22(4): 371-380.
[3]
ELANGOVAN D, SUBEDHA V. Adaptive particle grey wolf optimizer with deep learning-based sentiment analysis on online product reviews[J]. Engineering, Technology & Applied Science Research, 2023, 13(3): 10989-10993.
[4]
ULLAH M A, MARIUM S M, BEGUM S A, et al. An algorithm and method for sentiment analysis using the text and emoticon[J]. ICT Express, 2020, 6(4): 357-360.
[5]
GONDHI N K, CHAAHAT, SHARMA E, et al. Efficient long short-term memory-based sentiment analysis of e-commerce reviews[J]. Computational Intelligence and Neuroscience, 2022, 2022(1): 3464524.
[6]
VIMALI J S, MURUGAN S. A text based sentiment analysis model using bi-directional LSTM networks[C]//2021 6th International conference on communication and electronics systems (ICCES). IEEE, 2021: 1652-1658.
[8]
DHOLA K, SARADVA M. A comparative evaluation of traditional machine learning and deep learning classification techniques for sentiment analysis[C]//2021 11th international conference on cloud computing, data science & engineering (Confluence). IEEE, 2021: 932-936.
[9]
VATAMBETI R, MANTENA S V, KIRAN K V D, et al. Twitter sentiment analysis on online food services based on elephant herd optimization with hybrid deep learning technique[J]. Cluster Computing, 2024, 27(1): 655-671.
[12]
DEVLIN J, CHANG M W, LEE K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[C]//Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 1 (long and short papers). 2019: 4171-4186.
[13]
HU M, LIU B. Mining and summarizing customer reviews[C]//Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining. 2004: 168-177.
[14]
ALAM M H, RYU W J, LEE S K. Joint multi-grain topic sentiment: modeling semantic aspects for online reviews[J]. Information Sciences, 2016, 339: 206-223.
[15]
MAAS A, DALY R E, PHAM P T, et al. Learning word vectors for sentiment analysis[C]//Proceedings of the 49th annual meeting of the association for computational linguistics: Human language technologies. 2011: 142-150.