Emoji embedded representation based on emotion distribution

ZENG Xueqiang, SUN Yu, LIU Ye, WAN Zhongying, ZUO Jiali, WANG Mingwen

Journal of Shandong University (Natural Science), 2024, Vol. 59, Issue (03): 81-94. DOI: 10.6040/j.issn.1671-9352.1.2022.3548

Abstract

This paper proposes EDEER, an emoji embedded representation method based on emotion distribution. EDEER uses the soft labels produced by a BERT-based emotion prediction model to learn emoji embedded representations from real data, and uses an emotion distribution to directly model the degree to which an emoji expresses each emotion, so that the embedded representation carries the multiple kinds of emotional information associated with that emoji. Multiple sets of comparative experiments on a Chinese Weibo dataset containing emoji show that the proposed method effectively learns emoji embedded representations that are directly associated with fine-grained emotions, and builds an emoji representation space with high emotional expression quality.
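To make the notion of an emotion distribution over an emoji concrete, the following is a minimal sketch and not the EDEER training procedure, which the abstract does not spell out. It assumes each post already has a soft label (a probability vector over emotion categories, such as the softmax output of an emotion classifier) and simply averages the soft labels of the posts in which an emoji appears. The emotion names, the bracketed emoji tokens, and the function `emoji_emotion_distributions` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical emotion categories; the label set used in the paper is not given here.
EMOTIONS = ["happiness", "sadness", "anger", "fear"]


def emoji_emotion_distributions(posts):
    """Average the soft labels of all posts that contain each emoji.

    posts: iterable of (emojis_in_post, soft_label) pairs, where soft_label
    is a probability vector over EMOTIONS (assumed to sum to 1).
    Returns a dict mapping each emoji to its averaged emotion distribution.
    """
    sums = defaultdict(lambda: [0.0] * len(EMOTIONS))
    counts = defaultdict(int)
    for emojis, soft_label in posts:
        for emoji in set(emojis):  # count each emoji once per post
            counts[emoji] += 1
            for i, p in enumerate(soft_label):
                sums[emoji][i] += p
    # Dividing by the post count keeps each result a valid distribution.
    return {e: [s / counts[e] for s in sums[e]] for e in sums}


# Toy data (invented for illustration): three posts with bracketed emoji tokens.
posts = [
    (["[smile]"], [0.8, 0.1, 0.05, 0.05]),
    (["[smile]", "[cry]"], [0.4, 0.4, 0.1, 0.1]),
    (["[cry]"], [0.0, 1.0, 0.0, 0.0]),
]
distributions = emoji_emotion_distributions(posts)
```

In this toy example `[smile]` ends up weighted mostly toward happiness while still carrying some sadness, which is the kind of multi-emotion information that a single hard label would discard.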

Key words

emoji / sentiment analysis / embedded representation / emotion distribution

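Cite this article

ZENG Xueqiang, SUN Yu, LIU Ye, WAN Zhongying, ZUO Jiali, WANG Mingwen. Emoji embedded representation based on emotion distribution[J]. Journal of Shandong University (Natural Science), 2024, 59(03): 81-94. DOI: 10.6040/j.issn.1671-9352.1.2022.3548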

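Funding

National Natural Science Foundation of China (62266021)

Science and Technology Research Project of the Jiangxi Provincial Department of Education (GJJ2200330)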
