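Like fashion analysis [12,13] and image memorability prediction, image sentiment analysis is subjective and ambiguous to some degree. To handle this, Geng [14] proposed label distribution learning (LDL), which builds a mapping from an instance to a continuous label distribution. Unlike traditional single-label and multi-label learning, LDL describes how important each label is to a sample instance. In recent years, LDL has received growing attention in visual affective computing [15-20]. Psychological studies such as Plutchik's wheel of emotions [21] show that different emotions are correlated with one another. Building on this observation, many researchers exploit low-rank structure to mine implicit correlations among labels [22,23], or use clustering methods, Gaussian graphical models, and similar tools to construct explicit correlation structures [24-26] that capture the semantic correlations within label distributions.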
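The difference between a single label and a label distribution can be illustrated with a small sketch. It is not taken from the paper: the eight emotion names, the vote counts, and the helper names `votes_to_distribution` and `kl_divergence` are assumptions chosen only for illustration. It normalizes annotator votes into a distribution and compares two distributions with the Kullback-Leibler divergence, one common way to measure the gap between a target and a predicted label distribution.

```python
import math

# Hypothetical 8-emotion label set, used only for illustration.
EMOTIONS = ["amusement", "awe", "contentment", "excitement",
            "anger", "disgust", "fear", "sadness"]


def votes_to_distribution(votes):
    """Normalize non-negative annotator vote counts into a label distribution."""
    total = sum(votes)
    if total <= 0:
        raise ValueError("at least one vote is required")
    return [v / total for v in votes]


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted); eps guards against division by zero."""
    return sum(t * math.log((t + eps) / (p + eps))
               for t, p in zip(target, predicted) if t > 0)


# Assumed vote counts for one image: ten annotators, three emotions chosen.
votes = [5, 3, 2, 0, 0, 0, 0, 0]
dist = votes_to_distribution(votes)

# Single-label learning would keep only the dominant emotion ...
single_label = EMOTIONS[dist.index(max(dist))]

# ... whereas the distribution keeps the degree of every label.
uniform = [1 / len(EMOTIONS)] * len(EMOTIONS)
gap_to_uniform = kl_divergence(dist, uniform)
```

Here `single_label` discards the 30% and 20% of votes given to the second and third emotions, while `dist` retains them as description degrees that sum to one.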
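To demonstrate the effectiveness of the proposed model on visual sentiment label distribution prediction, the ESDGCN model is compared with state-of-the-art algorithms from recent years, mainly label distribution prediction algorithms based on low-rank structure and algorithms based on convolutional neural networks. For the former group, this paper uses a backbone network pretrained on the corresponding sentiment dataset to extract visual features, then applies principal component analysis (PCA) to reduce them to 300 dimensions, which serve as the input to the label distribution learning (LDL) algorithms. The compared LDL algorithms are AA-BP [14], AA-KNN [14], CPNN [16], EDL-LRL [24], LDLLC [41], and LDL-SCL [23]. For the convolutional neural network models, three classic networks, AlexNet, VGGNet, and ResNet101, are each pretrained on ImageNet and fine-tuned on the sentiment datasets to obtain sentiment predictions. In addition, three recent CNN-based sentiment distribution prediction models are compared: ACPNN [17], JCDL [34], and SSDL [42]. Tables 6 to 8 report the detailed results of ESDGCN and each compared algorithm on three public datasets, where "*" indicates that the metrics in that row were obtained by the authors' own reimplementation of the algorithm, evaluated on the corresponding dataset.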
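The preprocessing for the LDL baselines (backbone features, PCA to 300 dimensions, then an LDL algorithm) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the random arrays stand in for real backbone features and emotion distributions, the 512-dimensional feature size and eight labels are assumed, and `fit_pca`, `apply_pca`, and `aa_knn_predict` are hypothetical helper names. The predictor follows the AA-kNN idea of averaging the label distributions of the k nearest training samples.

```python
import numpy as np


def fit_pca(features, n_components):
    """Return the feature mean and the top principal directions (as rows)."""
    mean = features.mean(axis=0)
    # Rows of vt are principal directions, sorted by decreasing singular value.
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]


def apply_pca(features, mean, components):
    """Project features onto previously fitted principal directions."""
    return (features - mean) @ components.T


def aa_knn_predict(train_x, train_d, query_x, k=5):
    """Average the label distributions of the k nearest training samples."""
    dists = np.linalg.norm(train_x[None, :, :] - query_x[:, None, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return train_d[nearest].mean(axis=1)


rng = np.random.default_rng(0)
# Stand-ins (assumptions): 400 images, 512-dim backbone features, 8 emotions.
backbone_features = rng.normal(size=(400, 512))
label_dists = rng.dirichlet(np.ones(8), size=400)

train_x, test_x = backbone_features[:350], backbone_features[350:]
train_d = label_dists[:350]

# Fit PCA on the training split only, then reduce both splits to 300 dims.
mean, components = fit_pca(train_x, n_components=300)
reduced_train = apply_pca(train_x, mean, components)
reduced_test = apply_pca(test_x, mean, components)

pred = aa_knn_predict(reduced_train, train_d, reduced_test, k=5)
```

Because each prediction is an average of valid distributions, every row of `pred` is non-negative and sums to one without further normalization.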
Zhou L, Fan X, Ma Y, et al. Uncertainty-aware cross-dataset facial expression recognition via regularized conditional alignment[C]//Proceedings of the ACM International Conference on Multimedia, New York, USA, 2020: 2964-2972.
[2]
Farzaneh A H, Qi X. Discriminant distribution-agnostic loss for facial expression recognition in the wild[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Piscataway, USA, 2020: 406-407.
Lu Yang, Wang Shi-gang, Zhao Wen-ting, et al. Facial expression recognition based on separability assessment of discrete Shearlet transform[J]. Journal of Jilin University (Engineering and Technology Edition), 2019, 49(5): 1715-1725.
Fang Ming, Chen Wen-qiang. Face micro-expression recognition based on ResNet with object mask[J]. Journal of Jilin University (Engineering and Technology Edition), 2021, 51(1): 303-313.
[7]
Huang F, Wei K, Weng J, et al. Attention-based modality-gated networks for image-text sentiment analysis[J]. ACM Transactions on Multimedia Computing, Communications, and Applications, 2020, 16(3): 1-19.
[8]
Ji R, Chen F, Cao L, et al. Cross-modality microblog sentiment prediction via bi-layer multimodal hypergraph learning[J]. IEEE Transactions on Multimedia, 2018, 21(4): 1062-1075.
[9]
Jian M, Dong J, Gong M, et al. Learning the traditional art of Chinese calligraphy via three-dimensional reconstruction and assessment[J]. IEEE Transactions on Multimedia, 2019, 22(4): 970-979.
[10]
Yang J, She D, Lai Y K, et al. Retrieving and classifying affective images via deep metric learning[C]//Proceedings of the AAAI Conference on Artificial Intelligence, Palo Alto, USA, 2018: 491-498.
[11]
Yao X, She D, Zhao S, et al. Attention-aware polarity sensitive embedding for affective image retrieval[C]//Proceedings of the IEEE International Conference on Computer Vision, Piscataway, USA, 2019: 1140-1150.
[12]
Li Z, Liu J, Zhu X, et al. Image annotation using multi-correlation probabilistic matrix factorization[C]//Proceedings of the ACM International Conference on Multimedia, New York, USA, 2010: 1187-1190.
[13]
Li Z, Tang J, He X. Robust structured nonnegative matrix factorization for image representation[J]. IEEE Transactions on Neural Networks and Learning Systems, 2017, 29(5): 1947-1960.
[14]
Yang X, Song X, Feng F, et al. Attribute-wise explainable fashion compatibility modeling[J]. ACM Transactions on Multimedia Computing, Communications, and Applications, 2021, 17(1): 1-21.
[15]
Yang X, Song X, Han X, et al. Generative attribute manipulation scheme for flexible fashion search[C]//Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, New York, USA, 2020: 941-950.
[16]
Geng X. Label distribution learning[J]. IEEE Transactions on Knowledge and Data Engineering, 2016, 28(7): 1734-1748.
[17]
Peng K C, Chen T, Sadovnik A, et al. A mixed bag of emotions: model, predict, and transfer emotion distributions[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Piscataway, USA, 2015: 860-868.
[18]
Geng X, Yin C, Zhou Z H. Facial age estimation by learning from label distributions[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(10): 2401-2412.
[19]
Yang J, Sun M, Sun X. Label distribution learning via augmented conditional probability neural network[C]//Proceedings of the AAAI Conference on Artificial Intelligence, Palo Alto, USA, 2017: 224-230.
[20]
Zhou Y, Xue H, Geng X. Emotion distribution recognition from facial expressions[C]//Proceedings of the ACM International Conference on Multimedia, New York, USA, 2015: 1247-1250.
[21]
Ren T, Jia X, Li W, et al. Label distribution learning with label-specific features[C]//Proceedings of the International Joint Conference on Artificial Intelligence, San Mateo, USA, 2019: 3318-3324.
[22]
Zhao S, Yao H, Gao Y, et al. Continuous probability distribution prediction of image emotions via multitask shared sparse regression[J]. IEEE Transactions on Multimedia, 2016, 19(3): 632-645.
[23]
Plutchik R. Emotions: a general psychoevolutionary theory[J]. Approaches to Emotion, 1984(1984): 197-219.
[24]
Xu M, Zhou Z H. Incomplete label distribution learning[C]//Proceedings of the International Joint Conference on Artificial Intelligence, San Mateo, USA, 2017: 3175-3181.
[25]
Jia X, Li Z, Zheng X, et al. Label distribution learning with label correlations on local samples[J]. IEEE Transactions on Knowledge and Data Engineering, 2019, 33(4): 1619-1631.
[26]
Jia X, Zheng X, Li W, et al. Facial emotion distribution learning by exploiting low-rank label correlations locally[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Piscataway, USA, 2019: 9841-9850.
[27]
Chen T, Yu F X, Chen J, et al. Object-based visual sentiment concept analysis and application[C]//Proceedings of the ACM International Conference on Multimedia, New York, USA, 2014: 367-376.
[28]
Su Y T, Zhao W, Jing P G, et al. Exploiting low-rank latent Gaussian graphical model estimation for visual sentiment distribution[J]. IEEE Transactions on Multimedia, 2022, 25: 1243-1255.
Miao Yu-qing, Lei Qing-qing, Zhang Wan-zhen, et al. Research on image sentiment analysis based on multi-visual object fusion[J]. Application Research of Computers, 2021, 38(4): 1250-1255.
Sheng Jia-chuan, Chen Ya-qi, Wang Jun, et al. Image sentiment classification via deep learning structure optimization[J]. Infrared and Laser Engineering, 2020, 49(11): 264-273.
[33]
Chen T, Borth D, Darrell T, et al. DeepSentiBank: visual sentiment concept classification with deep convolutional neural networks[J/OL]. [2021-10-25].
[34]
Zhu X, Li L, Zhang W, et al. Dependency exploitation: a unified CNN-RNN approach for visual emotion recognition[C]//Proceedings of the International Joint Conference on Artificial Intelligence, San Mateo, USA, 2017: 3595-3601.
[35]
Campos V, Salvador A, Giró-i-Nieto X, et al. Diving deep into sentiment: understanding fine-tuned CNNs for visual sentiment prediction[C]//Proceedings of the 1st International Workshop on Affect & Sentiment in Multimedia, New York, USA, 2015: 57-62.
[36]
Campos V, Jou B, Giró-i-Nieto X. From pixels to sentiment: fine-tuning CNNs for visual sentiment prediction[J]. Image and Vision Computing, 2017, 65(1): 15-22.
[37]
You Q, Luo J, Jin H, et al. Robust image sentiment analysis using progressively trained and domain transferred deep networks[C]//Proceedings of the AAAI Conference on Artificial Intelligence, Palo Alto, USA, 2015: 381-388.
[38]
Yang J, She D, Sun M. Joint image emotion classification and distribution learning via deep convolutional neural network[C]//Proceedings of the International Joint Conference on Artificial Intelligence, San Mateo, USA, 2017: 3266-3272.
Xu Bing-bing, Cen Ke-yan, Huang Jun-jie, et al. A survey on graph convolutional neural network[J]. Chinese Journal of Computers, 2020, 43(5): 755-780.
[41]
Chen T, Xu M, Hui X, et al. Learning semantic-specific graph representation for multi-label image recognition[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, Piscataway, USA, 2019: 522-531.
[42]
He T, Jin X. Image emotion distribution learning with graph convolutional networks[C]//Proceedings of the International Conference on Multimedia Retrieval, New York, USA, 2019: 382-390.
[43]
Zhou B, Khosla A, Lapedriza A, et al. Learning deep features for discriminative localization[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Los Alamitos, USA, 2016: 2921-2929.
[44]
Yang J, Sun M, Sun X. Label distribution learning via augmented conditional probability neural network[C]//Proceedings of the AAAI Conference on Artificial Intelligence, Palo Alto, USA, 2017: 224-230.
[45]
Peng K C, Chen T, Sadovnik A, et al. A mixed bag of emotions: model, predict, and transfer emotion distributions[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Piscataway, USA, 2015: 860-868.
[46]
Jia X, Li W, Liu J, et al. Label distribution learning by exploiting label correlations[C]//Proceedings of the AAAI Conference on Artificial Intelligence, Palo Alto, USA, 2018: 3310-3317.
[47]
Xiong H, Liu H, Zhong B, et al. Structured and sparse annotations for image emotion distribution learning[C]//Proceedings of the AAAI Conference on Artificial Intelligence, Palo Alto, USA, 2019: 363-370.