This paper proposes a cross-age face recognition model that integrates CNN and Transformer architectures. The model first extracts comprehensive facial features with a depthwise separable T2T-ViT network, then nonlinearly separates age and identity features with a multi-scale attention decomposition module, and finally constrains the feature decomposition through mutual information minimization, a cross-entropy loss, and the ArcFace loss. The proposed model achieves accuracies of 94.97%, 99.51%, and 95.81% on three benchmark datasets, FG-NET, CACD_VS, and CALFW, respectively, approaching or even surpassing state-of-the-art (SOTA) performance. These results indicate that the model extracts facial information comprehensively and separates age and identity features effectively, which accounts for its strong recognition performance.
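The abstract names the ArcFace function as one of the constraints on the identity features but gives no formula. As a rough illustration only, the sketch below computes the standard additive angular margin loss for a single sample in plain Python. The function names `l2_normalize` and `arcface_loss`, the toy inputs, and the defaults `scale=64.0` and `margin=0.5` (the values commonly used in the ArcFace literature) are assumptions for this example, not settings reported in this paper.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosines."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def arcface_loss(embedding, class_weights, label, scale=64.0, margin=0.5):
    """Additive angular margin loss for one sample (illustrative sketch).

    embedding:     identity feature vector (hypothetical input)
    class_weights: one weight vector per identity class
    label:         index of the ground-truth identity
    scale, margin: assumed hyperparameters, not taken from the paper
    """
    e = l2_normalize(embedding)
    logits = []
    for j, w in enumerate(class_weights):
        w = l2_normalize(w)
        # Cosine of the angle between the feature and the class centre.
        cos_theta = sum(a * b for a, b in zip(e, w))
        cos_theta = max(-1.0, min(1.0, cos_theta))  # guard acos domain
        if j == label:
            # Add the margin to the target-class angle only.
            # Note: this sketch does not handle theta + margin > pi,
            # which practical implementations treat as a special case.
            cos_theta = math.cos(math.acos(cos_theta) + margin)
        logits.append(scale * cos_theta)
    # Numerically stable cross-entropy over the margin-adjusted logits.
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum - logits[label]


if __name__ == "__main__":
    feature = [1.0, 0.0]
    centres = [[1.0, 0.0], [0.0, 1.0]]
    print(arcface_loss(feature, centres, label=0, scale=1.0, margin=0.0))
    print(arcface_loss(feature, centres, label=0, scale=1.0, margin=0.5))
```

With `margin=0.0` the function reduces to ordinary softmax cross-entropy on cosine logits; a positive margin raises the loss for the same input, which is what pushes features of the same identity closer to their class centre during training.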