Fine-grained images are complex and diverse, and traditional image classification methods pay limited attention to fine-grained attributes and perform poorly on imbalanced datasets. To address these problems, a threshold-based fine-grained image classification algorithm using deep metric learning was proposed. A metric learning approach was introduced to strengthen the focus on fine-grained image attributes, and pairwise loss and proxy loss were combined to improve classification accuracy and accelerate model convergence. To address data imbalance, a classifier based on threshold analysis was designed. This classifier performs multi-level classification of fine-grained images, which alleviates the low classification accuracy of some categories in imbalanced datasets. Experimental results show that the proposed algorithm achieves higher classification accuracy than the compared methods.
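The idea of a threshold-based classifier operating on a learned embedding space can be illustrated with a minimal sketch. The abstract does not specify the decision rule, so everything below is an assumption made for illustration: the function names, the use of cosine similarity to one representative vector (proxy) per class, and the rule of picking the class whose similarity exceeds its own threshold by the largest margin are all hypothetical and are not taken from the proposed algorithm.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def threshold_classify(embeddings, proxies, thresholds):
    """Assign each embedding to a class using per-class similarity thresholds.

    embeddings: (n, d) array from a metric-learning backbone (assumed given).
    proxies:    (c, d) array, one representative vector per class (assumed).
    thresholds: (c,) array of per-class thresholds (hypothetical; e.g. tuned
                on validation data, lower for under-represented classes).
    """
    sims = l2_normalize(embeddings) @ l2_normalize(proxies).T  # shape (n, c)
    margins = sims - thresholds  # positive where a class "accepts" the sample
    # Pick the class whose similarity clears its own threshold by the most.
    return np.argmax(margins, axis=1), margins


# Two classes; class 1 is treated as the minority and gets a lower threshold.
proxies = np.eye(2)
thresholds = np.array([0.9, 0.5])
samples = np.array([[0.8, 0.6], [1.0, 0.0]])
labels, margins = threshold_classify(samples, proxies, thresholds)
print(labels)
```

In this toy setup the first sample is slightly closer to class 0 but does not clear that class's stricter threshold, so it is assigned to class 1. This is one way per-class thresholds could shift decisions toward under-represented categories; the actual multi-level scheme may differ.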