1. College of Wildlife and Protected Area, Northeast Forestry University, National Forestry and Grassland Administration Feline Research Center, Harbin 150040, China
2. College of Computer and Control Engineering, Northeast Forestry University, Harbin 150040, China
Northeast China holds a significant position in the country's biological resources. Because winters in the region are long and cold with abundant snowfall, field researchers often rely on snow footprints to identify and monitor wildlife during the winter months. However, identifying species from footprints requires highly qualified personnel and remains challenging when footprints are visually similar. Developing a simple, efficient, and accurate method to identify wildlife species from snow footprint images is therefore crucial for wildlife monitoring in this region. This study proposed a method for identifying wild animal species from snow footprints using deep learning. Six mammalian species in northeast China were studied, and a snow footprint dataset was created by manually collecting and processing footprint images in the field. The instance segmentation model SOLOv2 was selected as the segmentation network to automatically extract the contour features of snow footprints, and its segmentation accuracy and effectiveness were examined for each species. The Swin-Transformer-Tiny model was selected as the classification network to automatically classify and recognize the contour features of snow footprint images, and the classification accuracy for each species was examined. The results indicate that the proposed snow footprint recognition and classification method is reliable. When the classification network is trained on manually annotated snow footprint contours, automatic species classification reaches an accuracy of 89.9%. In the full snow footprint recognition network, where the classification network is trained on the outputs of the segmentation network and recognition is fully automated, the classification accuracy reaches 85.3%. Overall, by combining snow footprint images with deep learning techniques, a snow footprint recognition network is constructed from a segmentation network and a classification network, achieving automated species classification and recognition from animal snow footprint images. The approach has promising application prospects in winter and provides a simple and efficient method for monitoring and protecting wildlife in northern China.
In recent years, with the growth of computing power and breakthroughs in deep learning theory, deep-learning-based wildlife identification has shown clear advantages over traditional methods: it extracts image features efficiently and accurately without manual intervention. Deep-learning-based image detection and recognition techniques have been widely applied to animal species identification [11]. For example, Wang Wencheng et al. [12] classified ten fish species with a ResNet50 framework and reached a recognition accuracy of 93.33%; Norouzzadeh et al. [13] used the Snapshot Serengeti dataset (currently the largest annotated wildlife dataset) and deep learning classification algorithms to compare nine deep learning models on a recognition task covering 48 wildlife species, among which the ResNet152 architecture performed best with an accuracy of 93.8%; Guo et al. [14] identified 41 primate species with the Tri-AI technique, reaching an accuracy of 94.1%; Qi Jiandong et al. [15] performed species identification on wildlife images from the Miyun area with an improved BS-ResNeXt-50 model, with a highest per-class accuracy of 98.6%.
Building on SOLO, SOLOv2 decouples the mask branch into a kernel-learning branch and a feature-learning branch. The kernel-learning branch generates dynamic convolution kernel weights for each instance from the feature maps, while the feature-learning branch extracts instance feature maps suitable for mask generation; the two operate jointly to produce instance masks. As shown in Fig. 5, in the segmentation pipeline the raw snow footprint image first passes through a ResNet feature extraction network and an FPN to generate multi-level feature maps I. The classification branch applies convolutions to each level of I to produce the instance category probability distribution of each grid cell. In the mask branch, the feature maps I are fed to the kernel-learning branch to generate dynamic convolution kernel weights (of size S × S × D, where S is the number of grid cells per side and D is the total number of parameters of a dynamic kernel). At the same time, the feature maps I are progressively upsampled and fused with learned weights by the feature-learning branch to generate a single high-resolution feature map (at 1/4 the resolution of the original image). Finally, the output of the feature-learning branch is convolved with the dynamic kernels to generate the segmentation masks of the corresponding instances [25].
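To make this last dynamic-convolution step concrete, the following minimal PyTorch sketch (illustrative only, not the authors' implementation; the function name, tensor shapes, and the values S = 40 and D = 256 are assumptions) shows how the kernel-learning branch's predicted weights are applied as 1 × 1 dynamic convolutions to the feature-learning branch's unified feature map, yielding one soft mask per grid cell.

import torch
import torch.nn.functional as F

def solo_v2_dynamic_masks(mask_feat, kernel_pred):
    """Combine the two mask-branch outputs into instance masks.

    mask_feat:   (1, D, H/4, W/4) unified feature map from the feature-learning branch.
    kernel_pred: (S*S, D) dynamic 1x1 convolution weights from the kernel-learning
                 branch, one kernel per grid cell.
    Returns:     (1, S*S, H/4, W/4) soft masks, one per grid cell.
    """
    num_inst, channels = kernel_pred.shape
    # Reshape each predicted kernel into a 1x1 convolution filter: (out, in, 1, 1).
    weight = kernel_pred.view(num_inst, channels, 1, 1)
    # Dynamic convolution: every grid cell's kernel is applied to the shared
    # feature map, then a sigmoid turns the logits into soft masks.
    return F.conv2d(mask_feat, weight).sigmoid()

# Toy shapes: S = 40 grid cells per side, D = 256 channels, and a 512 x 512 input
# image, so the mask feature map is 128 x 128 (1/4 of the original resolution).
mask_feat = torch.randn(1, 256, 128, 128)
kernel_pred = torch.randn(40 * 40, 256)
print(solo_v2_dynamic_masks(mask_feat, kernel_pred).shape)  # torch.Size([1, 1600, 128, 128])

In SOLOv2 the predicted kernels may also be larger than 1 × 1, in which case D is the kernel's full parameter count; the 1 × 1 case shown here keeps D equal to the number of feature channels.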
3.3 Snow footprint image classification network
Swin-Transformer-Tiny is used to build the snow footprint image recognition model; its network structure is shown in Fig. 6. First, the input H × W × 3 image is partitioned into N (N = HW/P²) non-overlapping patches, each of size P × P × 3. Next, each patch is converted by a linear projection into a high-dimensional feature, referred to as a patch token, and then transformed by a linear embedding layer into a C-dimensional vector. These vectors are fed into Swin-Transformer blocks for local feature extraction and cross-window information interaction (stage 1). Patch merging modules then progressively merge tokens, reducing the feature resolution while increasing the channel dimension. The whole network consists of four stages (stage 1 to stage 4); each stage extracts deeper semantic information at progressively lower resolution, finally forming a multi-scale representation [17].
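As a concrete illustration of the patch-partition and linear-embedding step, the following minimal PyTorch sketch (not the authors' code; the class name is illustrative, and the values P = 4 and C = 96 are assumptions taken from the standard Swin-Transformer-Tiny configuration) shows how an H × W × 3 image becomes N = HW/P² tokens of dimension C before entering the stage-1 Swin-Transformer blocks.

import torch
import torch.nn as nn

class PatchEmbed(nn.Module):
    """Patch partition + linear embedding at the front of Swin-Transformer-Tiny."""
    def __init__(self, patch_size=4, in_chans=3, embed_dim=96):
        super().__init__()
        # A P-strided P x P convolution is equivalent to flattening each
        # non-overlapping P x P x 3 patch and applying a shared linear layer.
        self.proj = nn.Conv2d(in_chans, embed_dim,
                              kernel_size=patch_size, stride=patch_size)

    def forward(self, x):                      # x: (B, 3, H, W)
        x = self.proj(x)                       # (B, C, H/P, W/P)
        return x.flatten(2).transpose(1, 2)    # (B, N, C), N = H*W / P**2

tokens = PatchEmbed()(torch.randn(1, 3, 224, 224))
print(tokens.shape)  # torch.Size([1, 3136, 96]): N = 224*224 / 4**2 = 3136 tokens

Each subsequent patch-merging stage halves the token-map resolution and doubles C, so the stage-4 output is a compact multi-scale representation that a final linear head can map to the six footprint classes studied here.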
The instance segmentation evaluation metrics of the COCO dataset, average precision (AP) and mean average precision (mAP) [30], are used to evaluate the accuracy of snow footprint segmentation, and the frame rate is used to evaluate segmentation speed. AP measures the model's segmentation performance at a given intersection-over-union (IoU) threshold and reflects the balance between precision (P) and recall (R), where IoU is the intersection-over-union between the annotated mask region and the model's predicted result. Precision is the proportion of correctly predicted positives among all samples predicted as positive; recall is the proportion of correctly predicted positives among all actual positives, i.e., the detection rate of positives. Precision and recall are defined as follows.
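In the usual notation, with TP, FP, and FN the numbers of true positives, false positives, and false negatives (these symbols are supplied here for completeness; the text above gives the definitions only in prose):

P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}

mAP is then the mean of AP over the K footprint classes:

\mathrm{mAP} = \frac{1}{K} \sum_{k=1}^{K} AP_k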
XIAO W H, ZHOU Q S, ZHU C D, et al. Advances in techniques and methods of wildlife monitoring[J]. Chinese Journal of Plant Ecology, 2020, 44(4): 409-417.
[3] 罗旭. 野生动物的痕迹识别[J]. 生命世界, 2019(6): 60-67.
[4] LUO X. Trace recognition of wild animals[J]. Life World, 2019(6): 60-67.
[5] ALIBHAI S, JEWELL Z, EVANS J. The challenge of monitoring elusive large carnivores: an accurate and cost-effective tool to identify and sex pumas (Puma concolor) from footprints[J]. PLoS One, 2017, 12(3): e0172065.
[6] JEWELL Z C, ALIBHAI S K, LAW P R. Censusing and monitoring black rhino (Diceros bicornis) using an objective spoor (footprint) identification technique[J]. Journal of Zoology, 2001, 254(1): 1-16.
[7] ELBROCH L M, JANSEN B D, GRIGIONE M M, et al. Trailing hounds vs foot snares: comparing injuries to pumas Puma concolor captured in Chilean Patagonia[J]. Wildlife Biology, 2013, 19(2): 210-216.
[8] PIMM S L, ALIBHAI S, BERGL R, et al. Emerging technologies to conserve biodiversity[J]. Trends in Ecology & Evolution, 2015, 30(11): 685-696.
[9] JEWELL Z C, ALIBHAI S K, WEISE F, et al. Spotting cheetahs: identifying individuals by their footprints[J]. Journal of Visualized Experiments, 2016(111): e54034.
[10] ALIBHAI S K, JEWELL Z C, LAW P R. A footprint technique to identify white rhino Ceratotherium simum at individual and species levels[J]. Endangered Species Research, 2008, 4: 205-218.
[11] 顾佳音. 东北虎雪地足迹个体识别技术研究[D]. 哈尔滨: 东北林业大学, 2013.
[12] GU J Y. Research on Amur tiger (Panthera tigris altaica) individual identification from snow footprints[D]. Harbin: Northeast Forestry University, 2013.
[13] LI B V, ALIBHAI S, JEWELL Z, et al. Using footprints to identify and sex giant pandas[J]. Biological Conservation, 2018, 218: 83-90.
[14] ZUERL M, STOLL P, BREHM I, et al. Automated video-based analysis framework for behavior monitoring of individual animals in zoos using deep learning: a study on polar bears[J]. Animals, 2022, 12(6): 692.
WANG W C, JIANG H, QIAO Q, et al. Research on image classification and recognition of ten fish species based on ResNet50 network[J]. Rural Economy and Science-Technology, 2019, 30(19): 60-62.
[17] NOROUZZADEH M S, NGUYEN A, KOSMALA M, et al. Automatically identifying, counting, and describing wild animals in camera-trap images with deep learning[J]. Proceedings of the National Academy of Sciences of the United States of America, 2018, 115(25): E5716-E5725.
[18] GUO S T, XU P F, MIAO Q G, et al. Automatic identification of individual primates with deep learning techniques[J]. iScience, 2020, 23(8): 101412.
SUN J X, XIANG F Y, GUAN Y H, et al. Investigation on mammal and bird diversity in Heilongjiang Xiaobeihu National Nature Reserve by infrared camera[J]. Journal of Anhui Agricultural Sciences, 2023, 51(6): 106-109.
KONG W Y, SUN Q, LIU X X, et al. Population dynamic of Far Eastern leopard (Panthera pardus orientalis) in Wangqing Nature Reserve based on infrared camera monitoring[J]. Scientia Silvae Sinicae, 2019, 55(5): 188-196.
[29] TORRALBA A, RUSSELL B C, YUEN J. LabelMe: online image annotation and applications[J]. Proceedings of the IEEE, 2010, 98(8): 1467-1484.
[30] LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 21-26, 2017. Honolulu: IEEE, 2017: 2117-2125.
[31] HE K M, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]//2017 IEEE International Conference on Computer Vision (ICCV), October 22-29, 2017. Venice: IEEE, 2017: 2961-2969.
[32] CHEN Y P, DAI X Y, LIU M C, et al. Dynamic convolution: attention over convolution kernels[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 13-19, 2020. Seattle: IEEE, 2020: 11030-11039.
ZHUANG Q W, WANG Z M, WU L Y, et al. Image segmentation method of plug seedlings based on improved SOLOv2[J]. Journal of Nanjing Agricultural University, 2023, 46(1): 200-209.
[35] FANG Y X, YANG S S, WANG X G, et al. Instances as queries[C]//2021 IEEE/CVF International Conference on Computer Vision (ICCV), October 10-17, 2021. Montreal: IEEE, 2021: 6910-6919.
[36] LIU Z, MAO H Z, WU C Y, et al. A ConvNet for the 2020s[C]//2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 13-24, 2022. New Orleans: IEEE, 2022: 11976-11986.
[37] MEHTA S, RASTEGARI M. Separable self-attention for mobile vision transformers[EB/OL]. arXiv (2022-06-06)[2024-04-09].
[38] HOWARD A, SANDLER M, CHEN B, et al. Searching for MobileNetV3[C]//2019 IEEE/CVF International Conference on Computer Vision (ICCV), October 27 to November 2, 2019. Seoul: IEEE, 2019: 1314-1324.
[39] LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]//FLEET D, PAJDLA T, SCHIELE B, et al. Computer Vision - ECCV 2014: Lecture Notes in Computer Science, Vol. 8693. Cham: Springer, 2014: 740-755.
PIKUNOV D G, MIQUELLE D G, DUNISHENKO Y M, et al. A field guide to animal tracks of the Far East[M]. LI B, trans. Harbin: Northeast Forestry University Press, 2008: 49.