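In 2016, Redmon et al. [9] developed YOLO (you only look once), a single-stage detection algorithm. It treats detection as a single regression problem: the whole image is taken as input, and object locations and classes are regressed directly at the output layer. Because the YOLO model is end-to-end, it classifies objects at the same time as it generates candidate boxes, which greatly reduces time cost and computational resources. YOLOv5 is, at the time of writing, the latest object detection algorithm in the YOLO series. It comes in four versions that differ in network depth and width: YOLOv5s, YOLOv5m, YOLOv5l, and YOLOv5x. It inherits the strengths of its predecessors while also optimizing the backbone network, which improves detection accuracy.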
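As a rough illustration of how the four YOLOv5 versions can share one architecture and differ only in depth and width, the sketch below scales a base layer configuration by a pair of multipliers. The multiplier values are those commonly published in the YOLOv5 model configuration files, not figures given in this paper. The helper names `scaled_repeats` and `scaled_channels`, and the base values of 9 block repeats and 512 channels, are hypothetical and chosen only for illustration.

```python
import math

# (depth_multiple, width_multiple) per variant. Assumption: values taken from
# the publicly released YOLOv5 model YAML files, not from this paper.
SCALES = {
    "YOLOv5s": (0.33, 0.50),
    "YOLOv5m": (0.67, 0.75),
    "YOLOv5l": (1.00, 1.00),
    "YOLOv5x": (1.33, 1.25),
}


def scaled_repeats(base_repeats: int, depth_multiple: float) -> int:
    """Scale how many times a block is repeated, keeping at least one copy."""
    return max(round(base_repeats * depth_multiple), 1)


def scaled_channels(base_channels: int, width_multiple: float, divisor: int = 8) -> int:
    """Scale a channel count and round it up to a multiple of `divisor`."""
    return math.ceil(base_channels * width_multiple / divisor) * divisor


# Hypothetical base layer: a block repeated 9 times with 512 output channels.
for name, (depth, width) in SCALES.items():
    print(name, scaled_repeats(9, depth), scaled_channels(512, width))
```

Under these assumptions, deeper variants repeat each block more often and wider variants use more channels per layer, which is how accuracy is traded against model size and speed across the four versions.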
[1]
GUITAR B. Stuttering: an integrated approach to its nature and treatment[M]. Baltimore, MD: Williams & Wilkins, 1998.
[2]
SALTUKLAROGLU T, KALINOWSKI J. How effective is therapy for childhood stuttering? Dissecting and reinterpreting the evidence in light of spontaneous recovery rates[J]. International Journal of Language & Communication Disorders, 2005, 40(3): 359-374.
[3]
HOWELL P, HAMILTON A, KYRIACOPOULOS A. Automatic recognition of repetitions and prolongations in stuttered speech[C]//Proceedings of the First World Congress on Fluency Disorders. Nijmegen, The Netherlands: University Press Nijmegen, 1995, 2: 372-374.
[4]
RAVIKUMAR K M, KUDVA S, RAJAGOPAL R, et al. Development of a procedure for the automatic recognition of disfluencies in the speech of people who stutter[C]//International Conference on Advanced Computing Technologies, Hyderabad, India. 2008: 514-519.
[5]
CHEE L S, AI O C, YAACOB S. Overview of automatic stuttering recognition system[C]//Proceedings of the International Conference on Man-Machine Systems, Batu Ferringhi, Penang, Malaysia. 2009: 1-6.
[6]
ALHARBI S, HASAN M, SIMONS A J H, et al. A lightly supervised approach to detect stuttering in children's speech[C]//Proceedings of Interspeech 2018. ISCA, 2018: 3433-3437.
[7]
KOURKOUNAKIS T, HAJAVI A, ETEMAD A. Detecting multiple speech disfluencies using a deep residual network with bidirectional long short-term memory[C]//IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2020: 6089-6093.
[8]
AMIR O, SHAPIRA Y, MICK L, et al. The speech efficiency score (SES): a time-domain measure of speech fluency[J]. Journal of Fluency Disorders, 2018, 58: 61-69.
[9]
REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 779-788.
[10]
WANG Q, WU B, ZHU P, et al. ECA-Net: efficient channel attention for deep convolutional neural networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2020: 11534-11542.
[11]
REZATOFIGHI H, TSOI N, GWAK J Y, et al. Generalized intersection over union: a metric and a loss for bounding box regression[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 658-666.
[12]
ZHENG Z, WANG P, LIU W, et al. Distance-IoU loss: faster and better learning for bounding box regression[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(7): 12993-13000.
[13]
HOWELL P, DAVIS S, BARTRIP J. The University College London Archive of Stuttered Speech (UCLASS)[J]. Journal of Speech, Language, and Hearing Research, 2009, 52(2): 556-569.
[14]
American Speech-Language-Hearing Association. Childhood fluency disorders[EB/OL]. (2020)[2021-10-19].