To address the slow part identification and low grasping success rate of industrial robots handling mechanical parts, an intelligent part identification and grasping method based on an SGV-YOLOv8 model was proposed. A monocular camera and a laser ranging module were combined into a depth vision detection device to achieve three-dimensional positioning of mechanical parts. Taking the YOLOv8 model as the basic architecture, the StarNet network replaced the original structure in the backbone, and the GSConv module and VoV-GSCSP structure were introduced into the neck, reducing model complexity and improving detection speed and grasping rate. The experimental results show that, compared with the original model, the parameter count and floating-point operations (GFLOPs) of the designed SGV-YOLOv8 decrease by 51.9% and 51% respectively, while the detection frames per second (FPS) increases by 37.6%. The success rate of part grasping on the constructed industrial robot grasping device is 80%.
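The three-dimensional positioning described above, combining a monocular camera with a single-point laser ranging module, can be sketched as pinhole back-projection: the detector supplies the pixel coordinates of the part, the laser supplies the depth, and the camera intrinsics convert the pixel into a camera-frame 3D point. The intrinsic values below are illustrative assumptions, not the calibrated parameters of the actual device.

```python
def pixel_to_camera_xyz(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with a laser-measured depth (metres)
    into camera-frame coordinates using the pinhole camera model."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return x, y, depth

# Illustrative intrinsics (assumed, not from the paper's calibration):
fx = fy = 1000.0        # focal lengths in pixels
cx, cy = 640.0, 360.0   # principal point for a 1280x720 image

# A part detected at the image centre lies on the optical axis:
print(pixel_to_camera_xyz(640.0, 360.0, 0.5, fx, fy, cx, cy))  # -> (0.0, 0.0, 0.5)
```

A point 100 pixels right of the principal point at 1 m depth maps to x = 100 × 1.0 / 1000 = 0.1 m, which is how the device turns a 2D detection plus one range reading into a grasp target.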
Network training in this experiment was performed on a Windows Server 2022 operating system with an Intel(R) Xeon(R) Gold 5118 CPU @ 2.30 GHz, 384 GB of memory, and an NVIDIA RTX 3090 GPU. The network was implemented in Python 3.8 on the Visual Studio 2022 platform, using PyTorch 2.1.0 as the deep learning framework with CUDA 12.1 for acceleration. The hyperparameter configuration of the model is listed in Table 1.
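A hyperparameter configuration of this kind can be sketched as follows. The values below are common YOLOv8 defaults used purely as assumptions; the paper's actual settings are those in Table 1, which are not reproduced here.

```python
# Illustrative training hyperparameters (assumed YOLOv8-style defaults,
# not the values from Table 1 of the paper):
hyp = {
    "epochs": 300,            # number of training epochs
    "batch": 16,              # batch size
    "imgsz": 640,             # input image size (pixels)
    "lr0": 0.01,              # initial learning rate
    "momentum": 0.937,        # SGD momentum
    "weight_decay": 0.0005,   # optimizer weight decay
    "device": 0,              # single GPU (the RTX 3090 above)
}

# With the Ultralytics framework, such a configuration would be applied as:
#   from ultralytics import YOLO
#   model = YOLO("sgv-yolov8.yaml")  # hypothetical model config file
#   model.train(data="parts.yaml", **hyp)
print(sorted(hyp))
```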
3.2.2 Evaluation Metrics
This study aims to reduce the computational cost and detection time of the model with almost no loss of detection accuracy, thereby lowering hardware requirements. Accordingly, the following six metrics were selected to evaluate the model: parameter count, floating-point operations (GFLOPs), precision, recall, mean average precision (mAP@0.5), and frames per second (FPS); the detailed formulas are omitted. FPS, parameter count, and floating-point operations are important measures of network size: the higher the FPS and the smaller the parameter count and floating-point operations, the lower the network's hardware requirements, making it better suited to practical applications and timely detection.
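The detection-quality metrics above can be made concrete with a small sketch: precision is the fraction of predicted detections that are correct, recall is the fraction of ground-truth objects that are found, and FPS is images processed per second. The counts in the example are hypothetical, not results from the paper.

```python
def precision(tp, fp):
    """Precision = TP / (TP + FP): fraction of predictions that are correct."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Recall = TP / (TP + FN): fraction of ground-truth objects detected."""
    return tp / (tp + fn)

def fps(num_images, total_seconds):
    """Frames per second over a timed detection run."""
    return num_images / total_seconds

# Hypothetical counts: 80 correct detections, 20 false alarms, 20 misses
print(precision(80, 20))   # -> 0.8
print(recall(80, 20))      # -> 0.8
print(fps(500, 10.0))      # -> 50.0
```

mAP@0.5 extends these two quantities: it is the area under the precision-recall curve at an IoU threshold of 0.5, averaged over classes.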