Abstract:
Obstacles pose a major challenge to unmanned working vessels in crab ponds: an obstacle that appears on the vessel's route can easily cause a collision. Rapid identification and localization of obstacles in the crab pond are therefore necessary to improve the efficiency and safety of the unmanned working vessel. In this study, an improved YOLOv5s model was proposed to detect obstacles in the crab pond, and the detected obstacles were then located using a depth camera. Firstly, the lightweight network ShuffleNetV2 was used as the backbone feature extraction network; its depthwise separable convolution and channel shuffle strategies greatly reduced the model size and accelerated detection while maintaining high accuracy. Secondly, the SE attention mechanism was introduced to enhance the feature perception of obstacles in the crab pond without appreciably increasing the amount of computation. Thirdly, the SPPF module was upgraded to the SPPFCSPC module, which extracts target features under different receptive fields and thus enhances the detection of obstacles of different scales. Finally, the SIoU loss function was adopted to accelerate model convergence and further improve accuracy. Obstacles were detected in the color images captured by a RealSense D435i depth camera, and the pixel coordinates of each obstacle's center point were transformed into the crab pond coordinate system by coordinate conversion; the width of the obstacle was obtained at the same time. The confusion matrix of the model on the test set showed that the improved model recognized the pole, crab trap, and aerator well. Compared with the original YOLOv5s, the number of parameters and the amount of computation of the improved model were reduced by about 62.8% and 80.0%, respectively.
The model size was only 5.5 MB, a reduction of about 61.8%. The detection speed increased by 44.5%, with an inference time of 15.2 ms per image, and the mean average precision reached 93.3% for obstacles in the crab pond. Compared with YOLOv5s-MobileNetV2, YOLOv5s-GhostNet, YOLOv7, and YOLOv8, the improved model achieved the best balance among parameter count, computation amount, detection speed, and detection accuracy. A series of positioning and width measurement tests were conducted to verify the localization accuracy of the improved model. Three typical obstacles were located in a crab pond at the Fishery Science and Technology Demonstration Base in Jintan District, Changzhou, Jiangsu Province. Over the range of 2-10 m, the average absolute error and average relative error of the measured distance between the three types of obstacles and the camera were 0.16 m and 2.26%, respectively, and the maximum absolute error and maximum relative error were 0.35 m and 3.74%, respectively. The width measurement errors of the pole, crab trap, and aerator were concentrated in the ranges of 0 to 0.05 m, -0.05 to 0.13 m, and 0.01 to 0.19 m, respectively. The average errors were 0.03, 0.05, and 0.10 m, and the average relative errors were 34.1%, 7.5%, and 5.0%, respectively. Finally, the depth camera was fixed at the front of the unmanned working vessel, and the three types of obstacles in the crab pond were detected repeatedly to verify the detection and positioning performance of the model in a real crab pond environment. The improved model successfully detected the obstacles and output their category, confidence, three-dimensional coordinates, and width. In summary, the improved model can meet the requirements for detecting and locating common obstacles in the crab pond environment.
These findings can provide a reference for the autonomous obstacle avoidance and cruise operation of unmanned working vessels.
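
As an illustration of the coordinate conversion and width measurement summarized above, the following is a minimal sketch based on the standard pinhole camera model. It is not the authors' implementation. The intrinsic parameters, the example detection, and the names `Intrinsics`, `deproject`, and `box_width` are hypothetical, lens distortion is ignored, and the result is expressed in the camera frame; a further transformation into the crab pond coordinate system would require extrinsic parameters that the abstract does not give.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intrinsics:
    """Pinhole camera intrinsics in pixels (hypothetical values are used below)."""
    fx: float  # focal length along the image x axis
    fy: float  # focal length along the image y axis
    cx: float  # principal point, x
    cy: float  # principal point, y


def deproject(intr, u, v, depth_m):
    """Map pixel (u, v) with depth Z in metres to camera-frame (X, Y, Z)."""
    x = (u - intr.cx) * depth_m / intr.fx
    y = (v - intr.cy) * depth_m / intr.fy
    return (x, y, depth_m)


def box_width(intr, u_left, u_right, depth_m):
    """Approximate metric width of a bounding box assumed to lie at one depth."""
    return (u_right - u_left) * depth_m / intr.fx


# Hypothetical intrinsics for a 640x480 color stream, not taken from the paper.
intr = Intrinsics(fx=600.0, fy=600.0, cx=320.0, cy=240.0)

# Hypothetical detection: box spanning columns 290 to 350, centered at
# pixel (320, 240), with a depth reading of 5 m at the center point.
center = deproject(intr, 320.0, 240.0, 5.0)
width = box_width(intr, 290.0, 350.0, 5.0)
print(center, width)
```

In this sketch the width estimate scales linearly with the depth reading, so any depth error carries over proportionally to the width, and for a narrow object a deviation of a pixel or two in the box edges is already a noticeable fraction of the width. This may help explain why a thin object such as a pole can show a small absolute width error but a large relative one.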