
Recognition of unstructured field road scene based on semantic segmentation model

Meng Qingkuan, Yang Xiaoxia, Zhang Man, Guan Haiou

Citation: Meng Qingkuan, Yang Xiaoxia, Zhang Man, Guan Haiou. Recognition of unstructured field road scene based on semantic segmentation model[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(22): 152-160. DOI: 10.11975/j.issn.1002-6819.2021.22.017

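Funding: National Natural Science Foundation of China (31571570, 62001329); Natural Science Foundation of Tianjin (18JCQNJC04500, 19JCQNJC01700); University-level Pre-research Project of Tianjin University of Technology and Education (KJ2009, KYQD1706)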

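  • Summary: Environmental information perception is one of the key technologies for autonomous navigation of intelligent agricultural equipment. Field roads are complex and changeable; identifying passable areas quickly and accurately and distinguishing obstacle categories provides a basis for efficient and safe path planning and decision control. This study takes unstructured agricultural field road scenes as its object, classifies environmental objects according to their dynamic and static attributes, and proposes a lightweight semantic segmentation model that combines channel attention with multi-scale feature fusion. First, the lightweight convolutional neural network MobileNetV2 extracts image features, and hybrid dilated convolution is integrated into the last 2 stages of the feature extraction network, enlarging the receptive field while preserving feature-map resolution and keeping the information continuous and complete. Then, a channel attention module recalibrates the feature channels at each stage of the feature extraction network according to their importance. Finally, a spatial pyramid pooling module fuses multi-scale pooled features to obtain more effective global scene context and improve recognition accuracy in complex road scenes. Segmentation experiments show that the model effectively recognizes and parses scene objects under different road environments, with pixel accuracy and mean pixel accuracy of 94.85% and 90.38%, respectively, demonstrating high accuracy and strong robustness. In comparison experiments on the same test set against FCN-8S, SegNet, DeeplabV3+, and BiseNet, the model achieves a mean intersection over union of 85.51%, a detection speed of 8.19 frames/s, and 2.41×10⁶ parameters. Compared with the other models it offers high accuracy, fast inference, and a small parameter count, striking a good balance between accuracy and speed. The results can serve as a technical reference for the safe and reliable operation of intelligent agricultural equipment in unstructured road environments.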
    Abstract: Environmental information perception is one of the most important technologies for autonomous navigation of agricultural machinery in tasks such as fertilization, crop disease detection, automatic harvesting, and cultivation. Field roads form a complex environment characterized by fuzzy road edges, uneven surfaces, and irregular shapes. Agricultural machinery must identify passable areas and obstacles accurately and rapidly in order to plan paths and make control decisions. In this study, a lightweight semantic segmentation model was proposed to recognize unstructured field roads using a channel attention mechanism combined with multi-scale feature fusion. Environmental objects were classified into 12 categories according to their static and dynamic properties: building, person, vehicles, sky, waters, plants, road, soil, pole, sign, coverings, and background. The mobile architecture MobileNetV2 was adopted to extract image features, reducing the number of model parameters and increasing inference speed. Specifically, an inverted residual structure with lightweight depth-wise convolutions was used to filter the features in the intermediate expansion layer. In addition, Hybrid Dilated Convolution (HDC) was integrated into the last two stages of the backbone network to enlarge the receptive field while maintaining the resolution of the feature map. Hybrid dilated convolution with dilation rates of 1, 2, and 3 expanded the receptive field effectively and alleviated the "gridding problem" caused by standard dilated convolution. A Channel Attention Block (CAB) was also introduced to reweight the features of each stage in order to enhance class consistency. The channel attention block strengthened both the higher-level and lower-level features of each stage for better prediction.
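The gridding effect mentioned above can be illustrated in one dimension. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a stack of 3-tap dilated convolutions and tracks which input offsets can reach a single output position. The function names `reachable_offsets` and `coverage` are hypothetical, and the constant-rate stack (2, 2, 2) is an assumed contrast case; only the rates 1, 2, 3 come from the abstract.

```python
def reachable_offsets(rates, kernel_size=3):
    """Input offsets that contribute to one output position after stacking
    1-D dilated convolutions with the given dilation rates (illustrative)."""
    half = kernel_size // 2
    offsets = {0}
    for rate in rates:
        # A dilated kernel samples positions spaced `rate` apart.
        taps = [k * rate for k in range(-half, half + 1)]
        offsets = {o + t for o in offsets for t in taps}
    return offsets


def coverage(rates, kernel_size=3):
    """Return (receptive field width, fraction of that width actually sampled)."""
    offsets = reachable_offsets(rates, kernel_size)
    width = max(offsets) - min(offsets) + 1
    return width, len(offsets) / width


hdc = coverage((1, 2, 3))      # dilation rates named in the abstract
uniform = coverage((2, 2, 2))  # hypothetical constant-rate stack for contrast
print("HDC (1, 2, 3):", hdc)
print("constant (2, 2, 2):", uniform)
```

Under these assumptions both stacks span 13 input positions, but the constant-rate stack only ever samples the even offsets (7 of 13), while the 1, 2, 3 stack touches every position in its receptive field.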
In addition, some semantic segmentation errors were partially or completely related to contextual relationships. A pyramid pooling module was therefore adopted to fuse feature maps at three scales as a global contextual prior. The first, image-level branch captured global context information, with its feature vector produced by global average pooling. The remaining pyramid levels divided the feature map into sub-regions and generated a pooled representation for each location. The outputs of the different pyramid levels therefore had different sizes; they were upsampled and concatenated to form the final output. The results showed that objects in complex road scenes were segmented effectively, with Pixel Accuracy (PA) and Mean Pixel Accuracy (MPA) of 94.85% and 90.38%, respectively. The per-category pixel accuracy exceeded 90% for road, plants, building, waters, sky, and soil, indicating high accuracy, strong robustness, and good generalization. To assess its efficiency, the model was also compared with other networks using mean intersection over union (MIoU), segmentation speed, and parameter count as indices. The FCN-8S, SegNet, DeeplabV3+, and BiseNet networks were trained and tested on the same datasets. The MIoU of the proposed model was 85.51%, higher than that of the other networks. The model had 2.41×10⁶ parameters, fewer than FCN-8S, SegNet, DeeplabV3+, and BiseNet. For an image with a resolution of 512×512 pixels, the inference speed of the model reached 8.19 frames per second, indicating a good balance between speed and accuracy. The lightweight semantic segmentation model can therefore segment multiple road scenes in the field environment accurately and rapidly.
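The pyramid pooling step described in this paragraph can be sketched on a single-channel map. This is a simplified illustration rather than the paper's module: the bin sizes (1, 2, 4), nearest-neighbour upsampling, and all function names are assumptions, and the per-level convolutions that such modules normally apply are omitted. Only the overall pattern (a global average level, sub-region levels, upsampling, concatenation) follows the abstract.

```python
def average_pool(fmap, bins):
    """Average a 2-D map over a bins x bins grid of equal sub-regions.
    Assumes height and width are divisible by `bins`."""
    h, w = len(fmap), len(fmap[0])
    bh, bw = h // bins, w // bins
    pooled = []
    for i in range(bins):
        row = []
        for j in range(bins):
            cells = [fmap[y][x]
                     for y in range(i * bh, (i + 1) * bh)
                     for x in range(j * bw, (j + 1) * bw)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def upsample_nearest(pooled, h, w):
    """Resize a bins x bins map back to h x w by nearest-neighbour lookup."""
    bins = len(pooled)
    return [[pooled[y * bins // h][x * bins // w] for x in range(w)]
            for y in range(h)]


def pyramid_pool(fmap, bin_sizes=(1, 2, 4)):
    """Return the input map plus one upsampled pooled map per pyramid level,
    as a list of channels (the concatenation step)."""
    h, w = len(fmap), len(fmap[0])
    channels = [fmap]
    for bins in bin_sizes:
        channels.append(upsample_nearest(average_pool(fmap, bins), h, w))
    return channels


# Toy 4 x 4 feature map with values 0..15 (illustrative data only).
fmap = [[float(4 * y + x) for x in range(4)] for y in range(4)]
out = pyramid_pool(fmap)
print(len(out), out[1][0][0], out[2][0][0])
```

The level with one bin is global average pooling, so every position in that channel carries the same image-level value; finer levels keep progressively more spatial detail.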
The findings can provide a technical reference for the safe and reliable operation of intelligent agricultural machinery on unstructured roads.
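The three accuracy indices reported in the abstract (PA, MPA, MIoU) are commonly computed from a class confusion matrix. The sketch below uses those common definitions; this page does not give the paper's exact formulas, so treat the code as an assumption. The function name `segmentation_metrics` and the 2-class toy matrix are hypothetical and are not the paper's data.

```python
def segmentation_metrics(conf):
    """Compute (PA, MPA, MIoU) from a square confusion matrix where
    conf[i][j] is the number of pixels of true class i predicted as class j."""
    n = len(conf)
    total = sum(sum(row) for row in conf)
    correct = sum(conf[i][i] for i in range(n))
    pa = correct / total  # pixel accuracy over all pixels

    per_class_acc, per_class_iou = [], []
    for i in range(n):
        truth = sum(conf[i])                           # pixels labelled i
        predicted = sum(conf[r][i] for r in range(n))  # pixels predicted as i
        union = truth + predicted - conf[i][i]
        if truth:   # skip classes absent from the ground truth
            per_class_acc.append(conf[i][i] / truth)
        if union:   # skip classes absent from both truth and prediction
            per_class_iou.append(conf[i][i] / union)

    mpa = sum(per_class_acc) / len(per_class_acc)
    miou = sum(per_class_iou) / len(per_class_iou)
    return pa, mpa, miou


# Hypothetical 2-class confusion matrix, for illustration only.
toy = [[8, 2],
       [1, 9]]
pa, mpa, miou = segmentation_metrics(toy)
print(f"PA={pa:.4f}  MPA={mpa:.4f}  MIoU={miou:.4f}")
```

MIoU penalizes both missed pixels and false positives for each class, which is why it is usually lower than PA and MPA on the same predictions.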
Publication history
  • Received:  2021-05-31
  • Revised:  2021-09-15
  • Published:  2021-11-14
