    Citation: Wen Changji, Lou Yue, Zhang Xiaoran, Yang Ce, Liu Shuyan, Yu Helong. Plant recognition method based on a improved dense CapsNet[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2020, 36(8): 143-155. DOI: 10.11975/j.issn.1002-6819.2020.08.018

    Plant recognition method based on an improved dense CapsNet

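    • Summary: Plant recognition is of great significance, but because plant species are numerous and large-scale datasets are difficult to annotate and build, plant species recognition remains a highly challenging fine-grained classification task. This study proposes an improved dense capsule network model for plant species recognition. First, a self-attention layer is introduced at the front of the network; by increasing the feature weights of the regions to be recognized in the feature map, it reduces the interference of background information with the recognition task. Second, a locally constrained dynamic routing algorithm is used between the capsule layers of the improved model, so that capsule routing and transformation-matrix sharing take place within local regions, which reduces the number of network parameters and the computational load of training. Results on the experimental datasets show that with input images of 32×32 pixels the model reaches an average recognition accuracy of 77.2% with only 1.8 M parameters, and with input images of 227×227 pixels it reaches an average recognition accuracy of 95.1% with only 5.2 M parameters. These results indicate that the proposed improved dense capsule network model brings a substantial improvement in both classification accuracy and parameter size.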
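    As a rough illustration of why sharing transformation matrices shrinks a capsule layer, the sketch below counts the transformation-matrix weights with and without sharing. All layer sizes here (a 6×6 grid, 32 capsule types, 17 output capsules, 8- and 16-dimensional capsules) are hypothetical values chosen for the example, not figures from the paper, and the sharing scheme shown (one matrix per capsule type, reused at every grid position) is only one simple way to share weights; the paper's local grid may be defined differently.

```python
# Hypothetical layer sizes, chosen only for illustration (not taken from the paper).

def full_routing_params(grid, types_in, caps_out, d_in, d_out):
    """One d_in x d_out transformation matrix per (input capsule, output capsule) pair."""
    n_in = grid * grid * types_in  # every grid position holds `types_in` capsules
    return n_in * caps_out * d_in * d_out

def shared_routing_params(types_in, caps_out, d_in, d_out):
    """One matrix per (input capsule type, output capsule) pair, reused at every grid position."""
    return types_in * caps_out * d_in * d_out

full = full_routing_params(grid=6, types_in=32, caps_out=17, d_in=8, d_out=16)
shared = shared_routing_params(types_in=32, caps_out=17, d_in=8, d_out=16)
print(f"unshared: {full} weights, shared: {shared} weights, ratio: {full // shared}x")
```

    Under these assumptions the saving equals the number of grid positions, so it grows with the spatial size of the feature map.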

       

      Abstract: The recognition of plants and other biological species is of great significance for maintaining plant species diversity, understanding plant growth characteristics and geographical distribution, constructing biodiversity databases, and enabling the rational development and utilization of plant resources. However, plant recognition and classification remain very challenging tasks. In this study, the classical capsule network and its modified models were applied to the fine-grained classification task of plant species recognition. Based on the idea of DCNet, a modified dense capsule network (Modified-DCNet) was proposed. Firstly, a self-attention mechanism was introduced as a network layer. By assigning high weights to the target features, it reduced the interference of background information with the recognition task. Secondly, a locally constrained dynamic routing algorithm was used between the capsule layers of the Modified-DCNet. By sharing the transformation matrix within a predefined local grid, it reduced the computational load of the network parameters and made the model better suited to training on small-sample datasets. To verify the model, three datasets were used: the Oxford Flower dataset, the Normal flower dataset in Northeast China, and the ImageCLEF 2013 leaf dataset. The Oxford Flower dataset is an open-source flower dataset released by the machine vision research group of Oxford University, consisting of 17 types of flowers common in the UK. Each category contains 80 images, for a total of 1 360 images. Variations in individual morphology, lighting, and scale across the images ensure the diversity of the samples, and the differences between some categories are small. The Normal flower dataset in Northeast China was built for this study. It is composed of common flowers in Northeast China, with 15 categories and a total of 1 360 images. 
The pictures were taken on site in suburbs, parks, and flower breeding bases under sunlight. The images were labeled and confirmed by experts. The ImageCLEF 2013 leaf dataset was supported by INRIA and CIRAD. Its main species were collected in the Mediterranean region of France. It contains 15 kinds of leaves, with a total of 1 125 leaf images. The sample images were collected both by scanning leaves and by taking pictures outdoors. The comparative experimental results showed that the average recognition accuracy of the Modified-DCNet proposed in this study was 77.2% on the three datasets when the input image size was 32 × 32 pixels. Compared with CapsNet, DCNet, and VGG16, the average recognition accuracy improved by 18.8%, 12.7%, and 25.2%, respectively. The parameter size was only about 1.8 M, which was only 1.3% of that of VGG16. When the input image size was 227 × 227 pixels, the average recognition accuracy of the model was 95.1%, an improvement of 25.5% and 8.6% over AlexNet and VGG16, respectively. The model parameter size was 5.2 M, which was only 8.6% of that of AlexNet and 3.7% of that of VGG16. Under the same conditions, the experimental results showed that the model outperformed AlexNet, VGG16, CapsNet, and DCNet. By using the locally constrained dynamic routing algorithm, the parameter size of the model was greatly reduced, making it more suitable for large-scale image classification and recognition. When the input image was 227 × 227 pixels, the model parameter size was only 1.1% of that of CapsNet and 1.3% of that of DCNet. When the input image was 32 × 32 pixels, the parameter size was 21.9% of that of CapsNet and 26% of that of DCNet. The larger the input image, the greater the relative reduction in parameter size. Meanwhile, larger images usually carry more information, so the recognition accuracy was higher. 
Furthermore, the experimental results on the three datasets showed that the highest recognition accuracy, 97.2%, was obtained on the ImageCLEF 2013 leaf dataset. This suggests that low sample complexity leads to a high recognition rate. In addition, analysis of the experimental results showed that the main distinguishing features among the flower datasets were color features, followed by morphological features. When the color and morphological features of a category in a dataset were relatively uniform, the recognition accuracy was higher.
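
      The relative parameter sizes quoted for the 227 × 227 pixel input can be turned back into approximate absolute sizes for the comparison models. The short calculation below is only a back-of-the-envelope check using the rounded percentages from the abstract; the variable names are illustrative, and the implied sizes are estimates rather than values reported by the authors.

```python
# Reported parameter size of the proposed model at 227 x 227 input, in millions.
params_m = 5.2

# Reported ratio of the proposed model's size to each comparison model's size.
reported_share = {"AlexNet": 0.086, "VGG16": 0.037, "CapsNet": 0.011, "DCNet": 0.013}

# Implied size of each comparison model = proposed size / reported ratio.
implied = {name: round(params_m / share, 1) for name, share in reported_share.items()}

for name, size in implied.items():
    print(f"{name}: about {size} M parameters")
```

      The implied sizes for AlexNet and VGG16 (roughly 60 M and 140 M) are in line with the commonly cited sizes of those networks, while the much larger implied sizes for CapsNet and DCNet reflect how quickly unshared capsule routing grows with input resolution.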

       
