Citation: Bi Shuhui, Li Xue, Shen Tao, Xu Yuan, Ma Liyao. Apple classification based on evidence theory and multiple models[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2022, 38(13): 141-149. DOI: 10.11975/j.issn.1002-6819.2022.13.016

    Apple classification based on evidence theory and multiple models

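    • Summary: Internal quality is one of the main criteria for grading fruit, so rapid, non-destructive detection of the internal quality of apples by near-infrared spectroscopy is of considerable practical importance. To improve the prediction accuracy of near-infrared classification models, and to address the poor applicability of a single prediction model and the classification uncertainty caused by hard partitioning, this study took Yantai Red Fuji apples as the research object. Near-infrared spectra and soluble solids content (SSC) were collected with a self-developed online non-destructive fruit testing system. Apple prediction and classification models were built with partial least squares (PLS) and an extreme learning machine (ELM). A triangular mass function generation method was proposed based on the distance between the predicted SSC value and the classification boundaries. The mass functions were fused with Dempster's combination rule from evidence theory, thereby fusing the two models, and the effect of this fusion model on prediction accuracy was examined. The results showed a classification accuracy of 92.25% for the PLS model and 93.80% for the ELM model, whereas the proposed multi-model fusion method reached 95.35%. Moreover, the proposed triangular mass function generation method is more realistic than mass functions generated by hard partitioning. The confusion matrices of the PLS, ELM, and DS fusion models show that the fusion model correctly classified apples whose SSC lay at a classification boundary, and the number of misclassified apples decreased in all three classes. The proposed multi-model evidence fusion method not only improved prediction accuracy but also better expressed the uncertainty of the class-label prediction, providing a research basis for online non-destructive grading of apples.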

       

      Abstract: Soluble solids content (SSC) has been widely used in recent years to predict apple grades, which makes a non-destructive model that maps near-infrared spectra to SSC necessary. However, a single classification model limits prediction accuracy and applicability, particularly when the instance space is hard-partitioned. In this study, a multi-model fusion method based on evidence theory was proposed to address this issue. Evidence theory provides a flexible framework for representing and reasoning with uncertainty, and it was adopted here to aggregate the predictions of two models. A mass function generation method was also designed for the fusion model so that the classification knowledge carried by the predicted SSC was better represented, which led to more accurate classification. To improve prediction performance, information collection, data pre-processing, and spectral feature selection were carried out first. Specifically, the near-infrared spectra of the apple instances were collected with the self-developed WY-6100 online non-destructive fruit testing system, and the SSC was measured using physicochemical methods. A total of 439 Red Fuji apple instances were collected from Yantai City, Shandong Province, China. The instances were randomly divided into training and test sets at a 7:3 ratio: 307 randomly selected instances were used for model training and the remaining 132 for performance testing. Several procedures were applied to pre-treat the near-infrared spectra that serve as model inputs. Anomalous instances were eliminated using Principal Component Analysis-Mahalanobis distance (PCA-MD).
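The 7:3 split described above can be checked with a short sketch. The helper name `split_train_test`, the seed, and the use of rounding to size the training set are assumptions made for illustration; the abstract does not say how the authors implemented the split.

```python
import random


def split_train_test(n_samples, train_fraction=0.7, seed=0):
    """Return shuffled (train, test) index lists for a random split."""
    indices = list(range(n_samples))
    # A seeded generator keeps the illustrative split reproducible.
    random.Random(seed).shuffle(indices)
    # Assumption: the training-set size is the rounded fraction of the total.
    n_train = round(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_train_test(439)
print(len(train_idx), len(test_idx))  # 307 132
```

With 439 instances, rounding 0.7 × 439 = 307.3 gives the 307 training and 132 test instances reported in the abstract.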
Savitzky-Golay convolutional smoothing was used to remove noise caused by the equipment, environment, and other external factors. Baseline drift was reduced by the standard normal variate (SNV) transformation. Characteristic wavelengths were then extracted from the spectra with a genetic algorithm (GA) in order to preserve the most useful information. Partial least squares (PLS) and extreme learning machine (ELM) models were established, with classification accuracies of 92.25% and 93.80%, respectively. The two prediction models were then fused using evidence theory. For the fusion model to achieve higher classification performance, the uncertainty of each prediction model had to be quantified with a mass function. A triangular mass function generation method was proposed based on the distance between the predicted SSC value and the classification boundary. Misclassifications were attributed to apple instances whose predicted SSC fell near a boundary between apple classes. A novel strategy was therefore used to assign mass to a precise class and to a set that also contains the adjacent class. Once generated, the mass functions of the ELM and PLS models were combined using Dempster's rule of combination to obtain the fused prediction. The focal element with the maximum mass in the combined mass function was selected as the final decision on apple grade. The experimental results showed that the triangular mass function generation was more reasonable than generation by hard partition. The classification accuracy of the multi-model fusion reached 95.35%. In summary, the modified mass function generation allowed the evidence-theory fusion model to represent uncertain classification information more precisely, yielding a more accurate and intuitive classification.
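The fusion step described above can be illustrated with a minimal sketch. Only Dempster's rule and the maximum-mass decision follow the abstract directly. The class boundaries, the triangle half-width, the function names, and the exact shape of the triangular mass function are hypothetical, since the abstract does not specify them. Here, a prediction far from every boundary puts all mass on one class, and a prediction near a boundary shifts mass linearly onto the set made of that class and its neighbour.

```python
from itertools import product

# Hypothetical SSC class boundaries and triangle half-width;
# the abstract does not give these values.
BOUNDARIES = (11.0, 13.0)  # class 1 < 11.0 <= class 2 < 13.0 <= class 3
HALF_WIDTH = 0.5


def triangular_mass(ssc_pred, boundaries=BOUNDARIES, half_width=HALF_WIDTH):
    """Map a predicted SSC value to a mass function over sets of classes."""
    # Hard class label: number of boundaries at or below the prediction, plus 1.
    cls = 1 + sum(ssc_pred >= b for b in boundaries)
    nearest = min(boundaries, key=lambda b: abs(ssc_pred - b))
    dist = abs(ssc_pred - nearest)
    if dist >= half_width:
        return {frozenset({cls}): 1.0}
    # The neighbour is the class on the other side of the nearest boundary.
    lower_cls = 1 + boundaries.index(nearest)
    neighbour = lower_cls + 1 if cls == lower_cls else lower_cls
    certain = dist / half_width
    masses = {
        frozenset({cls}): certain,
        frozenset({cls, neighbour}): 1.0 - certain,
    }
    # Keep only focal elements (sets with non-zero mass).
    return {s: w for s, w in masses.items() if w > 0.0}


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    # Renormalise by the non-conflicting mass.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


def decide(mass):
    """Return the focal element with the largest combined mass."""
    return max(mass, key=mass.get)


# Illustrative predictions from two models for the same apple.
m_pls = triangular_mass(10.9)  # just below the first boundary
m_elm = triangular_mass(11.3)  # just above it
fused = dempster_combine(m_pls, m_elm)
print(sorted(decide(fused)))  # [2]
```

In this example the two models disagree under hard partition (class 1 versus class 2), but the prediction nearer the boundary carries less committed mass, so the fused decision is class 2.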
Moreover, the approach can also be applied to the fusion of other prediction models. Because it avoids hard partitioning, it also offers better applicability than a single prediction model.

       
