Prediction of Panax notoginseng disease incidence rate based on meteorological factors during the high-incidence period

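    • Abstract: Accurate forecasting of disease incidence is an important basis for responding to Panax notoginseng diseases in advance and for improving yield and quality. This study used field meteorological data and disease incidence records collected in 2018-2019 at a Panax notoginseng planting base in Honghe Prefecture, Yunnan, and applied Principal Component Analysis (PCA) to avoid multicollinearity. The May-September meteorological data sets from 2018 and 2019 served as the training and validation sets; a preliminary prediction model was built with the Random Forest (RF) algorithm as the base learner and was then optimized with the Gradient Descent (GD) algorithm. The results showed that soil temperature and humidity in the shed were both positively correlated with disease incidence, with Pearson correlation coefficients between 0.25 and 0.75, whereas soil heat flux in the shed and soil heat flux above the Panax notoginseng canopy were both negatively correlated with disease incidence, with Pearson correlation coefficients between -0.75 and -0.25. The model obtained with random forest had a root mean square error of 0.23. After gradient descent optimization, the cost function converged at a value of 241.003, and the weights of the individual meteorological factors on disease incidence in the high-incidence period were obtained: soil temperature showed the strongest positive association, with a weight of 21.686, and soil heat flux above the Panax notoginseng canopy showed the strongest negative association, with a weight of -13.834. The results indicate that field meteorological factors can reliably predict disease incidence in the high-incidence period of Panax notoginseng, and they provide a theoretical basis and technical support for facility environment control and intelligent management aimed at reducing Panax notoginseng disease.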
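The two summary statistics quoted in the abstract, the Pearson correlation coefficient and the root mean square error, can be illustrated with a minimal sketch. This is not the authors' code: the function names `pearson_r` and `rmse` and all sample values below are assumptions made up for illustration, and only the Python standard library is used.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def rmse(actual, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

# Hypothetical values, NOT data from the study
soil_temp = [18.0, 19.5, 21.0, 22.5, 24.0]   # soil temperature, deg C
incidence = [0.30, 0.42, 0.55, 0.61, 0.78]   # observed incidence rate
predicted = [0.35, 0.40, 0.50, 0.65, 0.75]   # model output for the same days

print("Pearson r:", round(pearson_r(soil_temp, incidence), 3))
print("RMSE:", round(rmse(incidence, predicted), 3))
```

A coefficient near +1 or -1 indicates a strong linear association; the abstract reports moderate magnitudes (0.25 to 0.75 in absolute value) for the factors it names.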

       

      Abstract: Accurate prediction of disease incidence is an important basis for responding to Panax notoginseng disease in advance and for improving yield and quality. This study used field meteorological data and disease incidence data collected from 2018 to 2019 at a Panax notoginseng planting base in Honghe Prefecture, Yunnan Province, and applied Principal Component Analysis (PCA) to avoid multicollinearity. The meteorological data from May to September of each year were used as the training and validation sets, the Random Forest (RF) algorithm was used as the base learner to construct a preliminary prediction model, and the Gradient Descent (GD) algorithm was then used for optimization. The results showed the following. 1) Disease incidence in the high-incidence period was mainly related to soil temperature, humidity in the shed, and soil heat flux in the shed and above the canopy. PCA avoided the multicollinearity problem, and Pearson correlation coefficients between the indicators were obtained: soil temperature and humidity in the shed were positively correlated with the incidence rate, with Pearson correlation coefficients both between 0.25 and 0.75, whereas soil heat flux in the shed and soil heat flux above the Panax notoginseng canopy were negatively correlated with the incidence rate, with coefficients both between -0.75 and -0.25. 2) In the random forest predictions, incidence rates of about 35% occurred relatively infrequently in the high-incidence period, whereas incidence rates between 60% and 80% occurred more often. This pattern was consistent with the disease infecting other plants at an exponential rate, and all predictions fell within the confidence interval. The root mean square error of the random forest model was 0.230, indicating that the predictions could be trusted. 3) After GD optimization, the cost function converged at a value of 241.03, the difference between the predicted and actual incidence rates was 1.5%, and the weight of each meteorological factor's effect on disease incidence in the high-incidence period was obtained. Soil temperature showed the strongest positive correlation, with a weight of 21.686, and soil heat flux above the Panax notoginseng canopy showed the strongest negative correlation, with a weight of -13.834. 4) Regarding the effects of the meteorological factors on disease incidence in the high-incidence period, the final prediction model was compared with the results of the PCA-based main effect analysis, and the two were consistent. The results show reliable predictive capability for disease incidence and can provide a theoretical basis and technical support for facility environment regulation and intelligent management aimed at reducing Panax notoginseng disease.
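How gradient descent can yield one signed weight per meteorological factor may be easier to see in a small sketch. The abstract does not state the form of the model that was optimized, so the linear model, the half mean squared error cost, the learning rate, and the stopping rule below are all assumptions; `gradient_descent`, `X_demo`, and `y_demo` are hypothetical names, and the data are synthetic rather than taken from the study.

```python
def predict(weights, bias, row):
    """Linear prediction: bias plus the weighted sum of the factor values."""
    return bias + sum(w * x for w, x in zip(weights, row))

def cost(weights, bias, X, y):
    """Half mean squared error over all samples (assumed cost function)."""
    m = len(X)
    return sum((predict(weights, bias, r) - t) ** 2 for r, t in zip(X, y)) / (2 * m)

def gradient_descent(X, y, lr=0.1, tol=1e-9, max_iter=10000):
    """Batch gradient descent; stops when the cost changes by less than tol."""
    m, n = len(X), len(X[0])
    weights, bias = [0.0] * n, 0.0
    prev = cur = cost(weights, bias, X, y)
    for _ in range(max_iter):
        errors = [predict(weights, bias, r) - t for r, t in zip(X, y)]
        # Partial derivatives of the cost with respect to each weight and the bias
        grad_w = [sum(e * r[j] for e, r in zip(errors, X)) / m for j in range(n)]
        grad_b = sum(errors) / m
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
        cur = cost(weights, bias, X, y)
        if abs(prev - cur) < tol:
            break
        prev = cur
    return weights, bias, cur

# Synthetic, pre-scaled data generated from y = 2*x1 - 1*x2 + 0.5
# (two made-up "factors"; NOT measurements from the study)
X_demo = [[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0], [0.0, 0.0]]
y_demo = [2 * a - 1 * b + 0.5 for a, b in X_demo]

weights, bias, final_cost = gradient_descent(X_demo, y_demo)
print("weights:", [round(w, 3) for w in weights], "bias:", round(bias, 3))
print("cost at convergence:", final_cost)
```

In this sketch a positive weight marks a factor whose increase raises the predicted value and a negative weight marks the opposite, which mirrors how the abstract reads its weights for soil temperature and canopy-level soil heat flux. Weights are only comparable across factors when the inputs share a common scale.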

       
