Abstract:
Insect electrical penetration graph (EPG) technology has been widely applied to study the feeding behavior of piercing-sucking insects, insect-plant relationships, insect transmission mechanisms and crop resistance mechanisms. However, EPG signals have so far been identified and analyzed manually, so an automatic EPG waveform identification system is urgently needed to improve efficiency. The EPG waveforms produced by piercing-sucking insects depend on both the insect and the plant species. They vary greatly among different types of piercing-sucking insects, and even waveforms of the same type differ in amplitude and frequency, which makes machine recognition of EPG waveforms difficult. An EPG waveform is a time series whose irregularity can be described by fractal theory: fractal theory can reveal the similarity between local parts and the whole of the waveform in certain respects, and the fractal dimension (FD) of the waveform reflects changes in its characteristics and the complexity of its geometric shape. As a bioelectrical signal, an EPG waveform is nonlinear and non-stationary in nature. The Hilbert-Huang transform (HHT) is a powerful tool for analyzing time-varying non-stationary signals. It decomposes a nonlinear signal into several single-mode signals and adaptively selects the transform basis according to the signal itself, so that the bioelectrical signal is decomposed according to its intrinsic characteristics. In this paper, aphid EPG signals were taken as the research object, and the feature extraction and classification of the np, pd, E1, E2, G, C and F waveforms were studied. An EPG waveform recognition method based on fractal dimension, HHT and a decision tree was proposed.
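To make the fractal dimension idea concrete, the following is a minimal sketch of a box-counting estimate for a sampled waveform. It is an illustration only, not the estimator used in this study: the function name `box_counting_dimension`, the grid sizes in `scales` and the normalization to the unit square are all assumptions.

```python
import numpy as np

def box_counting_dimension(signal, scales=(4, 8, 16, 32, 64)):
    """Estimate the box-counting dimension of a sampled curve.

    Hypothetical helper: the curve is rescaled to the unit square and
    covered with k x k grids; the dimension is the slope of
    log(box count) against log(k).
    """
    y = np.asarray(signal, dtype=float)
    n = y.size
    span = y.max() - y.min()
    y = (y - y.min()) / span if span > 0 else np.zeros(n)
    x = np.linspace(0.0, 1.0, n)

    counts = []
    for k in scales:
        # Column index of every sample on a grid with k columns.
        col = np.minimum((x * k).astype(int), k - 1)
        total = 0
        for c in range(k):
            seg = y[col == c]
            if seg.size == 0:
                continue
            # Rows touched in this column, from the lowest to the highest sample.
            lo = min(int(np.floor(seg.min() * k)), k - 1)
            hi = min(int(np.floor(seg.max() * k)), k - 1)
            total += hi - lo + 1
        counts.append(total)

    slope = np.polyfit(np.log(scales), np.log(counts), 1)[0]
    return float(slope)

# A smooth ramp is close to dimension 1; uniform noise is much rougher.
ramp = np.linspace(0.0, 1.0, 1024)
noise = np.random.default_rng(0).uniform(size=4096)
fd_ramp = box_counting_dimension(ramp)
fd_noise = box_counting_dimension(noise)
```

A smoother waveform yields an estimate near 1 and a more irregular one moves toward 2, which is the sense in which the fractal dimension reflects the geometric complexity of a waveform.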
First, the signals collected by the EPG instrument were denoised and preprocessed. Fractal dimension and HHT features were then extracted, and feature vectors of different dimensions were fed into the decision tree classifier for comparative experiments. The decision tree classifier was generated by the C4.5 algorithm. Constructing the decision tree involved 2 main steps: selecting attributes by information gain ratio, and completing the classification with post-pruning. For machine recognition of EPG waveforms, six-dimensional feature vectors were used as input, and 4 groups of samples were tested. The experimental results showed that the six-dimensional feature vector consisting of the fractal box dimension, the Hurst exponent, and the spectral centroid and weighted frequency of the first 2 layers achieved the highest recognition rate. After 10 pruning steps, the decision tree completed the classification, and the recognition rates of the 4 test groups were 92.14%, 89.29%, 95% and 89.29%, respectively. The confusion matrices of the 4 groups of test data showed that the np, E1 and G waveforms were identified accurately, whereas the E2 and C waveforms had low recognition rates and were prone to misjudgment, because their extracted feature values (such as the box dimension, the spectral centroid of the first 2 layers and the weighted frequency of the second layer) did not differ clearly. The C waveform was the most complex of all the waveforms; it usually contains A and B waveforms and some unrecognizable waveforms, and was easily confused with other waveforms. The same test samples used for machine recognition were also classified manually.
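The attribute selection criterion mentioned above, the information gain ratio, can be sketched as follows for one continuous feature split at a threshold. This is a generic illustration of the C4.5 criterion, not the code used in this study; the function names and the toy feature values are hypothetical.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of a label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def gain_ratio(values, labels, threshold):
    """Information gain ratio of the binary split `values <= threshold`."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    left = values <= threshold
    n = labels.size
    n_left = int(left.sum())
    n_right = n - n_left
    if n_left == 0 or n_right == 0:
        return 0.0  # the split separates nothing
    # Entropy remaining after the split, weighted by branch size.
    conditional = (n_left / n) * entropy(labels[left]) \
                + (n_right / n) * entropy(labels[~left])
    gain = entropy(labels) - conditional
    # Split information penalizes splits into many small branches.
    split_info = entropy(left)
    return gain / split_info

# Toy values for one feature (hypothetical numbers, not measured data).
labels = np.array(["E2", "E2", "C", "C"])
separating = gain_ratio([0.1, 0.2, 0.8, 0.9], labels, threshold=0.5)
overlapping = gain_ratio([0.1, 0.8, 0.2, 0.9], labels, threshold=0.5)
```

In the first toy case the feature separates the two classes completely and the ratio is 1; in the second the feature values of the two classes overlap and the ratio is 0. The second case is the situation described above for the E2 and C waveforms, where a feature contributes little to the split.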
The experimental results showed that the average recognition rate was 99.11% for manual recognition and 91.43% for machine recognition, which was 7.68 percentage points lower. The average time for machine recognition was 18.22 s, only about 1/46 of the 839.13 s required for manual recognition. The proposed feature extraction method based on fractal dimension and HHT and the constructed decision tree classifier were feasible, and provide a theoretical reference for the research and development of automatic identification and analysis systems for EPG signals. This research can shorten the analysis time of EPG signals, accelerate the progress of scientific research, and promote the efficient use and intelligent development of EPG.
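The summary figures can be reproduced from the reported values with a short check. The variable names are chosen here for illustration only.

```python
# Recognition rates (%) of the 4 machine-recognition test groups.
machine_rates = [92.14, 89.29, 95.00, 89.29]
machine_mean = sum(machine_rates) / len(machine_rates)

manual_mean = 99.11                # average manual recognition rate (%)
gap = manual_mean - machine_mean   # difference in percentage points

machine_time_s = 18.22             # average machine recognition time
manual_time_s = 839.13             # average manual recognition time
time_ratio = manual_time_s / machine_time_s

print(f"machine mean: {machine_mean:.2f}%")
print(f"gap: {gap:.2f} percentage points")
print(f"manual / machine time: {time_ratio:.1f}")
```

The mean of the four group rates is 91.43%, the gap to manual recognition is 7.68 percentage points, and the time ratio is about 46, consistent with the figures stated in the abstract.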