Dissolved oxygen prediction in aquaculture water based on K-means clustering and ELM neural network

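    • Abstract: To address the problems of traditional dissolved oxygen prediction methods for aquaculture water, such as the inclusion of unsuitable samples and low accuracy, this paper proposes a dissolved oxygen prediction model based on K-means clustering and an ELM (extreme learning machine) neural network, using water quality and meteorological data collected in 2014 and 2015 at an aquaculture base in Changzhou, Jiangsu. The Pearson correlation coefficient is used to determine the correlation between each environmental factor and dissolved oxygen, and a similarity statistic for similar days is defined. K-means clustering divides the historical daily samples into several classes, and classification then identifies the class of historical daily samples most similar to the forecast day. That sample set, together with the measured environmental factors of the forecast day, serves as the input for building the ELM neural network dissolved oxygen prediction model. Experimental results show that the model is both fast and accurate: under normal weather, the mean absolute percentage error (MAPE) and root mean square error (RMSE) reached 1.4% and 10.8%, respectively; under sudden weather changes, they reached 2.6% and 11.6%, respectively, which benefits the precise regulation of aquaculture water quality.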
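    The similar-day selection step described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the abstract does not give the formula of its similarity statistic, so a Pearson-weighted squared Euclidean distance is swapped in as a stand-in, and the three environmental factors, the synthetic data, and the helper names (`pearson_weights`, `kmeans`, `most_similar_cluster`) are all assumptions made for the example.

```python
import numpy as np

def pearson_weights(X, y):
    """Absolute Pearson correlation of each column of X with y, scaled to sum to 1."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc * yc[:, None]).sum(axis=0) / (
        np.sqrt((Xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum())
    )
    w = np.abs(r)
    return w / w.sum()

def assign(X, centers):
    """Index of the nearest center (squared Euclidean distance) for each row of X."""
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; an empty cluster keeps its previous center."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(X, centers)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(X, centers), centers

def most_similar_cluster(day, centers):
    """Cluster whose center is closest to the forecast day's feature vector."""
    return int(((centers - day) ** 2).sum(axis=1).argmin())

# Synthetic stand-in data: 60 historical days, 3 normalized factors (hypothetical).
rng = np.random.default_rng(42)
n_days = 60
factors = rng.random((n_days, 3))
do = 0.7 * factors[:, 0] - 0.2 * factors[:, 1] + 0.05 * rng.standard_normal(n_days)

w = pearson_weights(factors, do)
# Scaling by sqrt(w) makes squared Euclidean distance equal to sum_i w_i * (x_i - c_i)^2.
scaled = factors * np.sqrt(w)
labels, centers = kmeans(scaled, k=5)

forecast_day = rng.random(3) * np.sqrt(w)
cluster = most_similar_cluster(forecast_day, centers)
similar_days = np.flatnonzero(labels == cluster)  # indices of the similar historical days
```

    Weighting the distance by correlation lets the factors most related to dissolved oxygen dominate the grouping; the days in `similar_days` would then form the training set for the prediction model.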

       

      Abstract: Dissolved oxygen plays a vital role in water management because it is an important factor in determining the growth status of fish. Either an inadequate or an excessive level of dissolved oxygen harms the survival of fish in their habitats. Accurate analysis of the data collected from aquaculture ponds and prediction of the anticipated level of dissolved oxygen help both water quality management and aquaculture production. Current studies examine the complex features of the water quality process mainly from the perspective of mathematical statistics. However, they cannot analyze the effects of environmental changes on water quality, nor do they predict dissolved oxygen well under a changing environment. This paper proposes a new strategy for predicting dissolved oxygen based on K-means clustering and ELM (extreme learning machine) neural networks. Because the dissolved oxygen curves of similar days are highly correlated, the historical samples were divided into several classes to optimize the sample space and improve prediction accuracy. After data normalization, the weights of the environmental factors on dissolved oxygen were determined by the Pearson correlation coefficient. A similarity statistic for similar days was defined that overcomes the limitations of the Euclidean distance and cosine calculation methods. Based on this similarity statistic, K-means clustering was used to divide the historical daily samples into several clusters. Identifying the cluster most similar to the forecasting day reduces interference between samples and helps mine the inherent patterns in the dissolved oxygen data. Then, the ELM neural network for the identified cluster was constructed with the training samples and test data set, and the future dissolved oxygen level was predicted with the similar sample set and the real-time environmental factors of the forecasting day as input data. A total of 23 424 data records from aquaculture ponds in Wujin, Changzhou, China, were collected and used in the experiments. Taking 5 clusters as an example, the ELM neural network was compared with a traditional BP (back propagation) neural network and an SVM (support vector machine). Its prediction accuracy was acceptable, and its running time was only 0.1 s, while the BP neural network took 10.25 s and the SVM was slower. The ELM prediction network thus shows a clear advantage. Additionally, the calculation speed and prediction efficiency of the model are better than those of the other methods in terms of the root mean square error (RMSE) and the mean absolute percentage error (MAPE). Experimental results showed that the MAPE and RMSE of the proposed method reached 1.4% and 10.8%, respectively, under normal weather conditions. Under a sudden change of weather, the MAPE and RMSE were 2.6% and 11.6%, respectively. The method offers higher forecasting accuracy and faster computation, which benefits water quality control in aquaculture.
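      To make the prediction step concrete, the following is a minimal sketch of a basic single-hidden-layer ELM regressor together with the two error metrics named above. It assumes the textbook ELM formulation (random, fixed input weights and a least-squares output layer solved with the pseudoinverse) and the standard MAPE and RMSE definitions; the abstract does not state the network size, activation function, or exact metric formulas, so the class name `ELMRegressor`, the 20 sigmoid hidden units, and the synthetic data are assumptions for illustration only.

```python
import numpy as np

class ELMRegressor:
    """Basic ELM: random hidden layer, output weights fitted by least squares."""

    def __init__(self, n_hidden=20, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activation of the randomly projected inputs.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        # Input weights and biases are drawn once and never trained.
        self.W = self.rng.uniform(-1.0, 1.0, size=(X.shape[1], self.n_hidden))
        self.b = self.rng.uniform(-1.0, 1.0, size=self.n_hidden)
        H = self._hidden(X)
        # Output weights: minimum-norm least-squares solution via the pseudoinverse.
        self.beta = np.linalg.pinv(H) @ y
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

def rmse(y_true, y_pred):
    """Root mean square error, in the units of y."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# Synthetic stand-in for one cluster of similar days (not the paper's data).
rng = np.random.default_rng(1)
X = rng.random((200, 3))  # three normalized environmental factors (hypothetical)
y = 6.0 + 2.0 * X[:, 0] - X[:, 1] + 0.5 * X[:, 0] * X[:, 1]  # dissolved-oxygen-like target

X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

model = ELMRegressor(n_hidden=20, seed=0).fit(X_train, y_train)
pred = model.predict(X_test)
test_mape = mape(y_test, pred)
test_rmse = rmse(y_test, pred)
```

      Because only the output weights are fitted, and in a single linear solve, training involves no iterative gradient descent, which is consistent with the short running time reported for ELM relative to the BP network.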

       
