Abstract
To address the storage, real-time processing and writing of large volumes of structured and unstructured farmland data, such as images, video and sensor readings, we constructed a cloud storage system for farmland Internet of Things (IOT) data that offers mass storage, high performance and easy expansion. The system is built on the Hadoop platform and combines distributed storage with NoSQL (Not Only SQL) technology, taking the characteristics of farmland IOT data into account. The design addresses security, reliability, efficient reading and writing, data conversion, transaction processing, small-file processing, cache strategy and load balancing. HDFS stores unstructured data such as pictures and videos, HBase stores structured data such as meteorological and moisture records, and Redis runs on the cache servers. A three-layer cloud storage architecture for IOT data was designed. The system classifies and processes video, image, text and structured data: large videos are stored in blocks, small image files are packaged and merged before storage, and text is classified and converted. Unstructured data are written to HDFS, structured data are written to HBase, and Redis serves as the system cache for IOT read and write operations. In a distributed cluster environment, the reliability of cross-row and long transactions is limited: such transactions are difficult to process accurately and in order, and data consistency is difficult to guarantee in complex services such as massive data analysis. This paper therefore designs a distributed transaction mechanism based on optimistic locking, in which the transaction processing module works with the HLock (optimistic lock) structure to control transaction state.
An NTP server guarantees the uniqueness of transaction timestamps. With this mechanism, the ACID properties of transactions, including data reads and writes, were ensured. In tests, HBase with this transactional support improved query efficiency by 35.75% over a traditional MySQL cluster at a data scale of 5 million. The NoSQL-based structured data storage scheme is therefore feasible for high-concurrency, massive-data scenarios. To handle the large number of small pictures and small files in the farmland IOT, the sampled pictures were packaged and measured. "SequenceFile" technology was used to merge multiple pictures into one "Block", implementing a merge-and-store strategy for small files. With fast index reading, the read and write efficiency of image file storage improved by more than 30% compared with storing the files directly in HDFS. The combination of "SequenceFile"-based file merging, image file name design and index optimization is therefore suitable for large-scale image storage in the farmland IOT. The system has been applied to the farmland IOT monitoring system in Henan Province, China. It is deployed at more than 60 monitoring stations in the counties and cities of Changge, Huaxian, Luohe and Fangcheng, where it provides real-time data storage, management and visualization; allowing for the incorporation of more sensors and monitoring stations, the system is in good working order. In summary, based on the Hadoop platform and NoSQL technology, we designed a storage model for massive farmland IOT data, designed and implemented key technologies including data reading and writing, transactions, picture packaging, indexing and a load balancing module, and developed a storage and management system for massive farmland IOT data.
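The optimistic-lock idea behind the transaction mechanism can be illustrated with a minimal in-memory sketch. It is not the HLock structure itself, whose layout is not given here: the names `VersionedStore` and `Transaction` are hypothetical, a plain dictionary stands in for HBase rows, and a local counter stands in for the unique timestamps that an NTP-synchronized clock would supply. A transaction records the version of every row it touches and commits only if none of those versions has changed in the meantime.

```python
import itertools
import threading


class VersionedStore:
    """In-memory stand-in for a row store; each row holds (value, version)."""

    def __init__(self):
        self._rows = {}                        # row key -> (value, version)
        self._clock = itertools.count(1)       # stand-in for unique timestamps
        self._commit_lock = threading.Lock()   # serializes validate + write

    def read(self, key):
        # Rows that were never written report version 0.
        return self._rows.get(key, (None, 0))

    def begin(self):
        return Transaction(self)

    def _commit(self, seen_versions, writes):
        with self._commit_lock:
            # Validation phase: abort if any touched row changed since it was seen.
            for key, seen in seen_versions.items():
                if self.read(key)[1] != seen:
                    return False
            # Write phase: all buffered writes share one new commit timestamp.
            ts = next(self._clock)
            for key, value in writes.items():
                self._rows[key] = (value, ts)
            return True


class Transaction:
    """Buffers writes locally and validates row versions at commit time."""

    def __init__(self, store):
        self._store = store
        self._seen_versions = {}   # row key -> version first observed
        self._writes = {}          # row key -> pending value

    def get(self, key):
        if key in self._writes:
            return self._writes[key]
        value, version = self._store.read(key)
        self._seen_versions.setdefault(key, version)
        return value

    def put(self, key, value):
        # Record the version being overwritten so blind writes are validated too.
        if key not in self._seen_versions:
            self._seen_versions[key] = self._store.read(key)[1]
        self._writes[key] = value

    def commit(self):
        return self._store._commit(self._seen_versions, self._writes)
```

When two transactions read the same row and both try to update it, the first commit succeeds and the second returns `False`; the caller is then expected to retry with fresh reads. No lock is held while a transaction is open, which is the property that makes the optimistic approach attractive for long transactions.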
The NoSQL-based storage scheme for massive farmland IOT data meets the storage and management needs of massive, real-time IOT data, addresses issues such as transaction consistency and small-file processing in farmland IOT storage, and offers a solution for massive agricultural IOT data storage. It can be combined with distributed computing and machine learning technology to compute IOT data in real time and to provide real-time operation and decision-making services for agricultural production.
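The small-file merging strategy mentioned above can be sketched in a few lines. This is an illustration of the general pack-and-index idea only, not the Hadoop "SequenceFile" format or the system's actual file name design: the functions `pack_files` and `read_file` and the example file names are hypothetical, and an in-memory byte string stands in for a merged block stored in HDFS.

```python
import io


def pack_files(files):
    """Concatenate small files into one blob and build a name -> (offset, length) index."""
    blob = io.BytesIO()
    index = {}
    for name, data in files.items():
        index[name] = (blob.tell(), len(data))  # where this file starts, and its size
        blob.write(data)
    return blob.getvalue(), index


def read_file(blob, index, name):
    """Fetch one packed file by index lookup, without scanning the whole blob."""
    offset, length = index[name]
    return blob[offset:offset + length]
```

Packing many images into one large object reduces the number of entries the storage layer has to track, and the index keeps single-image reads to one lookup plus one ranged read.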