Abstract:
Soybean is one of the most important crops for food security and agricultural economics. 3D reconstruction of soybean plants can contribute greatly to agricultural research and breeding, because plant structure and morphology can be used to select superior varieties. However, traditional 3D reconstruction of soybean plants in field environments cannot meet the demands of large-scale production, owing to the high cost of data acquisition equipment and the long reconstruction time. In this study, a novel 3D reconstruction method for field soybean plants (termed SfM-INGP) was proposed as a low-cost, high-efficiency, and high-quality solution suitable for agricultural settings. The method integrates three algorithms: motion-detection-based adaptive frame extraction, Structure from Motion (SfM), and Instant Neural Graphics Primitives (Instant-NGP). Firstly, panoramic videos of soybean plants were captured using a consumer-grade smartphone, which is accessible and affordable for most users. Adaptive frame extraction based on motion detection was then used to obtain multi-view image sequences from the videos. Redundant frames were discarded so that only relevant frames were processed, which improved computational efficiency and streamlined the reconstruction. Secondly, the SfM algorithm was employed to recover the camera poses from the multi-view images and to generate a sparse point cloud, both of which served as critical inputs for the subsequent reconstruction. Finally, Instant-NGP took the images with pose information as input, and applied multi-resolution hash encoding together with a small multilayer perceptron for efficient 3D reconstruction. Experimental results indicate that multi-angle video data of soybean plants were successfully captured using only a handheld smartphone.
The adaptive frame extraction effectively compensated for the inconsistent movement speed of handheld devices in field conditions. During SfM sparse reconstruction, the average reprojection error was only 1.199 53 pixels, and the average 3D point track length was 5.761 77. An average of 4 539.7 feature points was extracted per image, and all 100 camera poses were recovered, a success rate of 100%. The 3D scenes reconstructed by Instant-NGP closely resembled the actual soybean plants, with complete and accurate morphology. Key structural features (such as branching) were well-defined without distortion or blurring, while the veins and textures of the leaves were rendered with realistic colors that accurately represented their intricate details. In terms of reconstruction efficiency, SfM-INGP required an average reconstruction time of just 2.82 min, which was 90.7% and 99.4% shorter than that of traditional Multi-View Stereo (MVS) and Neural Radiance Fields (NeRF), respectively. In terms of reconstruction quality, SfM-INGP achieved an average peak signal-to-noise ratio (PSNR) of 24.47 dB, which was 15.4% and 9.3% higher than that of MVS and NeRF, respectively. Its mean squared error of 0.15 was well below the 0.46 of MVS and the 0.37 of NeRF. Additionally, the average memory consumption of SfM-INGP was 6.57 GB, which was slightly higher than the 5.73 GB of MVS but considerably lower than the 14.81 GB of NeRF. These results verify the effectiveness of SfM-INGP in terms of reconstruction efficiency, quality, accuracy, and resource consumption. Overall, the findings can provide essential technical support and a robust data foundation for information-based soybean breeding platforms that rely on low-cost data acquisition devices in actual agricultural environments.
Furthermore, the proposed 3D reconstruction method offers a viable solution for large-scale informatization initiatives in modern agriculture.
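
As an illustration of the adaptive frame extraction step, the following minimal Python sketch keeps a frame only after enough inter-frame motion has accumulated, so that slow camera movement yields fewer frames and fast movement yields more. The abstract does not specify the motion measure used in the study; the mean absolute gray-level difference between consecutive frames, the function name `select_keyframes`, and the parameter `motion_threshold` are all assumptions made here for illustration and should not be read as the authors' implementation.

```python
import numpy as np


def select_keyframes(frames, motion_threshold=3.0):
    """Return the indices of frames to keep for reconstruction.

    frames: iterable of equally shaped 2D grayscale arrays.
    motion_threshold: hypothetical parameter; the accumulated mean absolute
        gray-level difference that must be reached before a frame is kept.
    """
    kept = []
    previous = None
    accumulated = 0.0
    for index, frame in enumerate(frames):
        current = np.asarray(frame, dtype=np.float32)
        if previous is None:
            # Always keep the first frame as the starting view.
            kept.append(index)
        else:
            # Assumed motion measure: mean absolute difference to the previous frame.
            accumulated += float(np.mean(np.abs(current - previous)))
            if accumulated >= motion_threshold:
                kept.append(index)
                accumulated = 0.0  # restart accumulation from this keyframe
        previous = current
    return kept


# Synthetic example: uniform frames whose brightness changes slowly, then quickly.
demo = [np.full((4, 4), v) for v in [0, 1, 2, 3, 10, 10, 10, 20]]
print(select_keyframes(demo))  # [0, 3, 4, 7]
```

In practice the frames would be decoded from the smartphone video and converted to grayscale before being passed to the function, and the threshold would need tuning so that the retained views overlap sufficiently for SfM feature matching.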