Three-dimensional reconstruction of soybean plants in the field based on SfM and Instant-NGP

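    • Abstract: To address the high cost of data acquisition equipment and the long reconstruction time of earlier 3D reconstruction of soybean plants in field environments, this study proposes SfM-INGP, a 3D reconstruction method for field soybean plants based on motion-detection adaptive frame extraction, structure from motion (SfM), and instant neural graphics primitives (Instant-NGP). The method aims to provide a low-cost, high-efficiency, and high-quality reconstruction solution. First, a consumer-grade smartphone is used to capture a panoramic video around the soybean plant in the field, and a multi-view image sequence is obtained by adaptive-rate frame extraction based on motion detection, which reduces redundant data and improves computational efficiency. Second, the SfM algorithm recovers the camera poses from the multi-view images and generates a sparse point cloud, providing pose information for the subsequent reconstruction. Finally, Instant-NGP applies multi-resolution hash encoding to the soybean plant images with pose information and feeds the result into a small multilayer perceptron for training, completing efficient 3D reconstruction. The experimental results show that, in terms of reconstruction efficiency, the average reconstruction time of SfM-INGP was 2.82 min, which is 90.7% and 99.4% shorter than that of multi-view stereo (MVS) and neural radiance fields (NeRF), respectively. In terms of reconstruction quality, the average peak signal-to-noise ratio (PSNR) of SfM-INGP was 24.47 dB, which is 15.4% and 9.3% higher than that of MVS and NeRF, respectively. In terms of reconstruction accuracy, the mean squared error of SfM-INGP was 0.15, considerably lower than the 0.46 of MVS and the 0.37 of NeRF. In terms of computational resources, the average GPU memory consumption of SfM-INGP was 6.57 GB, slightly higher than the 5.73 GB of MVS but far lower than the 14.81 GB of NeRF. SfM-INGP thus strikes a good balance among reconstruction efficiency, quality, accuracy, and resource consumption. Using low-cost data acquisition equipment in a real agricultural field environment, the proposed method achieves efficient, high-quality 3D reconstruction of field soybean plants. It provides technical support and a data foundation for building information-based soybean breeding platforms, has broad potential for 3D reconstruction of other field crops, and offers a feasible approach for future large-scale agricultural informatization.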
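The abstract does not spell out the frame-extraction rule, so the following is only a minimal sketch of one way motion-based adaptive sampling could work. It assumes frames are already decoded into grayscale pixel grids, uses the mean absolute difference between consecutive frames as the motion measure, and keeps a frame whenever the motion accumulated since the last kept frame reaches a budget. The names motion_score, select_keyframes, and motion_budget are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

# A grayscale frame as rows of pixel intensities (assumed representation).
Frame = Sequence[Sequence[float]]


def motion_score(prev: Frame, curr: Frame) -> float:
    """Mean absolute per-pixel difference between two equally sized frames."""
    total = 0.0
    count = 0
    for row_prev, row_curr in zip(prev, curr):
        for a, b in zip(row_prev, row_curr):
            total += abs(a - b)
            count += 1
    return total / count if count else 0.0


def select_keyframes(frames: Sequence[Frame], motion_budget: float) -> List[int]:
    """Return the indices of the frames to keep.

    The first frame is always kept. After that, a frame is kept once the
    motion accumulated since the last kept frame reaches `motion_budget`,
    so fast camera movement is sampled densely in time and slow movement
    is sampled sparsely.
    """
    if not frames:
        return []
    kept = [0]
    accumulated = 0.0
    for i in range(1, len(frames)):
        accumulated += motion_score(frames[i - 1], frames[i])
        if accumulated >= motion_budget:
            kept.append(i)
            accumulated = 0.0  # start a new budget from this kept frame
    return kept


def flat_frame(value: float) -> Frame:
    """Hypothetical helper: a 2x2 frame filled with one intensity."""
    return [[value, value], [value, value]]


# Slow motion for three steps, fast motion for two, then slow again.
demo = [flat_frame(v) for v in (0, 1, 2, 3, 13, 23, 24)]
print(select_keyframes(demo, motion_budget=3.0))
```

With a fixed sampling rate, a slow sweep of the phone would yield many near-identical views and a fast sweep too few; accumulating motion instead ties the sampling to how much the view has actually changed. The budget value is a tuning parameter that the abstract does not report.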

       

      Abstract: Soybean is one of the most important crops for food security and the agricultural economy. 3D reconstruction of soybean plants supports agricultural research and breeding, because plant structure and morphology can guide the selection of superior varieties. However, existing 3D reconstruction of soybean plants in field environments cannot meet the needs of large-scale application, owing to the high cost of data acquisition equipment and the long reconstruction time. In this study, a 3D reconstruction method for field soybean plants (termed SfM-INGP) was proposed as a low-cost, high-efficiency, and high-quality solution suited to agricultural settings. It integrates three components: motion-detection adaptive frame extraction, structure from motion (SfM), and instant neural graphics primitives (Instant-NGP). Firstly, panoramic videos of soybean plants were captured with a consumer-grade smartphone, which is accessible and affordable for most users. Adaptive frame extraction based on motion detection was then used to obtain multi-view image sequences, reducing redundant data so that only relevant frames were processed. Secondly, the SfM algorithm was employed to recover the camera poses from the multi-view images and to generate a sparse point cloud, providing the pose information needed for subsequent reconstruction. Finally, Instant-NGP performed multi-resolution hash encoding on the images with pose information, and the result was passed to a small multilayer perceptron for training to complete efficient 3D reconstruction.
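To make the multi-resolution hash encoding step more concrete, the sketch below follows the formulation in the original Instant-NGP paper: grid resolutions grow geometrically across levels, and each integer grid corner is mapped to a table slot by XOR-ing its coordinates multiplied by fixed constants. It is an illustrative sketch only, not the configuration used in this study. The function names and the example settings (16 levels, resolutions 16 to 2048, table size 2^19) are assumptions, and the sketch omits the learned feature vectors, their trilinear interpolation, and the direct indexing that Instant-NGP uses at coarse levels.

```python
import math
from typing import List, Sequence, Tuple

# Per-dimension multipliers of the spatial hash in the Instant-NGP paper.
PRIMES = (1, 2654435761, 805459861)


def level_resolutions(n_min: int, n_max: int, n_levels: int) -> List[int]:
    """Grid resolutions growing geometrically from n_min towards n_max."""
    if n_levels == 1:
        return [n_min]
    growth = math.exp((math.log(n_max) - math.log(n_min)) / (n_levels - 1))
    # Floating-point rounding can leave the finest level one below n_max.
    return [math.floor(n_min * growth ** level) for level in range(n_levels)]


def hash_corner(corner: Tuple[int, int, int], table_size: int) -> int:
    """Map a non-negative integer grid corner to a slot in the hash table.

    Python integers are unbounded, so this matches 32-bit unsigned
    arithmetic only when table_size is a power of two no larger than 2**32.
    """
    h = 0
    for coord, prime in zip(corner, PRIMES):
        h ^= coord * prime
    return h % table_size


def corner_slots(point: Sequence[float], resolution: int, table_size: int) -> List[int]:
    """Table slots of the 8 voxel corners around a point in [0, 1]^3."""
    base = [min(math.floor(c * resolution), resolution - 1) for c in point]
    slots = []
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                corner = (base[0] + dx, base[1] + dy, base[2] + dz)
                slots.append(hash_corner(corner, table_size))
    return slots


# Assumed example settings, not values reported in the abstract.
resolutions = level_resolutions(n_min=16, n_max=2048, n_levels=16)
slots = corner_slots((0.25, 0.5, 0.75), resolutions[0], table_size=2 ** 19)
print(resolutions[0], len(slots))
```

In the full method, each slot holds a small trainable feature vector; the features of the eight corners are interpolated at every level and concatenated before being passed to the multilayer perceptron. Because most of the representational capacity sits in these tables, the network itself can stay small, which is the source of the fast training that the abstract reports.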
Experimental results indicate that multi-angle video data of soybean plants were successfully captured using only a handheld smartphone, and that adaptive frame extraction compensated for the inconsistent movement speed of handheld capture in field conditions. During SfM sparse reconstruction, the average reprojection error was only 1.199 53 pixels and the average 3D point track length was 5.761 77. Approximately 4 539.7 feature points were extracted per image on average, and all 100 camera poses were recovered (a 100% success rate). The 3D scenes reconstructed by Instant-NGP closely resembled the actual soybean plants, with complete and accurate morphology. Key structural features (such as branching) were well defined without distortion or blurring, and the veins and textures of the leaves were rendered with realistic colors. In terms of reconstruction efficiency, SfM-INGP had an average reconstruction time of 2.82 min, which is 90.7% and 99.4% shorter than that of multi-view stereo (MVS) and neural radiance fields (NeRF), respectively. In terms of reconstruction quality, SfM-INGP achieved an average peak signal-to-noise ratio (PSNR) of 24.47 dB, which is 15.4% and 9.3% higher than that of MVS and NeRF, respectively. In terms of reconstruction accuracy, its mean squared error of 0.15 was considerably lower than the 0.46 of MVS and the 0.37 of NeRF. The average GPU memory consumption of SfM-INGP was 6.57 GB, slightly higher than the 5.73 GB of MVS but far lower than the 14.81 GB of NeRF. SfM-INGP therefore offers a good balance among reconstruction efficiency, quality, accuracy, and resource consumption.
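For readers unfamiliar with the quality metrics, the sketch below shows the standard image-space definitions of mean squared error and PSNR between a reference image and a rendered view, with pixel values assumed to lie in [0, 1]. The abstract does not state which quantities its reported MSE of 0.15 is computed over, so this illustrates the general definitions only and does not reproduce the study's evaluation protocol.

```python
import math
from typing import Sequence


def mse(reference: Sequence[float], rendered: Sequence[float]) -> float:
    """Mean squared error between two equally long flattened images."""
    if len(reference) != len(rendered) or not reference:
        raise ValueError("images must be non-empty and of equal size")
    squared = sum((a - b) ** 2 for a, b in zip(reference, rendered))
    return squared / len(reference)


def psnr(reference: Sequence[float], rendered: Sequence[float],
         max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    err = mse(reference, rendered)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)


# A uniform error of 0.1 on a [0, 1] scale gives an MSE of about 0.01,
# which corresponds to a PSNR of about 20 dB.
reference = [0.0, 0.0, 0.0, 0.0]
rendered = [0.1, 0.1, 0.1, 0.1]
print(round(psnr(reference, rendered), 2))
```

PSNR is logarithmic, so a gain of a few dB corresponds to a substantial drop in pixel error; an increase of 3 dB roughly halves the MSE.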
Overall, the proposed method achieves efficient, high-quality 3D reconstruction of field soybean plants with low-cost data acquisition devices in real agricultural environments, providing technical support and a data foundation for information-based soybean breeding platforms. It also has potential for 3D reconstruction of other field crops and offers a feasible approach to large-scale agricultural informatization.

       
