In a recent article published in Scientific Reports, researchers proposed an innovative technique known as the high-level deformation-perception network with multi-object search non-maximum suppression (HDMNet) for automating pear-picking tasks. They aimed to leverage deep learning algorithms to create a robust and precise visual system for robotic pear harvesting.
Background
The pear is one of the world's five major fruits, with high economic and nutritional value, yet picking pears remains labor-intensive and time-consuming. Integrating automation technology, specifically through the development of fruit-picking robots, is therefore crucial to improving agricultural production efficiency and competitiveness.
The vision system is the core component of a fruit-picking robot. It detects and locates the fruit, fruit stalk, and fruit calyx, then passes this position information to the central processing unit, which guides the robot through the picking operation.
However, these systems face significant difficulties, including complex growth environments, background noise, occlusions, and variations in pear size. These complexities pose considerable challenges to traditional object detection methods.
About the Research
In this paper, the authors introduced HDMNet, a high-precision object detection network for automated pear-picking tasks that uses deep learning to overcome the limitations of current object detection methods. The network builds on you only look once version 8 (YOLOv8), a state-of-the-art object detection model with a simple network structure and a good balance between speed and accuracy.
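The paper itself does not include code, but for readers unfamiliar with the YOLOv8 baseline that HDMNet builds on, a minimal inference sketch using the open-source ultralytics package is shown below. The weight file and image name are placeholders, not assets from the study.

```python
# Minimal sketch of running a stock YOLOv8 detector with the ultralytics package.
# "yolov8s.pt" and "orchard_pear.jpg" are placeholder names, not assets from the paper.
from ultralytics import YOLO

model = YOLO("yolov8s.pt")                       # load pretrained YOLOv8 weights
results = model("orchard_pear.jpg", conf=0.25)   # run inference on one image

for box in results[0].boxes:
    cls_id = int(box.cls)                        # predicted class index
    score = float(box.conf)                      # confidence score
    x1, y1, x2, y2 = box.xyxy[0].tolist()        # box corners in pixels
    print(f"class={cls_id} score={score:.2f} box=({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})")
```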
However, YOLOv8 struggles to suppress background information and to handle mutual occlusion among multiple pears. This can lower detection accuracy, leaving it unable to meet the demands of complex automated pear-picking detection tasks.
To address these problems, the researchers proposed three key enhancements to YOLOv8: (1) a high-level semantic focused attention mechanism module (HSA), which suppresses irrelevant background details and highlights the fruit itself; (2) a deformation-perception feature pyramid network (DP-FPN), which improves accuracy for distant and small-scale fruit by combining shallow visual features and deep semantic features across the channel and spatial dimensions; and (3) a multi-object search non-maximum suppression (MO-NMS) method, which handles the occlusion and overlap of multiple pears by filtering based on the relationships among all anchor boxes in the image.
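To appreciate where MO-NMS departs from convention, it helps to recall standard greedy NMS: candidate boxes are sorted by confidence, and any box that overlaps a higher-scoring box beyond an intersection-over-union (IoU) threshold is discarded, which is exactly how a partially occluded pear behind another can be erased. The sketch below implements only this conventional baseline, not the authors' MO-NMS.

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def greedy_nms(boxes, scores, iou_thr=0.5):
    """Conventional NMS: keep the highest-scoring box, drop overlapping rivals."""
    order = list(np.argsort(scores)[::-1])   # indices sorted by descending confidence
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thr]
    return keep
```

Where greedy NMS compares each candidate only against the single best box retained so far, MO-NMS, as described in the paper, reasons over the relationships among all anchor boxes in the image, which is what allows it to keep overlapping pears rather than suppress them.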
The study also collected and labeled a dataset of 8363 images of pear trees in orchards, covering the varied forms pears take in automated picking scenes. This dataset serves as the foundation for training, testing, and benchmarking the HDMNet model against other state-of-the-art object detection models.
Research Findings
The study demonstrated that the HDMNet model outperformed competing detectors in automated pear-picking detection tasks. With a low parameter count of 12.9 million and a computational load of 41.1 giga floating-point operations (GFLOPs), it achieved a mean average precision (mAP) of 75.7%, an mAP50 of 93.6%, and an mAP75 of 70.2%, while operating at 73.0 frames per second (FPS).
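For readers less familiar with these metrics, mAP50 and mAP75 differ only in how tightly a predicted box must overlap the ground truth to count as correct: an IoU of at least 0.5 versus at least 0.75. The hypothetical box pair below, chosen purely for illustration, passes the first threshold but fails the stricter one.

```python
# Hypothetical ground-truth and predicted boxes (x1, y1, x2, y2), for illustration only.
gt   = (100, 100, 200, 200)
pred = (110, 110, 205, 205)

ix1, iy1 = max(gt[0], pred[0]), max(gt[1], pred[1])
ix2, iy2 = min(gt[2], pred[2]), min(gt[3], pred[3])
inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
union = (gt[2] - gt[0]) * (gt[3] - gt[1]) + (pred[2] - pred[0]) * (pred[3] - pred[1]) - inter
iou = inter / union            # roughly 0.74 for these boxes

print(f"IoU = {iou:.2f}")
print("counted as correct at mAP50:", iou >= 0.50)   # True
print("counted as correct at mAP75:", iou >= 0.75)   # False
```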
These outcomes highlighted HDMNet's strengths in real-time detection, parameter and computational efficiency, high precision, and accurate localization compared with leading object detection models such as MobileNet SSD v2, EfficientDet, and ViT-based models.
The authors conducted ablation experiments to validate the contribution of each component of HDMNet to overall detection performance. They also integrated HDMNet into a custom-built Internet of Things (IoT) system, allowing it to operate seamlessly in real-world pear harvesting tasks and improve system functionality.
This integration underscored the study's primary goal of applying research findings to practical, real-world scenarios. By embedding HDMNet into the IoT system, the study provided crucial technical support for advancing automation and intelligence in pear harvesting, offering farmers an efficient and reliable solution that can significantly improve the efficiency and quality of the harvest.
Applications
The proposed model has potential applications in various fields that require high-precision, real-time object detection, particularly in agriculture. As a robust vision system, it can be integrated into automated picking and positioning systems, such as robots and automatic picking devices.
Beyond pears, HDMNet can be adapted to detect other fruits such as apples, oranges, or grapes by retraining it on new data and fine-tuning its parameters. With additional features and modules, it could also be used for crop disease detection, weed identification, and animal tracking. This adaptability highlights HDMNet's versatility and its potential to enhance efficiency and accuracy across diverse agricultural tasks.
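The paper does not release training code, but in a typical YOLOv8-style workflow such a transfer amounts to retraining on a newly labeled dataset. The sketch below uses the ultralytics API with a hypothetical apples.yaml dataset configuration; it illustrates the general fine-tuning procedure, not HDMNet's own training pipeline.

```python
# Hypothetical fine-tuning sketch: adapting a YOLOv8-style detector to a new fruit.
# "apples.yaml" is a placeholder dataset config (image paths and class names), not from the paper.
from ultralytics import YOLO

model = YOLO("yolov8s.pt")     # start from pretrained weights rather than random initialization

model.train(
    data="apples.yaml",        # new dataset describing apple images and labels
    epochs=100,
    imgsz=640,
    batch=16,
)

metrics = model.val()          # evaluate on the new validation split
print(metrics.box.map50)       # mAP at IoU 0.5, comparable to the mAP50 figure reported above
```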
Conclusion
In summary, the HDMNet technique proved efficient, effective, and robust for automated pear-picking tasks. It showcased the potential to revolutionize agricultural practices globally by providing an advanced automation solution.
Moving forward, the researchers recognized the importance of further model compression and optimization to support large-scale deployment in real-world agricultural settings. They proposed focusing on achieving a balance between model simplicity and detection accuracy to develop a cost-effective and highly efficient solution for automated pear-picking.
Journal Reference
Zhao, P., Zhou, W. & Na, L. High-precision object detection network for automate pear picking. Sci Rep 14, 14965 (2024). DOI: 10.1038/s41598-024-65750-6, https://link.springer.com/article/10.1038/s41598-024-65750-6