PestVL-Net Enhances Pest Recognition for Smart Agriculture

May 27, 2026 | By ZHANG Jie; ZHAO Weiwei

A research team led by Prof. ZHANG Jie and Prof. XIE Chengjun at the Institute of Intelligent Machines, Hefei Institutes of Physical Science, Chinese Academy of Sciences, has proposed PestVL-Net, a multimodal framework for pest recognition.

The work has been accepted to the Findings of CVPR 2026.

Pest recognition is a key task in smart agriculture, but it remains challenging in real-world environments. Pest species are highly diverse and often visually similar, while collecting high-quality field data is still difficult. These factors limit the performance of traditional vision-based approaches in practical applications.

In this study, the team developed PestVL-Net, which combines visual and textual information to identify pests. On the visual side, the method is designed to focus on key regions of an image, which helps it capture subtle differences in shape and texture.

On the language side, the researchers introduced structured pest descriptions derived from agricultural expertise and multimodal language models. Combined with visual features, these descriptions helped the framework recognize pest species with subtle visual differences.

Experiments on multiple datasets, including newly constructed pest datasets and public benchmarks, showed that PestVL-Net consistently outperformed existing methods, with accuracy reaching around 88%–90%. Further analysis confirmed the contribution of each component in the framework.

"We are not only 'seeing' them, but also 'describing' them," said ZHANG Jie, a member of the team, adding that the framework provides a practical approach for smart farming, precision agriculture, and crop protection.

Feature Map Visualization of Different Modules (Image by ZHANG Jie)

