[InsectMamba] Classification Of Pests Using State Space Models To Support Smart Agriculture
3 main points
✔️ Classification of insect pests is an important issue for agriculture, but the high degree of mimicry and species diversity of insects make it a difficult problem to effectively capture their visual features
✔️ The study proposes InsectMamba, which integrates state space models, CNNs, multi-head self-attention, and MLPs in a Mix-SSM block and adds a selective module, so that both local and global features are captured
✔️ It is expected to make a significant contribution to the realization of smart agriculture and the construction of pest management systems
InsectMamba: Insect Pest Classification with State Space Model
written by Qianning Wang, Chenglin Wang, Zhixin Lai, Yucheng Zhou
(Submitted on 4 Apr 2024)
Comments: 13 pages, 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
The images used in this article are from the paper, the introductory slides, or were created based on them.
Introduction
Classification of insect pests is an important issue in agriculture. Accurate identification of harmful pests reduces crop damage and ensures food security and environmental sustainability.
However, pests often blend closely into their natural surroundings and span a large number of species, which makes it very difficult to extract discriminative visual features. Existing methods struggle to extract the detailed features needed to distinguish closely related pest species.
Even if state-of-the-art deep learning approaches are utilized, challenges still remain due to the large similarity between pests and backgrounds. Against this backdrop, there is a strong need to develop more effective pest classification models.
Proposed Method (InsectMamba)
The core of InsectMamba, the "Mix-SSM block," is a structure that combines four visual encoding methods:
1. State space model (SSM): models visual features as a sequence and is good at capturing long-range dependencies.
2. Convolutional neural network (CNN): excels at extracting local visual features.
3. Multi-head self-attention (MSA): captures global contextual information and complements the weaknesses of CNNs.
4. Multi-layer perceptron (MLP): extracts features along the channel dimension.
By combining the features of these four methods, InsectMamba is able to capture the visual characteristics of pests from multiple angles.
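To make the "long-range dependencies" point about the SSM branch concrete, here is a minimal sketch of a scalar linear state space recurrence over a sequence of patch values. It is only an illustration: the function name ssm_scan and the parameters a, b, and c are assumptions made for this example, and the recurrence is a plain time-invariant one, not the selective scan used in Mamba-style models or the exact formulation in the paper.

```python
def ssm_scan(inputs, a=0.9, b=1.0, c=1.0):
    """Run a scalar linear state space recurrence over a sequence.

    h_t = a * h_{t-1} + b * x_t   (state update)
    y_t = c * h_t                 (readout)

    The parameters a, b, c are illustrative constants, not learned values.
    """
    h = 0.0
    outputs = []
    for x in inputs:
        h = a * h + b * x
        outputs.append(c * h)
    return outputs


# A single impulse at the first position keeps influencing later outputs,
# shrinking by a factor of `a` at every step.
patch_sequence = [1.0, 0.0, 0.0, 0.0]
outputs = ssm_scan(patch_sequence, a=0.5)
```

Because the hidden state carries information forward, an early input still affects outputs many steps later, which is the property the SSM branch contributes alongside the local (CNN) and global (MSA) branches.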
In addition, the proposed "selective module" adaptively integrates the feature representations obtained by these encoding methods. By dynamically assigning importance to each channel, pest features can be effectively modeled.
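The idea of assigning importance per channel can be sketched as a softmax over branches, computed separately for each channel. This is a simplified illustration under stated assumptions: the names selective_fuse and branch_scores are hypothetical, the scores are passed in by hand (in a trained model they would come from learned layers), and the paper's actual module may compute its weights differently.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def selective_fuse(branch_features, branch_scores):
    """Fuse per-channel features from several branches.

    branch_features[b][c]: feature of branch b in channel c.
    branch_scores[b][c]:   hypothetical importance logit for that entry.
    For each channel, the logits are normalized across branches and used
    as weights in a weighted sum of the branch features.
    """
    num_channels = len(branch_features[0])
    fused = []
    for c in range(num_channels):
        weights = softmax([scores[c] for scores in branch_scores])
        fused.append(
            sum(w * feats[c] for w, feats in zip(weights, branch_features))
        )
    return fused


# Four branches (SSM, CNN, MSA, MLP) with two channels each; values are made up.
features = [[1.0, 4.0], [3.0, 0.0], [2.0, 2.0], [2.0, 2.0]]

# Equal scores reduce the fusion to a plain average of the branches.
uniform_scores = [[0.0, 0.0] for _ in features]
fused_uniform = selective_fuse(features, uniform_scores)

# A large score for the second branch in channel 0 makes it dominate there.
peaked_scores = [[0.0, 0.0], [50.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
fused_peaked = selective_fuse(features, peaked_scores)
```

With uniform scores the result is a simple mean, which corresponds to non-adaptive integration; letting the scores vary per channel is what makes the integration adaptive.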
Thus, through its innovative design, InsectMamba offers a comprehensive solution to the challenges of pest classification.
Experiment
The paper evaluates the performance of InsectMamba on five insect pest classification datasets. All of them were selected because the insects closely resemble their backgrounds and the species are highly diverse, which makes them challenging for pest classification. The datasets are:
- Farm Insects: 15 insect pest species, with 1,368 training samples and 160 test samples.
- Agricultural Pests: 12 agricultural pest species, with 240 training samples and 5,254 test samples.
- Insect Recognition: 24 insect species, with 768 training samples and 612 test samples.
- Forestry Pest Identification: 31 pest species, with 599 training samples and 6,564 test samples.
- IP102: 102 pest species, with 1,909 training samples and 65,805 test samples.
Using these challenging datasets, the paper compares InsectMamba against strong existing models (ResNet, DeiT, Swin Transformer, and VMamba). The results show that InsectMamba performs best on all metrics (Accuracy, Precision, Recall, and F1 Score).
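For readers unfamiliar with these four metrics, the following sketch computes them from true and predicted labels. It assumes macro averaging over classes (each class weighted equally); the summary does not state which averaging the paper uses, and the function name macro_metrics and the toy labels are made up for illustration.

```python
def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Undefined ratios (no predictions or no true samples) count as 0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy example with two classes; one "aphid" image is misclassified as "beetle".
y_true = ["aphid", "aphid", "beetle", "beetle"]
y_pred = ["aphid", "beetle", "beetle", "beetle"]
metrics = macro_metrics(y_true, y_pred)
```

Macro averaging is a common choice when class sizes are uneven, as they often are in pest datasets, because rare species count as much as common ones.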
Particularly noteworthy are the experiments comparing the feature integration methods shown in Figure 3. Here, the proposed "selective module" performs the best, confirming the importance of adaptive feature integration.
In addition, Figure 4 examines the impact of the convolution kernel size for the selective module, finding that 3x3 gives the best results for the Farm Insects dataset, while 1x1 is optimal for IP102. This suggests that it is important to choose the appropriate kernel size according to the characteristics of the dataset.
The ablation experiments also show that each of the SSM, CNN, MSA, and MLP components that make up the Mix-SSM block contributes to performance. Taken together, these results suggest that InsectMamba is an effective solution to the challenges of insect pest classification.
Conclusion
The study proposes a new model, InsectMamba, to address the challenge of insect pest classification. InsectMamba is an architecture that combines several visual encoding methods.
Experimental results show that InsectMamba performs well on five challenging insect pest classification datasets, significantly outperforming existing powerful models. In addition, through ablation experiments, it became clear that each element of the proposed method has its own unique contribution. The analysis also included a detailed study of key design aspects such as feature integration methods and convolutional kernel size optimization, and the results support the high versatility and practicality of InsectMamba.
Looking ahead, it is important to further validate the usefulness of InsectMamba through evaluations on even larger data sets and in real environments. In addition, efforts toward practical application, such as investigating efficient implementation methods under hardware limitations, are also expected. The results of this research are expected to make a significant contribution to the automation of insect control and the realization of smart agriculture.