Kim, Yoon, and Ju: Diagnosis of Anthracnose of Chili Pepper Using Convolutional Neural Network-Based Deep Learning Models

Abstract

Chili pepper (Capsicum annuum L.), one of the most economically important vegetable crops globally, faces significant economic risk from anthracnose, which causes yield losses of 10% and reduces marketability. Early and accurate detection is essential for mitigating these effects. Recent advancements in deep learning, particularly in image recognition, offer promising solutions for plant disease detection. This study applies deep learning models (MobileNet, ResNet50v2, and Xception) using transfer learning to diagnose anthracnose in chili peppers. A key challenge is the need for large, labeled datasets, which are costly to obtain. The study aims to identify the minimum dataset size required for accurate and efficient disease diagnosis using limited data. Performance metrics, including precision, recall, F1-score, and accuracy, were evaluated across different dataset sizes (500, 1,000, 2,000, 3,000, and 4,000 samples). Results indicated that model performance improves with larger datasets, with ResNet50v2 and Xception requiring more data to achieve optimal accuracy, while MobileNet showed strong generalization even with smaller datasets. These findings underscore the effectiveness of transfer learning-based models in plant disease detection, offering practical guidelines for balancing data availability and model performance in agricultural applications. Source code is available at https://github.com/smart-able/Anthracnose.git.

Chili peppers (Capsicum annuum L.) have long been used in Korea both as a spice and for their numerous biological activities (Xiang et al., 2021). They are an important crop, accounting for 24% of the domestic seasoned vegetable market (Ministry of Agriculture, Food and Rural Affairs, 2022), and are economically significant worldwide (Food and Agriculture Organization of the United Nations, 2022). Forty-four different chili pepper diseases have been reported in Korea (Kim et al., 2023). Among these, anthracnose is a major fungal disease of chili pepper fruit, causing significant yield loss and reducing the marketability of the fruit (Kiran et al., 2020). Anthracnose primarily affects the fruit, causing small, dark green, water-soaked spots that gradually expand into sunken lesions (Ali et al., 2016). Its spores are dispersed by physical forces such as heavy rain and typhoons. Accurate detection of the pathogen helps in choosing the best management strategy for controlling this disease.
Traditionally, plants infected by anthracnose are identified through on-site visual inspection (Fox and Narra, 2006). While this approach may be effective for experienced farmers, it does not guarantee accuracy for inexperienced ones. Observing morphological characteristics and applying molecular methods are widely adopted approaches for accurately identifying pathogens (Freeman et al., 1998; Kiran et al., 2020; López et al., 2003). While these methods have demonstrated satisfactory sensitivity and specificity, they require sophisticated laboratory equipment, time-consuming sample processing, and expertise in high-end analysis (Aljawasim et al., 2023; Sankaran et al., 2010). Precise and rapid diagnosis of plant diseases can support early treatment and reduce significant economic losses.
Deep learning techniques, particularly convolutional neural networks (CNNs), have demonstrated significant potential in the identification and diagnosis of plant diseases from images (Kamilaris and Prenafeta-Boldú, 2018; McCann et al., 2017; Zeng et al., 2021). These techniques can effectively analyze visual data to detect and classify various plant pathologies. However, most current research relies on datasets collected under controlled laboratory conditions (Sanida et al., 2023). While such datasets yield high benchmark scores, performance in real-world scenarios often falls short due to interfering factors such as background noise, reflections, and varying lighting conditions (Ahmad et al., 2023).
One critical aspect that has not been addressed in the existing literature is the quantity of data required to train effective deep learning models. In the agricultural domain, researchers often take on the role of data collectors. When they collect data, however, it is difficult to ensure that the data is suitable for deep learning training, and computer scientists then spend considerable time preprocessing it to make it usable. Bridging this gap between data collection and data utilization is needed for a more efficient approach to research in this field. The primary objective of this study is to identify the minimum amount of data required for transfer learning of a pre-trained model in the context of plant disease identification. By determining this threshold, this research aims to provide practical guidelines for data collectors in agriculture, enabling them to gather the most relevant and efficient datasets.

Materials and Methods

Experimental setup. Training a deep neural network requires high-performance graphics processing units (GPUs). In this research, Google Colab and Kaggle Notebooks were used. The default GPU for Colab is an NVIDIA Tesla K80 with 12 GB of VRAM (video random-access memory), while Kaggle Notebooks provide free access to an NVIDIA Tesla P100 GPU.
This paper concentrates on identifying anthracnose on chili pepper fruit using CNN-based transfer learning models (Fig. 1). Images were acquired both from ‘AIHub’ and on-site, and were labeled by plant pathologists. Image resizing and data augmentation were then applied as preprocessing steps. After separating the training and test sets, the training images were used to train three pre-trained networks: MobileNet, ResNet50v2, and Xception. Each model was then tested on the test set, and performance metrics were compared. The metrics used in this research were accuracy, precision, recall, and F1-score.

Data description

The anthracnose dataset from ‘AIHub’, a Korean artificial intelligence infrastructure platform, was used in this research. Only images of chili pepper fruit were used, rather than stems or leaves, and unclear images were removed. In addition, images were acquired at the Jeonbuk National University farm, Jeonju, Korea, using a smartphone. These images were collected after the summer rainy season under real cultivation conditions in the field, covering symptoms from early to severe stages to reflect real-life scenarios. Data were sampled randomly from the total dataset, and the whole training process was replicated three times.
Five dataset sizes were selected to investigate the minimum amount of data required to train a pre-trained model (Table 1). Transfer learning is employed to address limitations in training data and time by reusing features extracted by pre-trained CNNs. When using transfer learning for image classification with CNNs, it is crucial to choose the most appropriate scenario based on the number of available training images. By examining these five dataset sizes, this research aims to identify the minimum dataset size required for effective training. Holdout validation was used to evaluate each model. This will help determine how transfer learning can be optimized to perform well even with limited training data.
To train and validate a model, the dataset must first be partitioned, which involves choosing what percentage of the data to use for the training, validation, and holdout test sets. Holdout validation is a simple way to split the dataset and evaluate models with limited computational power. This study’s chili pepper disease dataset consists of 4,455 images of healthy and anthracnose-infected fruit. The dataset was divided into training, validation, and test sets using the commonly used 80-10-10 ratio, following previous studies (Mohanty et al., 2016; Ramcharan et al., 2017). The training data (80%) is used to teach the model features and patterns. Mini-batching is a method of training a model by dividing a large dataset into smaller portions (Ioffe and Szegedy, 2015). Instead of using the entire dataset at once, it is split into several smaller batches that are sequentially fed into the model. This approach reduces memory usage and allows the model to converge more quickly. One complete pass of training over all mini-batches is referred to as an ‘epoch’ (Fig. 2). In this research, the batch size was set to 32 and the number of epochs to 100, following Naik et al. (2022). After each complete pass over the training batches with the backpropagation algorithm, the model is validated on the validation dataset once per epoch to prevent overfitting to the training data. Because the validation data are already used during training, 10% of the data is held out as a test set and left untouched until the final evaluation.
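The split and batching described above can be sketched with standard TensorFlow utilities. The snippet below is a minimal illustration rather than the authors’ exact pipeline: the ‘data’ directory layout (one subfolder per class) and the seed are assumptions, and the 20% holdout portion is divided in half to obtain the 10% validation and 10% test sets.

```python
# Minimal sketch of the 80-10-10 split with batch size 32 (illustrative;
# the "data/" directory with one subfolder per class is an assumption).
import tensorflow as tf

IMG_SIZE = (224, 224)
BATCH_SIZE = 32  # as used in this study

# 80% for training; the remaining 20% is held out.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data", validation_split=0.2, subset="training", seed=42,
    image_size=IMG_SIZE, batch_size=BATCH_SIZE)
holdout = tf.keras.utils.image_dataset_from_directory(
    "data", validation_split=0.2, subset="validation", seed=42,
    image_size=IMG_SIZE, batch_size=BATCH_SIZE)

# Split the 20% holdout in half: 10% validation, 10% final test set.
n_batches = tf.data.experimental.cardinality(holdout).numpy()
val_ds = holdout.take(n_batches // 2)
test_ds = holdout.skip(n_batches // 2)
```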

Data preprocessing

This research aims to retain good model performance without time-consuming preprocessing effort by limiting preprocessing to necessary steps such as rescaling, central cropping, resizing, and minimal augmentation. Original images consist of RGB values in the 0-255 range, but such values are too large for the model to process stably. Pixel values were therefore normalized to the range [0, 1] by rescaling with a factor of 1/255. This ensures that the input data is consistent with the requirements of many deep learning models, which perform better with normalized data, and it is a common preprocessing step that aids faster convergence during training.
In this study, cropped images rather than the originals were used to target the fruit of the chili pepper (Gu et al., 2021). A TensorFlow crop function was used to focus on the central part of the image. This reduces the input to a fixed size, ensuring uniformity in the dataset and potentially removing irrelevant background information. Resizing images to smaller dimensions also reduces storage requirements (Chen et al., 2016; Suh et al., 2003). All images were resized to a 224 × 224 input shape to match the input layer of the pre-trained networks, since training requires all images to have the same dimensions. Basic augmentation techniques such as horizontal flipping and random rotation were applied to increase the variability of the training data (Supplementary Fig. 1), which helps improve the generalization capability of the model. Although augmentation does not physically increase the amount of data, random transformations applied to each batch provide a genuine augmentation effect.
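Under these assumptions, the preprocessing and augmentation steps might look like the sketch below. The central-crop fraction (0.8) and the rotation range are illustrative values not stated in the paper; `train_ds` and `val_ds` are the datasets from the split sketch above.

```python
# Sketch of preprocessing: central crop, resize to 224 x 224, rescale to
# [0, 1], plus on-the-fly horizontal flips and small random rotations.
import tensorflow as tf

def preprocess(image, label):
    image = tf.image.central_crop(image, central_fraction=0.8)  # assumed fraction
    image = tf.image.resize(image, [224, 224])  # match pre-trained input layer
    return image / 255.0, label                 # rescale with factor 1/255

augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),   # horizontal flipping
    tf.keras.layers.RandomRotation(0.1),        # small random rotations (assumed range)
])

# Augmentation is applied per batch during training only: the stored dataset
# does not grow, but every epoch sees newly transformed images.
train_ds = train_ds.map(preprocess).map(lambda x, y: (augment(x, training=True), y))
val_ds = val_ds.map(preprocess)
```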

Model architecture

Using pre-trained models is a common practice in deep learning to transfer knowledge from large datasets, such as ImageNet (Deng et al., 2009), to specific tasks. This approach often results in faster convergence and improved performance compared to training from scratch (Yosinski et al., 2014). A model creation function was designed that allows selection among different pre-trained models: MobileNet, ResNet50v2, and Xception. Each network was used without its top classification layers (‘include_top=False’), allowing custom top layers to be added for the task at hand. Depending on the specified model type, the function selects one of the three pre-trained base models, which serves as a feature extractor. The function then adds several layers to enhance the model’s performance: global average pooling, a dense layer, dropout for regularization, and a final dense layer for classification, as sketched in the code after the list below.
  • (1) GlobalAveragePooling2D layer: This layer reduces the spatial dimensions of the feature maps, lowering the total number of parameters in the model and making it less prone to overfitting.

  • (2) Dense layer with 256 units and ReLU activation: A fully connected layer that introduces non-linearity to the model and connects each neuron in the previous layer to each neuron in this layer, enabling it to learn complex patterns.

  • (3) Dropout layer with 0.2 rate: A regularization technique that randomly drops 20% of the units in the previous layer during training, helping to prevent overfitting (Liu et al., 2023; Srivastava et al., 2014).

  • (4) Final dense layer with sigmoid activation: This layer outputs the class probability. For binary classification tasks, the output layer has a single neuron with a sigmoid activation function, which outputs a probability value between 0 and 1.
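A minimal sketch of such a model creation function is shown below. The function name and default argument are illustrative assumptions, while the layer stack follows items (1)-(4) above.

```python
# Sketch of the model creation function: a pre-trained base without its top
# layers, followed by the custom head described in items (1)-(4).
import tensorflow as tf

def create_model(model_type="MobileNet", input_shape=(224, 224, 3)):
    bases = {
        "MobileNet": tf.keras.applications.MobileNet,
        "ResNet50v2": tf.keras.applications.ResNet50V2,
        "Xception": tf.keras.applications.Xception,
    }
    # Pre-trained on ImageNet, top classification layers excluded.
    base = bases[model_type](include_top=False, weights="imagenet",
                             input_shape=input_shape)

    inputs = tf.keras.Input(shape=input_shape)
    x = base(inputs)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)        # (1) pool feature maps
    x = tf.keras.layers.Dense(256, activation="relu")(x)   # (2) fully connected
    x = tf.keras.layers.Dropout(0.2)(x)                    # (3) 20% dropout
    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)  # (4) binary output
    return tf.keras.Model(inputs, outputs)
```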

Model training

The training was conducted for up to 100 epochs, with the training and validation generators providing the input data. The model’s performance was monitored using several callbacks to ensure optimal training and prevent overfitting. The ModelCheckpoint callback was configured to save the best model based on validation loss, so that the best-performing weights are retained during training. Checkpointing prevents the loss of the best model parameters and allows training to be paused and resumed, which is crucial for long-running experiments. The EarlyStopping callback (Prechelt, 1998) was used to stop training when no improvement in validation loss was observed for 10 consecutive epochs, with the best weights being restored (Wang et al., 2017). Early stopping is a regularization technique that prevents overfitting by terminating the training process, avoiding unnecessary computation and saving time. The ReduceLROnPlateau callback (Smith et al., 2018) was employed to reduce the learning rate by a factor of 0.1 if the validation loss did not improve for 5 epochs, helping the model converge more effectively.
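The three callbacks might be configured as below, using the patience and factor values stated above; the checkpoint filename is an assumption.

```python
# Sketch of the callback configuration described in the text.
import tensorflow as tf

callbacks = [
    # Save the best-performing weights, judged by validation loss.
    tf.keras.callbacks.ModelCheckpoint("best_model.keras",
                                       monitor="val_loss", save_best_only=True),
    # Stop after 10 epochs without improvement and restore the best weights.
    tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=10,
                                     restore_best_weights=True),
    # Reduce the learning rate by a factor of 0.1 after 5 stagnant epochs.
    tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.1,
                                         patience=5),
]
```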
The model was compiled with the AdamW optimizer (Loshchilov and Hutter, 2019), binary cross-entropy loss, and accuracy as the evaluation metric. The AdamW optimizer was chosen for its ability to combine the advantages of Adam optimization with weight decay regularization (Bjorck et al., 2021), which gives better generalization and less overfitting than the traditional Adam optimizer (Loshchilov and Hutter, 2019). Binary cross-entropy (Goodfellow et al., 2016) is the appropriate loss function for binary classification problems, measuring the difference between the predicted probabilities and the actual binary labels.
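Putting these pieces together, compilation and training might look as follows. The initial learning rate is an assumption not given in the paper; `create_model`, `callbacks`, `train_ds`, and `val_ds` come from the sketches above, and `tf.keras.optimizers.AdamW` requires TensorFlow 2.11 or later.

```python
# Sketch of compilation and training: AdamW, binary cross-entropy, accuracy,
# up to 100 epochs with the callbacks defined above.
model = create_model("MobileNet")
model.compile(optimizer=tf.keras.optimizers.AdamW(learning_rate=1e-3),  # assumed LR
              loss="binary_crossentropy",
              metrics=["accuracy"])

history = model.fit(train_ds, validation_data=val_ds,
                    epochs=100, callbacks=callbacks)
```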

Model evaluation strategies

A holdout set provided a final estimate of the model’s performance after it had been trained and validated. Holdout sets should never be used to choose between algorithms or to improve or tune them. Two additional test sets were created: one selected from 10% of the largest dataset (written as ‘M-set’ in this paper), and the other selected by hand from early anthracnose symptom images that are difficult to distinguish (written as ‘E-set’ in this paper). Precision, recall, F1-score, and accuracy were used for evaluation, as defined below.
$$\text{Precision} = \frac{TP}{TP + FP} \tag{1}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{2}$$

$$\text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{3}$$

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{4}$$

where TP, true positive; FP, false positive; TN, true negative; FN, false negative.
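As a worked illustration of Eqs. (1)-(4), the metrics can be computed directly from the confusion counts on the holdout set; thresholding the sigmoid output at 0.5 is a standard assumption.

```python
# Compute Eqs. (1)-(4) from predicted probabilities and true binary labels.
import numpy as np

def binary_metrics(y_true, y_prob, threshold=0.5):
    y_true = np.asarray(y_true).astype(int)
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)
    tp = np.sum((y_pred == 1) & (y_true == 1))  # true positives
    fp = np.sum((y_pred == 1) & (y_true == 0))  # false positives
    tn = np.sum((y_pred == 0) & (y_true == 0))  # true negatives
    fn = np.sum((y_pred == 0) & (y_true == 1))  # false negatives
    precision = tp / (tp + fp)                  # Eq. (1)
    recall = tp / (tp + fn)                     # Eq. (2)
    f1 = 2 * precision * recall / (precision + recall)  # Eq. (3)
    accuracy = (tp + tn) / (tp + tn + fp + fn)  # Eq. (4)
    return precision, recall, f1, accuracy
```

For example, `binary_metrics([1, 0, 1, 1], [0.9, 0.2, 0.4, 0.8])` yields precision 1.0, recall 0.667, F1-score 0.8, and accuracy 0.75.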

Statistical analysis with various test sets

Statistical tests were used to analyze the inference results. The goal was to compare the metric values (precision, recall, F1-score, and accuracy) across the different amounts of data (500, 1,000, 2,000, 3,000, and 4,000 samples). Two distinct test sets were employed to evaluate the performance metrics of the different models: the maximum-quantity test set (‘M-set’) and the early-symptom test set (‘E-set’). The ‘M-set’ was established from the test split of the 4,455-sample dataset and contains 446 samples; it serves as a benchmark for comparing model performance across dataset sizes, ensuring consistency in evaluation. The ‘E-set’ was specifically designed to include samples with early symptoms; it consists of 100 samples and assesses model performance in identifying early symptoms of anthracnose (Supplementary Fig. 2). Microsoft Excel (Microsoft 365) was used to organize the raw data. Comparison of data means between treatments was performed with Duncan’s test at the 0.05 probability level using IBM SPSS Statistics 29.0 (IBM Corp., Armonk, NY, USA) (Supplementary Tables 1 and 2).

Machine learning algorithms for visualization

The t-distributed stochastic neighbor embedding (t-SNE) technique (Van der Maaten and Hinton, 2008) was used to visualize features extracted from the fully connected layer of each model. The main goal of t-SNE is to preserve neighborhood structure when mapping multi-dimensional points onto a two-dimensional plot, so that points close together in the original high-dimensional space stay close in the resulting projection. This allowed a straightforward view of the distribution of features between classes. The visualization function takes data as input and extracts features with a model that connects the input of the given model to the output of its ‘global_average_pooling2d’ layer. After feature extraction, dimensionality reduction to two dimensions was performed with t-SNE, and the data points were colored by class and drawn as a scatter plot.
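A sketch of this visualization procedure follows. The layer name ‘global_average_pooling2d’ matches the default Keras naming assumed in the model sketch above, the perplexity value is an illustrative choice, and the class-to-color mapping assumes the alphabetical folder ordering from the split sketch.

```python
# Extract features at the global-average-pooling layer, embed them in 2-D
# with t-SNE, and color the points by class.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE
import tensorflow as tf

feature_model = tf.keras.Model(
    inputs=model.input,
    outputs=model.get_layer("global_average_pooling2d").output)

features = feature_model.predict(test_ds)              # (n_samples, n_features)
labels = np.concatenate([y.numpy() for _, y in test_ds])

embedded = TSNE(n_components=2, perplexity=30,         # assumed perplexity
                random_state=0).fit_transform(features)

# Class 0 ('anthracnose' under alphabetical folder order) in red, 'healthy' in blue.
colors = np.where(labels == 0, "red", "blue")
plt.scatter(embedded[:, 0], embedded[:, 1], c=colors, s=5)
plt.xlabel("t-SNE Component 1")
plt.ylabel("t-SNE Component 2")
plt.show()
```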

Results

MobileNet. The performance metrics of the MobileNet model were evaluated with varying amounts of data. ANOVA indicated statistically significant differences in all evaluated performance metrics (accuracy, F1-score, precision, and recall) between the different sample sizes (500, 1,000, 2,000, 3,000, and 4,000 samples), showing that sample size has a significant impact on the model’s performance. Statistical analysis was conducted to identify significant differences between the groups. All metrics show higher values with more data, indicating better overall model performance.
In the ‘M-set’ test, precision did not show significant differences among the different data sizes (Fig. 3A). However, the other metrics (recall, F1-score, and accuracy) fell into the best-performance subset (marked ‘a’) from 1,000 data points onward, indicating the model’s suitability for general applications in real-world scenarios (Fig. 3C, E, and G). In the ‘E-set’ test, precision was lowest at 500 data points but rose significantly as data size increased, stabilizing at high values from 2,000 data points onward (Fig. 3B). Recall at 2,000 data points increased slightly compared with the previous data point, and the F1-score and accuracy showed a significant increase at 2,000 data points (Fig. 3F and H). For the MobileNet model, while training with 1,000 data points is practicable, it is not sufficient to extract less distinctive features effectively. Therefore, training with at least 2,000 data points is recommended for better performance and stability.

ResNet50v2

The trend of increasing metrics with more data is more obvious than for MobileNet, as shown by the inference results with the ‘M-set’ (Fig. 4A, C, E, and G). The recall value for this model is higher than for the other metrics and models (Fig. 4C). Since recall reflects the rate of false negatives, it is a crucial metric for plant disease detection models. The homogeneous subsets indicating the best performance (e.g., ‘a’ or ‘ab’) start at different data sizes for each metric: 1,000 samples for precision (Fig. 4A), 2,000 samples for recall (Fig. 4C), and 3,000 samples for both F1-score (Fig. 4E) and accuracy (Fig. 4G). This variability makes it difficult to draw a single definitive conclusion, so the ‘E-set’ inference values were examined for comparison. In the ‘E-set’ results, precision fell into a similar subset with the best performance starting from 2,000 samples (Fig. 4B). For the other metrics, only the maximum data size showed significantly high values, indicating that ResNet50v2 benefits substantially from larger datasets (Fig. 4D, F, and H).

Xception

The significant differences between data sizes are less clear for this model because the standard deviation at 500 data points was high (Fig. 5A, C, E, and G). For the F1-score, the model shows a difference starting from the 1,000-sample dataset with the ‘M-set’ (Fig. 5E). With the ‘E-set’, precision and accuracy performed well from the 3,000-sample dataset onward (Fig. 5B and H), and the F1-score was higher at 1,000 and 4,000 data points than at smaller data sizes. Xception maintained high precision and accuracy from 3,000 samples onward, demonstrating its capacity to accurately detect significant features.

t-SNE visualization

The t-SNE plots reveal how well each model distinguishes between the anthracnose and healthy classes as training data increases, highlighting the effectiveness of each architecture in feature discrimination. Generally, the more data points, the more pronounced the separation between classes. For MobileNet (Fig. 6, first row), the classes are partially separated but still show substantial overlap at 500 data points. At 1,000 data points, the clusters become more distinct, though some overlap remains. From 2,000 data points, the separation improves, with clearer clustering of anthracnose and healthy samples. For ResNet50v2 (Fig. 6, second row), there is substantial overlap between the classes from 500 to 2,000 data points; after 3,000 data points, the separation improves markedly, with more defined clusters. For Xception (Fig. 6, third row), the classes show noticeable overlap at 500 data points, as in the other models. From 1,000 to 3,000 data points, the clusters are well defined with minimal overlap, and at 4,000 data points the classes are clearly separated into distinct clusters.

Discussion

Deep learning techniques have revolutionized the field of image analysis. These techniques, including CNNs, have demonstrated exceptional performance in image classification tasks (Kaya and Gürsoy, 2023). The integration of RGB imaging with deep learning presents a promising approach to plant disease detection: the accessibility of RGB imaging, combined with the power of pre-trained deep learning models, enables efficient and accurate identification of plant diseases. Ongoing research and development aim to adapt these technologies to field conditions, ultimately enhancing their practical application in agriculture.
In this research, these techniques were applied to diagnose chili pepper anthracnose. Our evaluation of the performance metrics of the MobileNet, ResNet50v2, and Xception models across varying data sizes provides several key insights into their effectiveness for plant disease detection. MobileNet demonstrated superior general inference ability with smaller datasets compared to ResNet50v2 and Xception; its ability to perform well with limited data is noteworthy, particularly for scenarios where data collection is challenging. ResNet50v2 and Xception, while requiring more data to achieve optimal performance, show significant improvements in metrics as data size increases, suggesting that larger datasets are beneficial for detailed and accurate plant disease detection. A high recall value, as seen in ResNet50v2, indicates a model’s ability to detect the majority of actual disease cases, minimizing the risk of overlooking infected plants. According to the t-SNE results, Xception trained with 4,000 data points was the best feature extractor.
In conclusion, while all three models exhibit strengths in different areas, the choice of model should be guided by the specific requirements of the application, including data availability, the need for early and accurate detection, and computational resources. This approach not only optimizes the use of resources but also bridges the gap between agricultural research and computer science, facilitating more unified and productive collaboration. For agricultural researchers, it provides clear guidelines on the amount and type of data required to train effective deep learning models. For computer scientists, it can reduce the burden of data preprocessing and offer insight into the challenges of working with real-world agricultural data. This study aims to enhance the efficiency and effectiveness of collaborative research efforts, ultimately contributing to the advancement of agricultural practices through the application of deep learning technologies.
However, the black-box nature of deep learning models makes it difficult to understand how they classify data so effectively. Although the model inference was visualized, the underlying decision-making processes remain opaque. The rapid development of AI, including explainable AI (XAI), aims to provide insight into the decision-making processes of deep neural networks. Hyperparameter tuning was not implemented for each model, leaving room for performance improvement; once a minimum dataset size is established, hyperparameter tuning can be performed to improve model accuracy. In addition, exploring more advanced models and ensemble models that combine the strengths of different architectures could provide greater accuracy and robustness in plant disease detection (Wu et al., 2020). Notably, the transformer model introduced in 2017 serves as the fundamental architecture of ChatGPT (Hue et al., 2024). Although originally designed for language or sequence data, this architecture has numerous applications in computer vision, such as the Vision Transformer (ViT) (Borhani et al., 2022).
Generative AI, particularly generative adversarial networks (GANs), presents another promising advancement. GANs can generate synthetic data, which can be used to train models and enhance their generalization performance (Park et al., 2020). To address data shortage, Park et al. (2020) evaluated the impact of deep convolutional generative adversarial network (DCGAN) image augmentation on the performance of a CNN-based tomato disease classifier trained on unbalanced image data, and found that such augmentation can improve accuracy by up to 30%.
Considering the rapid advancements in deep learning, significant progress in the field of plant pathology is expected. The developed models should be capable of accurately classifying chili pepper fruits as either diseased by anthracnose or healthy, with the ability to adapt to field conditions. While deep learning has achieved remarkable success in various domains, significant gaps remain in our understanding of the underlying factors contributing to this success. Future research should address these gaps to further improve the effectiveness and reliability of deep learning applications in agriculture.

Notes

Conflicts of Interest

No potential conflict of interest relevant to this article was reported.

Acknowledgments

This work was supported by the Korea Institute of Planning and Evaluation for Technology in Food, Agriculture, Forestry and Fisheries (IPET), Ministry of Agriculture, Food and Rural Affairs, Republic of Korea (Project No. 120080-05).

Electronic Supplementary Material

Supplementary materials are available at The Plant Pathology Journal website (http://www.ppjonline.org/).

Fig. 1
Overview of the proposed system for chili pepper fruit disease detection using pre-trained models. The process begins with a dataset of chili fruit images, which are first resized and then split into training, validation, and test sets. The resized images are fed into a deep learning pipeline that includes three pre-trained convolutional neural networks: MobileNet, ResNet50v2, and Xception. These models undergo transfer learning, with their weights fine-tuned for the specific task of classifying chili peppers as either diseased or healthy. The architecture of each model includes layers for global average pooling, dense connections, dropout, and a sigmoid activation function to produce the final binary classification. The performance of the trained models is then evaluated to assess their accuracy in detecting diseases in chili pepper fruits.
Fig. 2
Overview of dataset partitioning and the concepts of batch and epoch. The dataset is divided into three subsets: 80% for training, 10% for validation (Val), and 10% for testing (Test). During training, the training set is further divided into smaller units called batches (batch size of 32). A single epoch refers to one complete cycle through the entire training set. The diagram illustrates that the model undergoes 100 epochs, meaning the entire training set is passed through the model 100 times, helping it to learn and improve its performance.
Fig. 3
The inference results of MobileNet with the ‘M-set’ (A, C, E, G) and the ‘E-set’ (B, D, F, H). Mean values of the performance metrics of the MobileNet model with varying amounts of data. (A, B) Precision. (C, D) Recall. (E, F) F1-score. (G, H) Accuracy. Different letters indicate statistically significant differences among data points and “ns” indicates non-significant differences (P < 0.05, Duncan’s test). Statistical analysis was conducted with SPSS 29.0 (IBM).
Fig. 4
The inference results of ResNet50v2 with the ‘M-set’ (A, C, E, G) and the ‘E-set’ (B, D, F, H). Mean values of the performance metrics of the ResNet50v2 model with varying amounts of data. (A, B) Precision. (C, D) Recall. (E, F) F1-score. (G, H) Accuracy. Different letters indicate statistically significant differences among data points and “ns” indicates non-significant differences (P < 0.05, Duncan’s test). Statistical analysis was conducted with SPSS 29.0 (IBM).
Fig. 5
The inference results of Xception with the ‘M-set’ (A, C, E, G) and the ‘E-set’ (B, D, F, H). Mean values of the performance metrics of the Xception model with varying amounts of data. (A, B) Precision. (C, D) Recall. (E, F) F1-score. (G, H) Accuracy. Different letters indicate statistically significant differences among data points and “ns” indicates non-significant differences (P < 0.05, Duncan’s test). Statistical analysis was conducted with SPSS 29.0 (IBM).
Fig. 6
t-distributed stochastic neighbor embedding (t-SNE) visualization of feature embeddings for each model. This figure projects the high-dimensional feature space onto two components. The x-axis (t-SNE Component 1) and y-axis (t-SNE Component 2) do not represent specific variables in the original feature space; they are abstract dimensions optimized to preserve the local and global structure of the data in the reduced 2D space. Each point corresponds to a data instance, where proximity between points reflects similarity in the original feature space. Red dots represent instances the model inferred as pepper fruit infected by anthracnose, while blue dots indicate healthy chili pepper fruit. If the model performs well, these markers should be clearly separated in the feature space. Zoom in for better visibility.
Table 1
Five different dataset configurations based on the amount of data
Data    Class         Training  Validation  Test  Total
4,000   Anthracnose   1,665     185         206   2,056
        Healthy       1,943     216         240   2,399
3,000   Anthracnose   1,215     135         150   1,500
        Healthy       1,215     135         150   1,500
2,000   Anthracnose   810       90          100   1,000
        Healthy       810       90          100   1,000
1,000   Anthracnose   405       45          50    500
        Healthy       405       45          50    500
500     Anthracnose   202       23          25    250
        Healthy       203       22          25    250

References

Ahmad, A., El Gamal, A. and Saraswat, D. 2023. Toward generalization of deep learning-based plant disease identification under controlled and field conditions. IEEE Access 11:9042-9057.
Ali, A., Bordoh, P. K., Singh, A., Siddiqui, Y. and Droby, S. 2016. Post-harvest development of anthracnose in pepper (Capsicum spp): etiology and management strategies. Crop Prot. 90:132-141.
Aljawasim, B. D., Samtani, J. B. and Rahman, M. 2023. New insights in the detection and management of anthracnose diseases in strawberries. Plants 12:3704.
Bjorck, J., Weinberger, K. Q. and Gomes, C. 2021. Understanding decoupled and early weight decay. Proc. AAAI Conf. Artif. Intell. 35:6777-6785.
Borhani, Y., Khoramdel, J. and Najafi, E. 2022. A deep learning based approach for automated plant disease classification using vision transformer. Sci. Rep. 12:11554.
Chen, J., Bai, G., Liang, S. and Li, Z. 2016. Automatic image cropping: a computational complexity study. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 507-515. Institute of Electrical and Electronics Engineers, New York, USA.
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K. and Fei-Fei, L. 2009. ImageNet: a large-scale hierarchical image database. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 248-255. Institute of Electrical and Electronics Engineers, New York, USA.
Food and Agriculture Organization of the United Nations. 2022. Crops and livestock products. The FAO Statistical Database-Agriculture. Food and Agriculture Organization of the United Nations, Rome, Italy.
Fox, R. T. V. and Narra, H. P. 2006. Plant disease diagnosis. In: The epidemiology of plant diseases, eds. by B. Cooke, D. Jones and B. Kaye, pp. 1-42. Springer, Dordrecht, Netherlands.
Freeman, S., Katan, T. and Shabi, E. 1998. Characterization of Colletotrichum species responsible for anthracnose diseases of various fruits. Plant Dis. 82:596-605.
Goodfellow, I., Bengio, Y. and Courville, A. 2016. Deep learning. MIT Press, Cambridge, MA, USA. 800 pp.
Gu, Y. H., Yin, H., Jin, D., Park, J.-H. and Yoo, S. J. 2021. Image-based hot pepper disease and pest diagnosis using transfer learning and fine-tuning. Front. Plant Sci. 12:724487.
Hue, Y., Kim, J. H., Lee, G., Choi, B., Sim, H., Jeon, J., Ahn, M.-I., Han, Y. K. and Kim, K.-T. 2024. Artificial intelligence plant doctor: plant disease diagnosis using Gpt4-vision. Res. Plant Dis. 30:99-102.
Ioffe, S. and Szegedy, C. 2015. Batch normalization: accelerating deep network training by reducing internal covariate shift. In: Proceedings of the 32nd International Conference on Machine Learning, eds. by F. Bach and D. Blei, pp. 448-456. JMLR.org, Lille, France.
Kamilaris, A. and Prenafeta-Boldú, F. X. 2018. Deep learning in agriculture: a survey. Comput. Electron. Agric. 147:70-90.
Kaya, Y. and Gürsoy, E. 2023. A novel multi-head CNN design to identify plant diseases using the fusion of RGB images. Ecol. Inform. 75:101998.
Kim, S.-H., Choi, J., Choi, Y.-J., Park, B.-Y., Lee, S.-H., Kim, G. H., Kong, H. G., Kim, D., Kim, S., Kim, Y., Back, C.-G., Byun, H.-S., Seo, J. K., Yu, J. M., Yoon, J.-Y., Lee, D.-H., Lee, S.-Y., Lim, S., Jeon, Y., Chun, J., Choi, I., Choi, I.-Y., Choi, H.-W., Hong, J. S. and Hong, S.-B. 2023. Introduction of List of Plant Diseases in Korea 6.1st edition (2023 revised version). Res. Plant Dis. 29:331-344 (in Korean).
Kiran, R., Akhtar, J., Kumar, P. and Shekhar, M. 2020. Anthracnose of chilli: status, diagnosis, and management. In: Capsicum, ed. by A. Dekebo, IntechOpen, London, UK.
Liu, Z., Xu, Z., Jin, J., Shen, Z. and Darrell, T. 2023. Dropout reduces underfitting. In: Proceedings of the 40th International Conference on Machine Learning, eds. by A. Krause, E. Brunskill, K. Cho, B. Engelhardt, S. Sabato and J. Scarlett, pp. 22233-22248. JMLR.org, Honolulu, HI, USA.
López, M. M., Bertolini, E., Olmos, A., Caruso, P., Gorris, M. T., Llop, P., Penyalver, R. and Cambra, M. 2003. Innovative tools for detection of plant pathogenic viruses and bacteria. Int. Microbiol. 6:233-243.
Loshchilov, I. and Hutter, F. 2019. Decoupled weight decay regularization. Preprint at arXiv: https://arxiv.org/abs/1711.05101.
McCann, M. T., Jin, K. H. and Unser, M. 2017. Convolutional neural networks for inverse problems in imaging: a review. IEEE Signal Process. Mag. 34:85-95.
Ministry of Agriculture, Food and Rural Affairs. 2022. Agricultural and forestry production index. Korean Statistical Information Service. URL https://www.mafra.go.kr/bbs/home/798/569354/artclView.do [12 November 2024].
Mohanty, S. P., Hughes, D. P. and Salathé, M. 2016. Using deep learning for image-based plant disease detection. Front. Plant Sci. 7:1419.
Naik, B. N., Malmathanraj, R. and Palanisamy, P. 2022. Detection and classification of chilli leaf disease using a squeeze-and-excitation-based CNN model. Ecol. Inform. 69:101663.
Park, J.-Y., Kim, H.-J. and Kim, K. 2020. Accessing impact of DCGAN image data augmentation for CNN based tomato disease classification. J. Digit. Contents Soc. 21:959-967.
Prechelt, L. 1998. Automatic early stopping using cross validation: quantifying the criteria. Neural Netw. 11:761-767.
Ramcharan, A., Baranowski, K., McCloskey, P., Ahmed, B., Legg, J. and Hughes, D. P. 2017. Deep learning for image-based cassava disease detection. Front. Plant Sci. 8:1852.
Sanida, M. V., Sanida, T., Sideris, A. and Dasygenis, M. 2023. An efficient hybrid CNN classification model for tomato crop disease. Technologies 11:10.
Sankaran, S., Mishra, A., Ehsani, R. and Davis, C. 2010. A review of advanced techniques for detecting plant diseases. Comput. Electron. Agric. 72:1-13.
Smith, S. L., Kindermans, P.-J., Ying, C. and Le, Q. V. 2018. Don't decay the learning rate, increase the batch size. In: International Conference on Learning Representations (ICLR 2018), Vancouver, Canada.
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I. and Salakhutdinov, R. 2014. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15:1929-1958.
Suh, B., Ling, H., Bederson, B. B. and Jacobs, D. W. 2003. Automatic thumbnail cropping and its effectiveness. In: Proceedings of the 16th Annual ACM Symposium on User Interface Software and Technology: UIST '03, pp. 95-104. Association for Computing Machinery, Vancouver, Canada.
Van der Maaten, L. and Hinton, G. 2008. Visualizing data using t-SNE. J. Mach. Learn. Res. 9:2579-2605.
Wang, G., Sun, Y. and Wang, J. 2017. Automatic image-based plant disease severity estimation using deep learning. Comput. Intell. Neurosci. 2017:2917536.
Wu, Q., Ji, M. and Deng, Z. 2020. Automatic detection and severity assessment of pepper bacterial spot disease via multimodels based on convolutional neural networks. Int. J. Agric. Environ. Inf. Syst. 11:29-43.
Xiang, Q., Guo, W., Tang, X., Cui, S., Zhang, F., Liu, X., Zhao, J., Zhang, H., Mao, B. and Chen, W. 2021. Capsaicin—the spicy ingredient of chili peppers: a review of the gastrointestinal effects and mechanisms. Trends Food Sci. Technol. 116:755-765.
Yosinski, J., Clune, J., Bengio, Y. and Lipson, H. 2014. How transferable are features in deep neural networks? In: Advances in Neural Information Processing Systems 27 (NIPS 2014), eds. by Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence and K. Q. Weinberger, pp. 3320-3328. Curran Associates, Red Hook, NY, USA.
Zeng, Y., Zhao, Y., Yu, Y., Tang, Y. and Tang, Y. 2021. Pepper disease detection model based on convolutional neural network and transfer learning. IOP Conf. Ser. Earth Environ. Sci. 792:012001.