
Proceedings of 2018 Eleventh International Conference on Contemporary Computing (IC3), 2-4 August, 2018, Noida, India

Tomato Leaf Disease Detection using Convolutional Neural Networks

Prajwala TM, Alla Pranathi, Kandiraju Sai Ashritha, Nagaratna B. Chittaragi*, Shashidhar G. Koolagudi
Department of Computer Science and Engineering, National Institute of Technology Karnataka, Surathkal
Email: {prajwala.tm, pakasarka, ashritha1615, nbchittaragi}@gmail.com, [email protected]

*Dept. of Information Science and Engg., Siddaganga Institute of Technology, Tumkur

978-1-5386-6835-1/18/$31.00 ©2018 IEEE

Abstract—The tomato crop is an important staple in the Indian market with high commercial value and is produced in large quantities. Diseases are detrimental to the plant’s health, which in turn affects its growth. To ensure minimal losses to the cultivated crop, it is crucial to supervise its growth. Numerous types of tomato diseases target the crop’s leaf at an alarming rate. This paper adopts a slight variation of the convolutional neural network model called LeNet to detect and identify diseases in tomato leaves. The main aim of the proposed work is to find a solution to the problem of tomato leaf disease detection using the simplest approach while making use of minimal computing resources to achieve results comparable to state of the art techniques. Neural network models employ automatic feature extraction to aid in the classification of the input image into respective disease classes. The proposed system has achieved an average accuracy of 94-95%, indicating the feasibility of the neural network approach even under unfavourable conditions.

Keywords—leaf disease detection, neural network, convolution, LeNet

I. INTRODUCTION

India is a country with a majority of the population relying heavily on the agricultural sector. Tomato is the most common vegetable used across India. The three most important antioxidants, namely vitamin E, vitamin C and beta-carotene, are present in tomatoes. They are also rich in potassium, a very important mineral for good health. Tomato crop cultivation in India spans approximately 350,000 hectares and production sums up to roughly 5,300,000 tons, making India the third largest tomato producer in the world. The sensitivity of crops coupled with climatic conditions has made diseases common in the tomato crop during all the stages of its growth. Disease affected plants constitute 10-30% of the total crop loss. Identification of such diseases in the plant is very important in preventing heavy losses in yield as well as in the quantity of the agricultural product. Monitoring plant diseases manually is a difficult task due to its complex nature and is a time consuming process. Therefore, there is a need to reduce the manual effort put into this task, while making accurate predictions and ensuring that the farmers’ lives are hassle free.

Visually observable patterns are difficult to decipher at a single glance, leading many farmers to make inaccurate assumptions regarding the disease. As a result, prevention mechanisms taken by the farmers may be ineffective and sometimes harmful. Farmers usually come together and implement common disease prevention mechanisms, as they lack expert advice on how to deal with their crop infestation [2]. There have been circumstances where, due to inadequate knowledge or misinterpretation regarding the intensity of the disease, over-dosage or under-dosage of the pesticide has resulted in crop damage. This is the underlying motivation for the proposed methodology, which aims to accurately detect and classify diseases in the tomato crop.

The methodology suggested in the paper pertains to the most common diseases found in the tomato plant, such as Bacterial leaf spot, Septoria leaf spot and Yellow Leaf Curl, among many others. Any leaf image given as input can be classified into one of the disease classes or can be deemed healthy. The database used for evaluation is a subset of Plant Village [6], a repository that contains 54,306 images of 14 crops infested with 26 diseases. The subset includes around 18160 images of tomato leaf diseases.

Broadly, the proposed methodology consists of three major steps: data acquisition, pre-processing and classification. The images used for the implementation of the proposed methodology were acquired from a publicly available dataset called Plant Village, as mentioned earlier. In the next step, the images were re-sized to a standard size before feeding them into the classification model. The final step is the classification of the input images using a slight variation of the standard deep learning convolutional neural network (CNN) model called LeNet, which consists of convolutional, activation, pooling and fully connected layers.

The paper is organized as follows: Section II focuses on the prominent work done in the concerned field. Section III elucidates the proposed methodology and the model used, along with the steps taken to obtain the necessary results. Section IV describes the experimental settings. Section V pertains to the results and the analysis of the proposed methodology. Section VI includes the conclusion of the paper and provides the scope for future work.

II. LITERATURE SURVEY

It is important to recognize the previous research done in this field in order to advance in the right direction. Plant leaf disease detection has been a major research area in which both image processing and deep learning techniques have been widely used for accurate classification. In this paper, we discuss the techniques most popularly incorporated in the literature of the relevant field. Two common tomato plant diseases look like the ones shown in
Fig. 1 and Fig. 2 and healthy tomato leaves are shown in Fig. 3.

Figure 1: Septoria leaf spot

Figure 2: Yellow Leaf Curl

Figure 3: Healthy

Monitoring a large field of crops is a tedious task if done manually. It is necessary to minimize the human effort put into plant supervision. Hence this is a popular research domain attracting many researchers. Several works related to plant diseases are observed in literature.

The authors of the paper [7] have proposed an efficient method that identifies whether a tomato leaf is healthy or infected. The image given as input was first pre-processed by removing the background, and the noise present was eliminated with the help of the erosion technique. Gray Level Co-occurrence Matrix (GLCM) was used for texture feature extraction from the enhanced image. A Support Vector Machine (SVM) classifier was trained using different kernel functions and the performance was evaluated using the N-fold cross-validation technique. The proposed system achieved an accuracy of 99.83% using the linear kernel function with the SVM classifier. Even though the obtained accuracy is high, it is not sufficient to predict or differentiate between healthy and diseased leaves. Also, the type of disease was not identified.

In order to overcome the problem of the above paper, the authors in [3] have proposed various segmentation, feature extraction and classification techniques that identify and detect the type of the disease using the diseased image to conduct classification. The leaf image given as input to the system was pre-processed by smoothing it or by enhancing it through histogram equalization. To obtain the affected area, different segmentation techniques like K-Means clustering have been proposed. The features were then extracted from the segmented region and calculated using GLCM. After feature extraction, the diseases can be detected with the help of Artificial Neural Networks (ANN) or Back Propagation Neural Networks. The drawback of segmenting the image using K-Means clustering is that the proposed process is semi-automated, as the user has to explicitly select the cluster which contains the diseased part.

The paper [8] describes a method which makes use of the Gabor wavelet transformation technique for feature extraction, which helps in the disease identification of tomato leaves. The extracted features were input to the SVM classifier for training, which then determines the type of disease of the infected tomato leaf. Resizing of the images, elimination of noise and background removal have been carried out in the pre-processing stage. The paper has made use of the Gabor transformation to identify the textural patterns of the affected leaf and extract appropriate features. Disease classification was carried out using SVM with different kernel functions and performance has been evaluated using the cross-validation technique. An accuracy of 99.5% has been achieved according to the experimental results of the proposed system. The main disadvantage of using the Gabor transformation for feature extraction is that it is computationally intensive.

In [9], the authors have used a simple approach for the classification of the diseased tomato leaves into various classes, namely Tomato late blight, Septoria spot, Bacterial spot, Bacterial canker, Tomato leaf curl and Healthy. A dataset of 383 images captured using a digital camera has been used for the implementation. Otsu’s method for image segmentation has been applied on the dataset. Color features have been obtained using the RGB color components, shape features have been obtained using the regionprops function and texture features have been obtained from GLCM. All the extracted features have been combined to form a feature extraction module. Supervised learning techniques have been used for classification by training the decision tree classifier. Though the accuracy is high, the decision tree has its own set of disadvantages: overfitting in the case of noisy data, and relatively little control for the user over the model.

Deep convolutional neural networks have been trained in [6] for the identification of 26 diseases in 14 different crop species. The authors make use of the standard AlexNet [4] and GoogleNet [10] architectures for this purpose. A public repository which contains 54,306 images of both diseased and healthy plant leaves has been used. The dataset has been created by collecting the images of the plant leaves in a controlled environment.
The authors have conducted a performance analysis on both these architectures by carrying out the model training in two ways: from scratch in the first case and by using transfer learning in the second. Transfer learning corresponds to the process of adapting pre-trained weights obtained by training models on the ImageNet dataset. The model implementation has been carried out using the Caffe framework, giving an accuracy of 99%. This portrays the feasibility of this approach. However, on testing the trained model against a set of sample test images obtained from online public data sources which are quite different from the train set, the model accuracy falls to 31.4%. This is a common problem faced in neural networks, owing to the train and test sets belonging to different distributions.

The authors of [1] propose an approach where they detect and classify banana leaf diseases, namely Banana sigatoka and Banana speckle. They have performed the training of deep learning models under certain challenging conditions. These conditions comprise illumination, complex background, and different image resolution, size and orientation. They effectively demonstrate the accuracy of this approach and the very low computational effort required.

III. PROPOSED METHODOLOGY

The proposed approach includes three important stages, namely data acquisition, data pre-processing and classification. The flow diagram is shown in Fig. 4 and the current section includes brief discussions of the same.

Figure 4: Proposed methodology

A. Data Acquisition

The tomato leaf disease images have been taken from the Plant Village repository [6]. Images for the diseases were downloaded using a Python script. The acquired dataset consists of around 18160 images belonging to 10 different classes. The dataset includes images of all major kinds of leaf diseases that could affect the tomato crop. Each of the downloaded images belongs to the RGB color space by default and was stored in the JPG format.

B. Data pre-processing

The acquired dataset consisted of images with minimal noise and hence noise removal was not a necessary pre-processing step. The images in the dataset were resized to 60 × 60 resolution in order to speed up the training process and make the model training computationally feasible.

The process of standardizing either the input or target variables tends to speed up the training process. This is done through the improvement of the numerical condition of the optimization problem. It also helps ensure that the several default values involved in initialization and termination are appropriate. For our purpose, we normalize the images to get all the pixel values in the same range by using the mean and the standard deviation, that is, each pixel value x is replaced by (x − μ)/σ, where μ is the mean and σ the standard deviation. In machine learning terms, this is called the Z-score.

C. Classification

Convolutional neural networks (CNN) can be used for the creation of a computational model that works on unstructured image inputs and converts them to corresponding classification output labels. They belong to the category of multi-layer neural networks which can be trained to learn the required features for classification purposes. They require less pre-processing in comparison to traditional approaches and perform automatic feature extraction, which gives better performance. For the purpose of tomato leaf disease detection, we have experimented with several standard deep learning architectures like AlexNet [4] and GoogleNet [10]; the best results were seen with a variation of the LeNet architecture [5].

LeNet is a simple CNN model that consists of convolutional, activation, pooling and fully connected layers. The architecture used for the classification of the tomato leaf diseases is a variation of the LeNet model. It consists of an additional block of convolutional, activation and pooling layers in comparison to the original LeNet architecture. The model used in this paper is shown in Fig. 5.

Figure 5: Model architecture

Each block consists of a convolutional, an activation and a max pooling layer. Three such blocks followed by fully connected layers and softmax activation are used in this architecture. Convolutional and pooling layers are used for feature extraction whereas the fully connected layers are used for classification. Activation layers are used for introducing non-linearity into the network.

The convolutional layer applies the convolution operation for extraction of features. With the increase in depth, the complexity of the extracted features increases. The size of the filter is fixed to 5 × 5 whereas the number of filters is increased progressively from one block to another. The number of filters is 20 in the first convolutional block, 50 in the second and 80 in the third. This increase in the number of filters is necessary to compensate for the reduction in the size of the feature maps caused by the use of pooling layers in each of the blocks. The feature maps are also zero padded in order to preserve the size of the image after the application of the convolution operation. The max pooling layer is used for reducing the size of the feature maps, speeding up the training process, and making the model less sensitive to minor changes in the input. The kernel size for max pooling is 2 × 2. A ReLU activation layer is used in each of the blocks for the introduction of non-linearity. Also, the Dropout regularization technique has been used with a keep probability of 0.5 to avoid overfitting the train set. Dropout regularization randomly drops neurons in the network during each iteration of training in order to reduce the variance of the model and simplify the network, which aids in prevention of overfitting. Finally, the classification block consists of two fully connected layers with 500 and 10 neurons respectively. The second dense layer is followed by a softmax activation function to compute the probability scores for the ten classes.
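
The layer sizes quoted in this section can be checked with a few lines of arithmetic. The following is a minimal sketch rather than the authors' implementation. It assumes stride-1 convolutions with zero ("same") padding, 2 × 2 max pooling with stride 2 that discards an odd border row or column, and a direct flatten into the 500-neuron layer. None of these details is stated in the paper, and all function names are hypothetical.

```python
# Sketch: feature-map sizes and trainable parameter count for the
# three-block LeNet variant. Stride, padding and flatten behaviour are
# ASSUMPTIONS (not given in the paper); all names are hypothetical.

def conv_same(size):
    # Assumed: stride 1 with zero padding, so the spatial size is unchanged.
    return size

def max_pool(size, kernel=2):
    # Assumed: stride equal to the kernel size, no padding (floor division).
    return size // kernel

def conv_params(in_channels, out_channels, kernel=5):
    # kernel x kernel weights per input/output channel pair, plus one bias per filter.
    return kernel * kernel * in_channels * out_channels + out_channels

def dense_params(n_in, n_out):
    # Full weight matrix plus one bias per output neuron.
    return n_in * n_out + n_out

def trace(input_size=60, in_channels=3, filters=(20, 50, 80), dense=(500, 10)):
    size, channels, total = input_size, in_channels, 0
    shapes = []
    for n_filters in filters:
        total += conv_params(channels, n_filters)
        size = max_pool(conv_same(size))
        channels = n_filters
        shapes.append((size, size, channels))
    n_in = size * size * channels  # flattened output of the last block
    for n_out in dense:
        total += dense_params(n_in, n_out)
        n_in = n_out
    return shapes, total

shapes, n_params = trace()
print(shapes)    # [(30, 30, 20), (15, 15, 50), (7, 7, 80)]
print(n_params)  # 2092160
```

Under these assumptions the 60 × 60 input shrinks to 30, 15 and 7 pixels per side across the three blocks, and most of the trainable weights sit in the first fully connected layer.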

IV. EXPERIMENTAL SETTINGS

The implementation of the proposed methodology has been carried out on the Plant Village dataset. It consists of around 18160 images belonging to 10 different classes of tomato leaf diseases. Keras, a neural network API written in Python, has been used for the model implementation.

Out of the 18160 images, 4800 images were set aside for testing and 13360 images were used for training. In order to enlarge the dataset, automatic data augmentation techniques have been used: randomly rotating the images by a small amount (20 degrees), horizontal flipping, and vertical and horizontal shifting of images. The optimization was carried out using the Adam optimizer with categorical cross entropy as the
loss function. A batch size of 20 has been used and the model has been trained for 30 epochs. The initial learning rate has been set to 0.01 and it is reduced by a factor of 0.3 on plateau, where the loss stops decreasing. Early stopping has also been used in order to monitor the validation loss and stop the training process once it increases. All the experiments were performed on an Intel Core i3-4010U CPU.

V. RESULTS AND ANALYSIS

To evaluate the performance of the proposed model, a set of quantitative metrics comprising accuracy, precision, recall and F1-score has been used. The results are reported in Table I. They show the highest values of the quantitative metrics obtained until the corresponding epoch number.

Figure 6: Plots of accuracy and loss against epochs

The highest validation accuracy of 94.8% was obtained over 30 epochs of training, while a high training accuracy of 99.3% was reported. An average validation accuracy of 94% has been obtained. This is an effective measure of the classification made by the deep learning model. The plots of train and test accuracy and loss against the epochs in Fig. 6 provide a means of visualization and an indication of the speed of model convergence. It can be seen that the model has stabilized around 20 epochs and the metrics do not show a significant improvement in the last 10 epochs. The results show that the model performs well on the dataset and can be used as a means for classification of the 10 tomato leaf diseases with minimum resource requirements.
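
The four metrics used in this section can all be derived from a confusion matrix. The sketch below is illustrative only. The paper does not state how precision, recall and F1-score were averaged across the ten classes, so macro averaging is an assumption here. The 3 × 3 matrix is made-up toy data and the function name is hypothetical.

```python
# Sketch: accuracy and macro-averaged precision, recall and F1 from a
# confusion matrix. Macro averaging is an ASSUMPTION (the paper does not
# say which averaging was used); the toy matrix is invented for illustration.

def macro_metrics(confusion):
    """confusion[i][j] counts samples of true class i predicted as class j."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)
    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }

# Hypothetical three-class example with 10 samples per true class.
toy = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 1, 9],
]
scores = macro_metrics(toy)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

For a ten-class problem such as the one studied here, the same function would be called with a 10 × 10 matrix built from the test-set predictions.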
TABLE I. RESULTS AND ANALYSIS

No. of epochs   Accuracy   Precision   Recall   F1-Score
10              0.9041     0.9012      0.9012   0.9012
20              0.9452     0.9449      0.9449   0.9449
30              0.9485     0.9481      0.9481   0.9481

The implementation process has minimal hardware requirements, unlike large neural networks, which generally have high computational resource requirements or need a Graphics Processing Unit. This is due to the smaller number of training parameters, owing to the presence of fewer layers with smaller filter sizes and smaller training images. Unlike other state of the art models, the model implementation can be carried out
on CPU with minimum time owing to its simplicity. Also, the variation of the LeNet model adopted is simple to understand and easy to implement. The model thus provides a simple and effective way of solving the problem of plant disease detection with results comparable to [6], where the authors deal with plant diseases of multiple crops. With fewer resource requirements and minimal data, the model gives results comparable to traditional state of the art techniques.

VI. CONCLUSION AND FUTURE WORK

The agricultural sector is still one of the most important sectors on which the majority of the Indian population relies. Detection of diseases in these crops is hence critical to the growth of the economy. Tomato is one of the staple crops which is produced in large quantities. Hence, this paper aims at detection and identification of 10 different diseases in the tomato crop. The proposed methodology uses a convolutional neural network model to classify tomato leaf diseases obtained from the Plant Village dataset. The architecture used is a simple convolutional neural network with a minimum number of layers to classify the tomato leaf diseases into 10 different classes. As a part of the future work, different learning rates and optimizers could also be used for experimenting with the proposed model. It could also include experimentation with newer architectures for improving the performance of the model on the train set. Thus, the above mentioned model can be used as a decision tool to help and support farmers in identifying the diseases that can be found in the tomato plant. With an accuracy of 94-95%, the proposed methodology can make an accurate detection of the leaf diseases with little computational effort.

REFERENCES

[1] Jihen Amara, Bassem Bouaziz, Alsayed Algergawy, et al. “A Deep Learning-based Approach for Banana Leaf Diseases Classification”. In: BTW (Workshops). 2017, pp. 79–88.
[2] Hui-Ling Chen et al. “Support vector machine based diagnostic system for breast cancer using swarm intelligence”. In: Journal of medical systems 36.4 (2012), pp. 2505–2519.
[3] S. D. Khirade and A. B. Patil. “Plant Disease Detection Using Image Processing”. In: 2015 International Conference on Computing Communication Control and Automation. Feb. 2015, pp. 768–771. DOI: 10.1109/ICCUBEA.2015.153.
[4] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. “Imagenet classification with deep convolutional neural networks”. In: Advances in neural information processing systems. 2012, pp. 1097–1105.
[5] Yann LeCun et al. “Backpropagation applied to handwritten zip code recognition”. In: Neural computation 1.4 (1989), pp. 541–551.
[6] Sharada P Mohanty, David P Hughes, and Marcel Salathé. “Using deep learning for image-based plant disease detection”. In: Frontiers in plant science 7 (2016), p. 1419.
[7] Usama Mokhtar et al. “SVM-based detection of tomato leaves diseases”. In: Intelligent Systems’ 2014. Springer, 2015, pp. 641–652.
[8] Usama Mokhtar et al. “Tomato leaves diseases detection approach based on support vector machines”. In: Computer Engineering Conference (ICENCO), 2015 11th International. IEEE. 2015, pp. 246–250.
[9] H Sabrol and K Satish. “Tomato plant disease classification in digital images using classification tree”. In: Communication and Signal Processing (ICCSP), 2016 International Conference on. IEEE. 2016, pp. 1242–1246.
[10] Christian Szegedy et al. “Going deeper with convolutions”. In: CVPR. 2015.
