ieee honey3
Suganya E
Research Scholar
Anna University, Chennai
[email protected]

Sankarananth S
Department of Electrical and Electronics Engineering
Excel College of Engineering and Technology
Tamilnadu, India
[email protected]
Abstract— The honey bee is a fascinating insect that relies on collective behavior to accomplish complex tasks. Protecting honey bees is an important responsibility for preserving the ecological balance. Electronically tracking and identifying the various species of bees over their life span is tedious work. Automated classification of species is important for protecting the various species of honey bees from danger. The diseases that affect honey bees during their life span have to be detected autonomously, and their spread to healthy honey bees has to be prevented. The proposed technique aims to classify the species of honey bees and to identify the diseases to which they are prone. A convolutional neural network with two-dimensional layers is used as the classifier in the proposed model. The Synthetic Minority Over-sampling Technique (SMOTE) is used for data augmentation. More than 5000 images of honey bees with many features are used for learning. The proposed methodology attained an accuracy of 86% for subspecies classification and 84% for bee health identification.

Keywords— Classification, Convolutional Neural Network, Synthetic Minority Over-sampling Technique, Beehive Monitoring System, Rectified Linear Unit, Visual cortex features

I. INTRODUCTION

Just as bacteria are essential in day-to-day life, honey bees help maintain the ecological balance and are among the most important organisms in ecology. Without honey bees, the plants that depend on their pollination would eventually die out. The number of honey bees is gradually decreasing due to the growing effects of global warming, modern agriculture and attacks by various parasites. This in turn prevents fruits and flowers from ripening. Recently, many advances have been made to improve the sustainability of honey bees. With the evolution of computing technologies and electronic devices, electronic monitoring of bee hives and collection of data on bee health have become possible. Sensors and other devices are used in the field of ecoacoustics to detect environments that are unfriendly to bees. Forager traffic is one of the useful variables for tracking the availability of food, the age of the bee colony and the impact of pesticides [1]. It also helps in evaluating the health of the honey bees [2]. Measuring forager activity requires real-time monitoring of bee hives, pest detection and attention to other hive management issues. Rapid changes in forager activity may indicate a sudden alteration at the colony level. To measure forager activity, bees entering and exiting the hive are closely monitored for a certain time span, and the related data is gathered without human monitoring. Manual monitoring of honey bees by humans is accurate but highly laborious, so an automated Beehive Monitoring System is preferable. Such a system collects all the needed data without disturbing the usual behavior of the bees [3]. Data in the form of audio, video and hive temperature is gathered at constant intervals by EBM.

In the proposed methodology, convolutional neural networks are used to classify the species of honey bees and to correctly identify the diseases to which they are prone. ConvNets, one of the standard machine learning procedures, are used for recognizing two-dimensional patterns. They have a special network architecture with layers of sampling and convolution. A two-layered convolutional network model is used in the proposed system. A data balancing procedure is applied to distribute the data equally among all categories. The data balancing procedure used in the proposed methodology is the Synthetic Minority Over-sampling Technique (SMOTE). SMOTE uses a synthetic sampling procedure to increase the number of data samples in the minority subsets. Convolutional networks use visual cortex features for classification. Before the images are input to the classifier, an image augmentation stage is applied. The Rectified Linear Unit (ReLU) activation function is used in the augmentation phase. The dataset consists of over 5000 images of worker bees from a bee hive, with attributes such as pollen-carrying status, name of the subspecies and health condition, along with the time and location. The species of honey bees in the dataset are Russian bee, Italian bee, Carniolan bee, western honey bee, mixed local stock, VSH Italian bee and some other unknown species. The proposed methodology attained an accuracy of 86% for subspecies classification and 84% for bee health identification. The rest of the paper is organized as follows: Section 2 describes related research in the proposed area, Section 3 presents the proposed method in detail, Section 4 discusses the training and testing results with accuracy and finally Section 5 concludes about the
Authorized licensed use limited to: Univ of Calif Santa Barbara. Downloaded on July 01,2021 at 02:30:01 UTC from IEEE Xplore. Restrictions apply.
Fig. 2(b). Raw dataset distribution based on bee health

In general, the performance of the model and the relation between accuracy and loss are evaluated using the Receiver Operating Characteristics (ROC) curve. A class that has more samples has to be under-sampled, whereas a class that has fewer samples has to be over-sampled. In our dataset, the Italian honey bee and healthy bee classes, which hold most of the samples, have to be under-sampled and the other classes have to be over-sampled. The performance measures depend on the confusion matrix, which provides the fundamental values from which accuracy, recall, precision and f1-score are derived. The Synthetic Minority Over-sampling Technique (SMOTE) is applied to increase the samples in the minority subsets using an over-sampling strategy. SMOTE is chosen for balancing the dataset because most of the classes have few samples. The number of samples is increased by producing synthetic samples. The nearest neighbors are used to produce new samples, where the value of k is chosen randomly. The balanced datasets are obtained by processing the data with the SMOTE algorithm available in the imbalanced-learn package for Python. The balanced datasets are shown in Fig 3(a) and Fig 3(b).

Fig. 3(a) Balanced dataset for subspecies
Fig. 3(b) Balanced dataset distribution for bee health

B. Image augmentation

Image augmentation techniques transform images by zooming, flipping, rotating and shrinking. Augmentation can be done either before training or during training. The former results in data loss, whereas the latter reduces the loss. Here, the augmentation is done during training, where the network is provided with two images as input. The input image produces another image that is masked with a layer, namely the augmented layer, which is given as input to the classification model. This results in two losses, one at the augmentation phase and the other at the classification phase. Both losses are summed to compute the total loss. The Rectified Linear Unit (ReLU) activation function is used in the augmented network. In addition, general flipping, rotating and zooming operations are performed before feeding the image into the augmented network.

C. Convolution Neural Network Based Classification

Convolutional neural networks, a special kind of neural network, use visual cortex features for classification. The image is converted to pixel values, represented as a matrix or array over the RGB channels, that range between 0 and 255. These values denote the pixel intensity. Initially, the image pixels pass through the CNN, which scans them from left to right and top to bottom. In the first step, a filter is defined for the convolution. At each position, the original pixel values are multiplied by the filter values and added together to give a single value. A matrix is produced by making the filter scan through all the values in the image. The output of the first layer is the input to the successive layers, in a feed-forward manner. The ReLU activation function is applied in the nonlinear layer. The sampling operations are performed using 2D max pooling followed by ReLU. Finally, all the layers are connected to the output layer, which produces a vector with n dimensions. During training, the number of iterations is specified as epochs. The weights are saved once training is completed. Validation is done using the test data to evaluate the performance of the model. The architecture of the CNN model used for classification is given in Fig 4. The CNN layers for subspecies classification are shown in Fig 5(a) and Fig 5(b) respectively. Layers 1 and 2 of the CNN obtained during bee health classification are shown in Fig 6(a) and Fig 6(b) respectively.
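The scan-multiply-add operation, the ReLU nonlinearity and the 2D max pooling step described in this subsection can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the model used in the paper: the 6x6 input, the 3x3 filter values and the function names `conv2d_valid`, `relu` and `max_pool2d` are assumptions made for the example.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Scan the filter left to right, top to bottom; at each position
    multiply the covered pixels by the filter and sum to a single value."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

def relu(x):
    """Non-saturating nonlinearity: negative values become zero."""
    return np.maximum(0, x)

def max_pool2d(x, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    h, w = x.shape[0] // size, x.shape[1] // size
    trimmed = x[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))

# Hypothetical single-channel 6x6 image (values 0..35) and a 3x3 filter
# that responds to intensity increasing from left to right.
image = np.arange(36, dtype=float).reshape(6, 6)
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

feature_map = conv2d_valid(image, kernel)   # 4x4 matrix of filter responses
activated = relu(feature_map)
pooled = max_pool2d(activated, size=2)      # 2x2 after pooling
print(feature_map.shape, pooled.shape)
```

A real RGB input would repeat the same operation per channel and sum the results, and a framework such as Keras would learn the filter values during training rather than fixing them by hand.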
Fig. 4 Architecture of CNN for bee species and bee health identification
(2)
This type of activation is termed a saturating nonlinearity, which is less efficient to train than a non-saturating nonlinearity. The non-saturating nonlinearity can be represented as

$$f(y) = \max(0, y) \tag{3}$$

Fig. 5(a). CNN layer 0 Bee Subspecies

ReLU is designed so that it does not saturate for positive inputs. Hence, the model takes considerably less time to train. Conventional models use saturating functions, which increase the training time. Here, the ReLU activation function is used to overcome the limitations of the sigmoid and tangent functions of traditional neural networks. The CNN is trained using stochastic gradient descent, with the errors backpropagated through the network. Backpropagation in deep learning models relies on large labeled datasets. ReLU is linear for positive inputs, which keeps it as simple to handle as a linear function while still providing the nonlinearity the model needs.
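The difference between saturating and non-saturating nonlinearities can be illustrated by comparing gradients. The sketch below assumes the saturating function is the logistic sigmoid (the text names the sigmoid and tangent functions); the sample inputs are arbitrary and chosen only for illustration.

```python
import numpy as np

def sigmoid(y):
    return 1.0 / (1.0 + np.exp(-y))

def sigmoid_grad(y):
    # Derivative of the sigmoid; it shrinks toward 0 as |y| grows (saturation).
    s = sigmoid(y)
    return s * (1.0 - s)

def relu(y):
    # f(y) = max(0, y)
    return np.maximum(0.0, y)

def relu_grad(y):
    # Gradient is exactly 1 for positive inputs and 0 otherwise.
    return (y > 0).astype(float)

y = np.array([-10.0, -1.0, 0.5, 10.0])  # arbitrary pre-activation values
print("sigmoid gradient:", sigmoid_grad(y))
print("ReLU gradient:   ", relu_grad(y))
```

For large positive inputs the sigmoid gradient is close to zero, so weight updates propagated through it become very small, whereas the ReLU gradient stays at 1. This is the usual explanation for the faster training observed with ReLU.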
E. Softmax Activation Function

The deep learning model classifies the health and subspecies of honey bees using a generalization of the logistic function; hence the softmax function is applied for multi-class classification. The outputs of the softmax activation function sum to 1. Maximum likelihood estimates can be obtained by using softmax together with log loss. The class frequencies are taken into account so that the correct output receives a high probability value. The main reason for using softmax is that the probability is distributed over all the output nodes. For binary classification softmax offers no improvement over the logistic function, but for multi-class classification it is the natural choice for obtaining class probabilities. The softmax activation function can be written mathematically as

$$x_i = \frac{e^{y_i}}{\sum_{j=1}^{n} e^{y_j}} \tag{4}$$

where $y_i$ is the input to output node $i$, $x_i$ is the resulting probability and $n$ is the number of classes.
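A minimal NumPy sketch of the softmax function described in this subsection is given below. The score vector is hypothetical (seven values, one per subspecies class), and subtracting the maximum before exponentiating is a standard numerical-stability step that does not change the result.

```python
import numpy as np

def softmax(y):
    """Map raw scores to probabilities that are positive and sum to 1."""
    shifted = y - np.max(y)        # stability shift; cancels in the ratio
    e = np.exp(shifted)
    return e / e.sum()

# Hypothetical raw scores for seven subspecies classes.
scores = np.array([2.0, 1.0, 0.1, -1.0, 0.5, 0.0, 3.0])
probs = softmax(scores)
print(probs.sum(), probs.argmax())

# With two classes, softmax reduces to the logistic function,
# which is why it brings no gain for binary classification.
binary = softmax(np.array([1.5, 0.0]))
logistic = 1.0 / (1.0 + np.exp(-1.5))
print(binary[0], logistic)
```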
Fig. 5(b). CNN Layer 2 Bee Subspecies
Fig. 6(a). CNN Layer 0 Bee Health

F. Dataset Description

The dataset consists of over 5000 images of worker bees from a bee hive, with attributes such as pollen-carrying status, name of the subspecies and health condition, along with the time and location. The species of honey bees in the dataset are Russian bee, Italian bee, Carniolan bee, western honey bee, mixed local stock, VSH Italian bee and some other unknown species. The location attribute covers different locations in the United States of America (USA). The health attribute includes healthy bees, missing queen, hives affected by ants, robbed hives and bees affected by Varroa, a parasitic mite that damages bee health. Fig 7(a) and Fig 7(b) show sample images from the dataset based on subspecies and health condition respectively. The dataset is found to be highly unbalanced; hence a split-balance mechanism is applied to balance it, which in turn mitigates the overfitting problem. The stratified sampling technique is applied to keep the class proportions consistent between the training and testing datasets. The image dataset is augmented using the Image Data Generator class of Keras in Python. Then, the dataset is trained using Convolution Neural Networks (CNN) for both bee subspecies and bee health.
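Stratified sampling can be sketched as follows: each class is shuffled and split separately, so the training and testing sets keep the same class proportions. This is an illustrative NumPy version, not the code used in the paper; the function name `stratified_split`, the class counts and the seed are assumptions, while the 25% test fraction follows the split reported in the results section.

```python
import numpy as np

def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx) with each class split by test_fraction."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    train_idx, test_idx = [], []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)   # positions of this class
        rng.shuffle(idx)                      # shuffle within the class
        n_test = int(round(len(idx) * test_fraction))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return np.array(train_idx), np.array(test_idx)

# Hypothetical, deliberately unbalanced health labels.
labels = np.array(["healthy"] * 80 + ["varroa"] * 16 + ["missing queen"] * 4)
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Splitting per class guarantees that even the smallest class is represented in the test set, which a purely random split of an unbalanced dataset does not.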
Fig. 6(b). CNN layer 2 Bee health
Fig. 7(b). Data samples of healthy and unhealthy bees
IV. RESULT AND DISCUSSION
A CNN with 2 layers is used to learn and validate the image dataset, which consists of over 5000 images. The data is balanced to reduce overfitting, and augmentation in the form of image transformations is performed. The data is split into 75% for learning and 25% for validation. A proper distribution of the training dataset is ensured, as shown in the dataset distribution. The performance of the model is validated using the false negative, true positive, false positive and true negative counts. Using the 25% of testing data, the loss and accuracy of the model are validated for both subspecies of bees and health of bees. The CNN is built with two layers using the Keras framework. The kernel size is set to 3 and the Rectified Linear Unit (ReLU) activation function is applied. This mitigates vanishing-gradient issues, and the model completes training with low computational cost and training time. The dense layer and 2D max pooling are applied along with the softmax activation function, which provides normalized output values. Categorical cross entropy is used to compute the loss, and the accuracy did not improve after 20 epochs. The accuracy and loss curves for bee subspecies and bee health are depicted in Fig 8(a) and Fig 8(b) respectively. The accuracy of classifying the different subspecies and healthy bees is plotted as bar graphs in Fig 9(a) and Fig 9(b) respectively for better visualization.

Fig. 9(a). Accuracy for bee subspecies
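The way precision, recall, f1-score and accuracy follow from the true/false positive and negative counts can be sketched with a small confusion matrix. The 3-class matrix below is hypothetical and is not taken from the paper's results; rows are assumed to be true classes and columns predicted classes.

```python
import numpy as np

def per_class_metrics(cm):
    """Derive precision, recall, f1 per class and overall accuracy
    from a confusion matrix (rows: true class, columns: predicted class)."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)                 # correctly predicted samples per class
    fp = cm.sum(axis=0) - tp         # predicted as the class but wrong
    fn = cm.sum(axis=1) - tp         # belonging to the class but missed
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / cm.sum()
    return precision, recall, f1, accuracy

# Hypothetical counts; every class here has at least one prediction,
# so no division by zero occurs.
cm = [[50, 5, 0],
      [10, 30, 0],
      [0, 0, 5]]
precision, recall, f1, accuracy = per_class_metrics(cm)
print(precision.round(2), recall.round(2), f1.round(2), accuracy)
```

The f1-score is the harmonic mean of precision and recall, so a class with low precision but high recall ends up with an intermediate f1 value.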
TABLE 1 PERFORMANCE MEASURES FOR SUBSPECIES CLASSIFICATION

Subspecies               Precision   Recall   f1-Score
Unknown species          0.89        0.89     0.86
Mixed local stock        0.46        0.93     0.62
Carniolan honey bee      0.97        0.97     0.97
Italian honey bee        0.97        0.82     0.89
Russian honey bee        0.99        0.97     0.98
VSH Italian honey bee    0.66        0.93     0.77
Western honey bee        1           1        1
Loss Function            0.3312

and harmonic analysis of buzzing signals.” Engineering Letters, vol. 24, no. 3, 2016.

[5]. V. A. Kulyukin, “In situ omnidirectional vision-based bee counting using 1d haar wavelet spikes,” in Proceedings of the International MultiConference of Engineers and Computer Scientists, vol. 1, 2017.

[6]. Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, Nov 1998.

[7]. S. Sountharrajan, M. Karthiga, E. Suganya, and C. Rajan, “Automatic classification on bio medical prognosis of invasive breast cancer,” Asian Pacific Journal of Cancer Prevention: APJCP, vol. 18, no. 9, p. 2541, 2017.