CROP YIELD PREDICTION USING DEEP

LEARNING (RNN, LSTM, FEEDFORWARD)

MAJOR PROJECT REPORT

Submitted in partial fulfillment of the requirements for the award of the degree
of

BACHELOR OF TECHNOLOGY
in

INFORMATION TECHNOLOGY
by

Harshit Nimesh Ayush Singh Abhinav Shukla


35196307721 01096303120 00296303120

Guided by

Dr. Sunesh Malik

(HOD, IT DEPARTMENT)

DEPARTMENT OF INFORMATION TECHNOLOGY
MAHARAJA SURAJMAL INSTITUTE OF TECHNOLOGY
(AFFILIATED TO GURU GOBIND SINGH INDRAPRASTHA UNIVERSITY, DELHI)
DELHI – 110058

APRIL 2023
CANDIDATE’S DECLARATION

It is hereby certified that the work which is being presented in the B.Tech Major
Project report entitled Crop Yield Prediction using Deep Learning (RNN, LSTM,
FEEDFORWARD), in partial fulfillment of the requirements for the award of the
degree of Bachelor of Technology, submitted to the Department of Information
Technology, Maharaja Surajmal Institute of Technology, New Delhi (Affiliated to
Guru Gobind Singh Indraprastha University, New Delhi), is an authentic record of our
work carried out during the period from January 2024 to April 2024 under the
guidance of Dr. Sunesh Malik.

The matter presented in the B.Tech Major Project Report has not been submitted by
us for the award of any other degree of this or any other institute.

Abhinav Shukla Ayush Singh Harshit Nimesh


(00296303120) (01096303120) (35196307721)

This is to certify that the above statement made by the candidates is correct to the best
of my knowledge. They are permitted to appear in the External Major Project
Examination.

Dr. Sunesh Malik


(HOD, IT Department)

The B.Tech Major Project Viva-Voce Examination of Abhinav Shukla


(00296303120), Harshit Nimesh (35196307721) and Ayush Singh (01096303120)
has been held on 25.04.2024.

Dr. Sitender Malik


(Project Coordinator) (Signature of External
Examiner)

ACKNOWLEDGEMENT

We express our deep gratitude to Dr. Sunesh Malik, Head of Department,
Department of Information Technology, for her valuable guidance and suggestions
throughout our project work. We are thankful to Dr. Sitender Malik, Project
Coordinator, for his valuable guidance.

We would like to extend our sincere thanks to the Head of Department, Dr. Sunesh
Malik, for her time-to-time suggestions to complete our project work. We are also
thankful to Prof. Archana Balyan, Director of MSIT, for providing us with the
facilities to carry out our project work.

Abhinav Shukla Ayush Singh Harshit Nimesh


(00296303120) (01096303120) (35196307721)

ABSTRACT

The ability to accurately estimate crop yields has emerged as a significant topic of
study in recent years. Farmers who have access to data indicating which crops will
produce the highest returns on their investments benefit greatly from this trend.
Accurate predictions of crop yield help farmers plan better for the harvest and reduce
the financial losses caused by adverse weather. The goal of the proposed study is to
use a neural network model to estimate crop yield given a set of suitable crop factors
such as minimum and maximum temperatures, humidity, wind speed, and pressure.
This study establishes a model for predicting agricultural yields using a combination
of a feedforward neural network (FNN) and a recurrent neural network (RNN). RMSE
(root mean square error) and loss were used to assess the efficacy of the various
neural network models.

TABLE OF CONTENTS

Candidate’s declaration……………………………………………………………….…I
Acknowledgement……………………………………………………………………… II
Abstract……………………………………………...……………………………… III
Table of contents……………………………………………………………………..VI
List of figures…………………………………………………………………………. VII
Chapter 1. Introduction……………………………………………...………………....1
	1.1 What is Machine Learning……………………………………………...1
	1.2 Deep Learning…………………………………………………………...2
	1.3 Deep Learning vs. Machine Learning…………………………………...3
	1.4 How Deep Learning Works……………………………………………...3
​ 1.5 Inferring……………………………………………………………….. 4
​ 1.6 ML algorithms and where they are used……...………………………5
Chapter 2. Literature review…………………………………………………………… 6
	2.1 Research gap…………………………………………………………….9
	2.2 Objective………………………………………………………………..10
Chapter 3. Tools & Technology………………………………………………………11
	3.1 System Requirements……………………………………………….11
	3.2 Hardware Requirements…………………………………………….11
	3.3 Software Requirements……………………………………………..11
​ 3.4 Software Environment………………………………………………12
​ 3.5 Python……………..………………………………………………...12
​ 3.6 History of python ……..…...……………………………………….12
​ 3.7 Python Features……..………………………………………………13
​ 3.8 Easy to learn …...…………………………………………………...13
​ 3.9 Easy to Read ...……………………………………………………...13
​ 3.10 BSL……...………………………………………………………...13
​ 3.11 Interactive Mode..………………………………………….……...13
​ 3.12 Portable……………………………..……………………….…….13
​ 3.13 Extendable…..……………………………………………….……13
​ 3.14 Database…………………………………………………………...13

	3.15 GUI Programming…………………………………………………13
	3.16 Scalable……………………………………………………………13
	3.17 Getting Python……………………………………………………..14
	3.18 First Python Program………………………………………………14
	3.19 Interactive Mode Programming……………………………………14
	3.20 Script Mode Programming………………………………………... 15
	3.21 Flask Framework…………………………………………………...17
​ 3.22 What is Python……………………………………………………. 20
​ 3.23 What can Python do………………………………………………. 20
​ 3.24 Why Python………………………………………………………...21
​ 3.25 Python Install……………………………………………………….21
​ 3.26 Tools Used ………………………………………………………25
​ 3.27 Introduction………………………………………………………... 25
Chapter 4. Methodology……………………………………………………………...32
​ 4.1 System Design……………………………………………………… 32
​ 4.1.1 System Architecture……………………………………… 32
​ 4.1.2 Data Flow Diagram………………………………………. 32
​ 4.2 UML Diagrams……………………………………………………... 34
​ 4.2.1 Goals……………………………………………………...34
​ 4.2.2 Use Case Diagram………………………………………. 34
​ 4.2.3 Class Diagram…………………………………………… 36
​ 4.2.4 Sequence Diagram………………………………………..37
​ 4.3 Activity Diagram…………………………………………………… 38
Chapter 5 Implementation And Result………………………………………………39
​ 5.1 Modules……………………………………………………………...39
​ 5.2 Modules Description…………………………...……………………39
​ 5.3 Input Design And Output design……………………..……………..42
​ 5.4 Input Design…………………………………………...…………….42
​ 5.5 Objectives……………………………………………...…………….42
​ 5.6 Output Design…………………………………………...…………...43
​ 5.7 System Study……………………………………………...…………44

	5.8 Feasibility Study…………………………………………………….44
	5.9 Economical Feasibility……………………………………………... 44
	5.10 Technical Feasibility……………………………………………….44
	5.11 Social Feasibility………………………………………………….. 45
	5.12 System Testing……………………………………………………..45
	5.13 Types of Tests……………………………………………………... 45
		5.13.1 Unit Testing…………………………………………….. 47
		5.13.2 Integration Testing………………………………………47
		5.13.3 Acceptance Testing……………………………………...48
Chapter 6. Result……………………………………………………………………… 49
Chapter 7. Conclusion And Limitations……………………………………………...65
Chapter 8. Future Scope……………………………………………………………... 67
References……………………………………………………………………………70

LIST OF FIGURES

Fig 1 Traditional Programming…………………………………………………… 2


Fig 2 Machine Learning……………………………………………………………2
Fig 3 Learning Phase of ML………………………………………………………. 3
Fig 4 Inference from Model……………………………………………………….. 4
Fig 5 Machine Learning Algorithms………………………………………………..5
Fig 6 VS Code Logo……………………………………………………………… .25
Fig 7 Lasso Regression…………………………………………………………….38
Fig 8 Linear Regression…………………………………………………………... 41
Fig 9 Model Illustrating steps…………………………………………………….. 33
Fig 10 Testing and Training ……………………………………………………… 35
Fig 11 System Architecture………………………………………………………..36
Fig 12 Flow Diagram……………………………………………………………..37
Fig 13 Use Case Diagram………………………………………………………….38
Fig 14 Sequence Diagram…………………………………………………………40
Fig 15 Activity Diagram…………………………………………………………..42
Fig 16 Front Page………………………………………………………………….64
Fig 17 Login Page…………………………………………………………………65
Fig 18 Password and Email………………………………………………………..66
Fig 19 Upload Section……………………………………………………………..67
Fig 20 Sheet Uploaded…………………………………………………………….68
Fig 21 Training Section…………………………………………………………...69
Fig 22 Trained System……………………………………………………………70
Fig 23 Testing System…………………………………………………………….71
Fig 24 Tested System……………………………………………………………...71
Fig 25 Pie Chart Analysis…………………………………………………………72

CHAPTER 1 INTRODUCTION

1.1 What is Machine Learning?


Machine Learning is a system of computer algorithms that can learn from examples through
self-improvement without being explicitly coded by a programmer. Machine learning is a
part of artificial intelligence which combines data with statistical tools to predict an output
that can be used to generate actionable insights.

The breakthrough comes with the idea that a machine can learn from the data (i.e.,
examples) on its own to produce accurate results. Machine learning is closely related to data
mining and Bayesian predictive modeling. The machine receives data as input and uses an
algorithm to formulate answers.

A typical machine learning task is to provide a recommendation. For those who have a
Netflix account, all recommendations of movies or series are based on the user's historical
data. Tech companies use unsupervised learning to improve the user experience with
personalized recommendations.

Machine learning is also used for a variety of tasks such as fraud detection, predictive
maintenance, portfolio optimization, task automation and so on.

1.2 Deep Learning


● Deep learning is a subset of machine learning that uses multi-layered neural
networks, called deep neural networks, to simulate the complex decision-making power of
the human brain. Some form of deep learning powers most of the artificial intelligence (AI)
applications in our lives today. A deep neural network, or DNN, is a neural network with
three or more layers; in practice, most DNNs have many more layers. DNNs are trained on
large amounts of data to identify and classify phenomena, recognize patterns and
relationships, evaluate possibilities, and make predictions and decisions. While a
single-layer neural network can make useful, approximate predictions and decisions, the
additional layers in a deep neural network help refine and optimize those outcomes for
greater accuracy.
● Deep learning drives many applications and services that improve automation,
performing analytical and physical tasks without human intervention. It lies behind
everyday products and services—e.g., digital assistants, voice-enabled TV remotes, credit
card fraud detection—as well as still emerging technologies such as self-driving cars and
generative AI.

1.3 Deep Learning vs. Machine Learning

● If deep learning is a subset of machine learning, how do they differ? Deep learning
distinguishes itself from classical machine learning by the type of data that it works with and
the methods by which it learns.

● Machine learning algorithms leverage structured, labeled data to make


predictions—meaning that specific features are defined from the input data for the model
and organized into tables. This doesn’t necessarily mean that it doesn’t use unstructured
data; it just means that if it does, it generally goes through some pre-processing to organize
it into a structured format.

● Deep learning eliminates some of the data pre-processing that is typically involved with
machine learning. These algorithms can ingest and process unstructured data, like text and
images, and automate feature extraction, removing some of the dependency on human
experts. For example, let’s say that we had a set of photos of different pets, and we wanted
to categorize by “cat”, “dog”, “hamster”, et cetera. Deep learning algorithms can determine
which features (e.g. ears) are most important to distinguish each animal from another. In
machine learning, this hierarchy of features is established manually by a human expert.

● Then, through the processes of gradient descent and backpropagation, the deep
learning algorithm adjusts and fits itself for accuracy, allowing it to make predictions about
a new photo of an animal with increased precision.
● Machine learning and deep learning models are capable of different types of learning
as well, which are usually categorized as supervised learning, unsupervised learning, and
reinforcement learning.
● Supervised learning utilizes labeled datasets to categorize or make predictions; this
requires some kind of human intervention to label input data correctly. In contrast,
unsupervised learning doesn’t require labeled datasets, and instead, it detects patterns in the
data, clustering them by any distinguishing characteristics. Reinforcement learning is a
process in which a model learns to become more accurate for performing an action in an
environment based on feedback in order to maximize the reward.

1.4 How Deep Learning Works


● Deep learning neural networks, or artificial neural networks, attempt to mimic the
human brain through a combination of data inputs, weights, and bias. These elements work
together to accurately recognize, classify, and describe objects within the data.
● Deep neural networks consist of multiple layers of interconnected nodes, each
building upon the previous layer to refine and optimize the prediction or categorization. This
progression of computations through the network is called forward propagation. The input
and output layers of a deep neural network are called visible layers. The input layer is where
the deep learning model ingests the data for processing, and the output layer is where the
final prediction or classification is made.
● Another process called backpropagation uses algorithms, like gradient descent, to
calculate errors in predictions and then adjusts the weights and biases of the function by
moving backwards through the layers in an effort to train the model. Together, forward
propagation and backpropagation allow a neural network to make predictions and correct for
any errors accordingly. Over time, the algorithm becomes gradually more accurate.
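As a minimal, illustrative sketch of the forward propagation and backpropagation loop described above (assuming TensorFlow/Keras is available; the synthetic data, layer sizes and epoch count are assumptions for illustration, not this project's actual model):

import numpy as np
import tensorflow as tf

# Synthetic data: 4 input features -> 1 continuous target (illustrative only).
X = np.random.rand(200, 4).astype("float32")
y = X @ np.array([2.0, -1.0, 0.5, 3.0], dtype="float32") + 0.1 * np.random.rand(200).astype("float32")

# A small feedforward network: input layer -> one hidden layer -> output layer.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(4,)),  # hidden layer
    tf.keras.layers.Dense(1),                                        # output (visible) layer
])

# Adam is a gradient-descent-based optimizer; mean squared error is the loss to minimize.
model.compile(optimizer="adam", loss="mse")

# fit() runs forward propagation, measures the error, and backpropagates gradients
# to adjust the weights and biases over several passes through the data.
model.fit(X, y, epochs=5, verbose=0)

# predict() performs forward propagation only, on new inputs.
print(model.predict(X[:3], verbose=0))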

The above describes the simplest type of deep neural network in the simplest terms.
However, deep learning algorithms are incredibly complex, and there are different types of
neural networks to address specific problems or datasets. For example,

Convolutional neural networks (CNNs), used primarily in computer vision and image
classification applications, can detect features and patterns within an image, enabling tasks,
like object detection or recognition. In 2015, a CNN bested a human in an object recognition
challenge for the first time.

Recurrent neural network (RNNs) are typically used in natural language and speech
recognition applications as it leverages sequential or times series data.

Deep learning applications


Real-world deep learning applications are a part of our daily lives, but in most cases, they
are so well-integrated into products and services that users are unaware of the complex data
processing that is taking place in the background. Some of these examples include the
following:

● Law enforcement :- Deep learning algorithms can analyze and learn from
transactional data to identify dangerous patterns that indicate possible fraudulent or criminal
activity. Speech recognition, computer vision, and other deep learning applications can
improve the efficiency and effectiveness of investigative analysis by extracting patterns and
evidence from sound and video recordings, images, and documents, which helps law
enforcement analyze large amounts of data more quickly and accurately.

● Financial services :- Financial institutions regularly use predictive analytics to


drive algorithmic trading of stocks, assess business risks for loan approvals, detect fraud,
and help manage credit and investment portfolios for clients.

● Customer service :- Many organizations incorporate deep learning technology into


their customer service processes. Chatbots—used in a variety of applications, services, and
customer service portals—are a straightforward form of AI. Traditional chatbots use natural
language and even visual recognition, commonly found in call center-like menus. However,
more sophisticated chatbot solutions attempt to determine, through learning, if there are
multiple responses to ambiguous questions. Based on the responses it receives, the chatbot
then tries to answer these questions directly or route the conversation to a human user.
● Virtual assistants like Apple's Siri, Amazon Alexa, or Google Assistant extend the
idea of a chatbot by enabling speech recognition functionality. This creates a new method to
engage users in a personalized way.

● Healthcare :- The healthcare industry has benefited greatly from deep learning
capabilities ever since the digitization of hospital records and images. Image recognition
applications can support medical imaging specialists and radiologists, helping them analyze
and assess more images in less time.

● Learning and inference :- The core objective of machine learning is learning and
inference. First of all, the machine learns through the discovery of patterns. This discovery
is made thanks to the data. One crucial task of the data scientist is to choose carefully which
data to provide to the machine. The list of attributes used to solve a problem is called a
feature vector. You can think of a feature vector as a subset of data that is used to tackle a
problem.

For instance, suppose the machine is trying to understand the relationship between an
individual's wage and the likelihood of going to a fancy restaurant. It turns out the machine
finds a positive relationship between wage and going to a high-end restaurant.

1.5 Inferring

When the model is built, it is possible to test how powerful it is on never-seen-before data.
The new data are transformed into a feature vector, passed through the model, and a
prediction is given. This is the beautiful part of machine learning: there is no need to update
the rules or retrain the model. You can use the previously trained model to make inferences
on new data.

Fig 4 Inference from Model

The life of Machine Learning programs is straightforward and can be summarized in the
following points:

1. Define a question
2. Collect data
3. Visualize data
4. Train the algorithm
5. Test the algorithm
6. Collect feedback
7. Refine the algorithm
8. Loop steps 4-7 until the results are satisfying
9. Use the model to make a prediction

Once the algorithm gets good at drawing the right conclusions, it applies that knowledge to
new sets of data.

1.6 Machine Learning Algorithms and where they are used

1.6.1 Convolutional Neural Networks (CNNs) :- Convolutional Neural Networks (CNNs)
are a class of deep neural networks, most commonly applied to analyzing visual imagery.
They have revolutionized various fields such as computer vision, image recognition, and
even natural language processing.

The key components and concepts of CNNs are as follows:

● Convolutional Layer: This is the core building block of a CNN. In this layer, a set
of learnable filters (also called kernels) slide over the input image, computing dot products
between the filter and localized portions of the input. This operation enables the network to
detect features like edges, textures, and patterns at different spatial locations in the image.

● Pooling Layer: After the convolutional layer, pooling layers are often added to
progressively reduce the spatial size of the representation, reducing the amount of
parameters and computation in the network. Max pooling is a common pooling technique
where the maximum value within a region is selected.

● Activation Function: Typically, CNNs use activation functions like ReLU


(Rectified Linear Unit) to introduce non-linearity into the network, allowing it to learn
complex patterns.

● Fully Connected Layer: After several convolutional and pooling layers, the
high-level reasoning in the neural network is done via fully connected layers. These layers
take the high-level features from the convolutional layers and use them to classify the input
into various classes.

● Softmax Layer: In classification tasks, the softmax layer is often used as the output
layer. It converts the raw scores (output of the previous layer) into probability values,
indicating the likelihood of each class.

● Training: CNNs are typically trained using supervised learning, where the network
learns to map input images to their respective class labels by minimizing a loss function,
such as cross-entropy loss. This is done through backpropagation and optimization
algorithms like Stochastic Gradient Descent (SGD) or Adam.

● Transfer Learning: Due to the computational resources required to train deep
CNNs from scratch, transfer learning is a common practice. Pre-trained CNN models, like
VGG, ResNet, or Inception, which have been trained on large datasets like ImageNet, are
fine-tuned on specific tasks with smaller datasets.

● Data Augmentation: To improve generalization and prevent overfitting, data


augmentation techniques are often employed. These techniques involve generating new
training samples through transformations like rotations, flips, and shifts on the original
dataset.

● Applications: CNNs have a wide range of applications including image


classification, object detection, facial recognition, medical image analysis, autonomous
vehicles, and more. They have achieved state-of-the-art performance in many of these tasks.

● Overall, CNNs have significantly advanced the field of computer vision and
continue to be a cornerstone of research and applications in artificial intelligence.
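To illustrate how the components above fit together, a minimal CNN classifier can be sketched in Keras as follows; the input shape, number of classes and layer sizes are illustrative assumptions rather than values used in this project:

from tensorflow.keras import layers, models

# A small CNN for 28x28 grayscale images with 10 output classes (placeholder shapes).
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),  # convolutional layer
    layers.MaxPooling2D((2, 2)),                                            # pooling layer
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),      # fully connected layer
    layers.Dense(10, activation="softmax"),   # softmax output layer
])

# Supervised training setup: cross-entropy loss with a gradient-based optimizer.
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()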

1.6.2 Long Short-Term Memory Networks (LSTMs)

Long Short-Term Memory (LSTM) networks are a type of recurrent neural network (RNN)
architecture, designed to address the vanishing gradient problem often encountered in
traditional RNNs. LSTMs are particularly well-suited for sequence prediction problems and
have been widely used in tasks such as language modeling, speech recognition, machine
translation, and more.

Here's a breakdown of the key components and concepts of LSTMs:

● Memory Cells: LSTMs contain memory cells that maintain a cell state, which serves
as the long-term memory of the network. This enables LSTMs to retain information over
long sequences, making them effective for capturing temporal dependencies.
● Gates: LSTMs use three gates to control the flow of information into and out of the
memory cell:
● Forget Gate: Determines which information from the cell state should be discarded
or forgotten. It takes input from the previous hidden state and current input, and outputs a
forget gate vector (values between 0 and 1) indicating how much of each cell state element
to forget.
● Input Gate: Determines which new information to store in the cell state. It consists
of two parts:
○ A sigmoid layer that decides which values from the input should be updated.
○ A tanh layer that computes a vector of candidate values, which is combined with the
forget gate output to update the cell state.
● Output Gate: Controls which information from the cell state should be exposed as
the output. It takes input from the previous hidden state and current input, and outputs an
output gate vector that is multiplied element-wise with the cell state.
● Hidden State: LSTMs also have a hidden state that acts as the short-term memory
of the network. It is updated based on the cell state and input at each time step.
● Training: LSTMs are trained using backpropagation through time (BPTT), an
extension of backpropagation that considers the unfolding of the network over time. The
parameters (weights and biases) of the LSTM network are updated to minimize a loss
function, typically using gradient descent-based optimization algorithms like Adam or
RMSProp.
● Bidirectional LSTMs: In some cases, bidirectional LSTMs are used, which process
the input sequence both forwards and backwards. This allows the network to capture
dependencies from both past and future contexts.
● Applications: LSTMs have been successfully applied in various tasks such as:
● Natural Language Processing (NLP): Language modeling, text generation,
sentiment analysis, machine translation.
● Speech Recognition: Phoneme recognition, speech synthesis.
● Time Series Prediction: Stock price prediction, weather forecasting.
● Gesture Recognition: Handwriting recognition, action recognition in videos.
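A minimal Keras sketch of an LSTM applied to a toy sequence-regression task is given below; the random data, layer sizes and epoch count are purely illustrative assumptions:

import numpy as np
from tensorflow.keras import layers, models

# Toy task: 30 time steps of 5 features per sample -> one continuous value.
X = np.random.rand(100, 30, 5).astype("float32")
y = np.random.rand(100, 1).astype("float32")

model = models.Sequential([
    layers.LSTM(64, input_shape=(30, 5)),  # memory cells with forget/input/output gates
    layers.Dense(1),                       # single-value prediction
])
# For the bidirectional variant mentioned above: layers.Bidirectional(layers.LSTM(64))
model.compile(optimizer="adam", loss="mse")

# Training uses backpropagation through time (BPTT) under the hood.
model.fit(X, y, epochs=3, verbose=0)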

1.6.3 Recurrent Neural Networks (RNNs)

Recurrent Neural Networks (RNNs) are a class of neural networks designed to work with
sequential data by maintaining a hidden state that captures information about previous
elements in the sequence. Unlike traditional feedforward neural networks, RNNs have
connections that form directed cycles, allowing them to exhibit temporal dynamics and
handle inputs of variable length.

The key components and concepts of RNNs are as follows:

● Recurrent Connections: RNNs have recurrent connections that enable them to


maintain information over time by feeding the output of a neuron back to itself as input in
the next time step. This loop allows the network to retain memory and process sequences of
arbitrary length.
● Hidden State: At each time step, an RNN computes a hidden state vector based on
the current input and the previous hidden state. This hidden state serves as the memory of
the network and encodes information about past inputs in the sequence.
● Activation Function: Similar to other neural networks, RNNs use activation
functions (e.g., sigmoid, tanh, ReLU) to introduce non-linearity into the model, enabling it
to learn complex patterns in the data.
● Training: RNNs are typically trained using backpropagation through time (BPTT),
which is an extension of the standard backpropagation algorithm for feedforward networks.
BPTT involves unfolding the network over time and computing gradients at each time step
to update the parameters (weights and biases) of the network.
● Vanishing Gradient Problem: One challenge with training RNNs is the vanishing
gradient problem, where gradients become very small as they propagate back through time,
leading to difficulties in learning long-range dependencies. This problem can be mitigated to
some extent using techniques like gradient clipping, using alternative activation functions,
or employing specialized RNN architectures like Long Short-Term Memory (LSTM)
networks or Gated Recurrent Units (GRUs).
● Applications: RNNs have been successfully applied in various tasks involving
sequential data, including:

● Natural Language Processing (NLP): Language modeling, text generation,
sentiment analysis, machine translation.
● Time Series Analysis: Stock market prediction, weather forecasting, physiological
signal analysis.
● Speech Recognition: Phoneme recognition, speech synthesis.
● Video Analysis: Action recognition, video captioning.
● Genomics: DNA sequence analysis, gene expression prediction.

Despite their effectiveness in modeling sequential data, RNNs have limitations such as
difficulties in learning long-range dependencies and challenges in parallelization due to their
sequential nature. However, they serve as the basis for more advanced architectures like
LSTM and GRU, which address some of these limitations while retaining the ability to
process sequential data.
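A minimal sketch of a simple RNN in Keras, again on illustrative random data, shows how a recurrent layer consumes a sequence and produces a single prediction:

import numpy as np
from tensorflow.keras import layers, models

# Toy task: sequences of 20 time steps with 3 features each, one regression target.
X = np.random.rand(64, 20, 3).astype("float32")
y = np.random.rand(64, 1).astype("float32")

model = models.Sequential([
    layers.SimpleRNN(32, input_shape=(20, 3)),  # hidden state carried across time steps
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Gradients are computed by unfolding the network over time (BPTT).
model.fit(X, y, epochs=3, verbose=0)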

1.6.4 Generative Adversarial Networks (GANs)

Generative Adversarial Networks (GANs) are a class of generative models introduced by


Ian Goodfellow and his colleagues in 2014. GANs consist of two neural networks, the
generator and the discriminator, which are trained simultaneously through a minimax game
framework. GANs are particularly powerful in generating realistic synthetic data, such as
images, audio, text, and more, and have found applications in various fields including
computer vision, natural language processing, and drug discovery.

The key components and concepts of GANs are as follows:

● Generator: The generator network takes random noise (often sampled from a simple
distribution like Gaussian or uniform) as input and learns to generate synthetic data samples
that resemble the training data. It transforms the input noise into data samples that are
indistinguishable from real data.
● Discriminator: The discriminator network is a binary classifier that learns to
distinguish between real data samples from the training set and fake data samples generated
by the generator. It is trained to assign high probabilities to real data and low probabilities to
fake data.

● Adversarial Training: GANs are trained using a minimax game framework, where
the generator and discriminator are trained simultaneously in a competitive manner. The
generator aims to generate realistic data samples to fool the discriminator, while the
discriminator aims to distinguish between real and fake data accurately.
● Loss Functions:
○ Generator Loss: The generator's objective is to minimize the probability that the
discriminator correctly classifies its generated samples as fake. This is typically achieved by
maximizing the log-probability of the discriminator making a mistake, effectively
encouraging the generator to produce realistic samples.
○ Discriminator Loss: The discriminator's objective is to correctly classify real and
fake samples. It aims to minimize the sum of the negative log-probabilities assigned to the
correct labels for both real and fake samples.
○ Training Procedure: During training, the generator and discriminator are updated
alternately. First, the discriminator is trained on a batch of real and fake samples. Then, the
generator is trained to generate more realistic samples to better fool the discriminator. This
process continues iteratively until convergence.
● Mode Collapse: One common challenge in training GANs is mode collapse, where
the generator produces limited varieties of samples or ignores certain modes of the data
distribution. Mode collapse can occur when the generator overfits to a small subset of the
training data or when the discriminator becomes too effective at distinguishing fake
samples.
● Applications: GANs have been applied in various domains, including:
● Image Generation: Generating high-resolution images, image-to-image translation,
style transfer.
● Data Augmentation: Generating synthetic data for training machine learning
models in scenarios with limited data.
● Super-Resolution: Upscaling low-resolution images to higher resolutions.
● Anomaly Detection: Identifying anomalies or outliers in data distributions.
● Drug Discovery: Generating novel molecular structures with desired properties.

Overall, GANs have demonstrated remarkable capabilities in generating high-quality


synthetic data and have spurred significant research and development in the field of
generative modeling. However, training GANs can be challenging due to issues like mode
collapse, instability, and hyperparameter sensitivity, which continue to be active areas of
research.
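A compact sketch of the adversarial training procedure described above, using the common Keras pattern of freezing the discriminator inside a combined model; the data distribution, network sizes and step counts are illustrative assumptions:

import numpy as np
from tensorflow.keras import layers, models

latent_dim, data_dim = 8, 4  # illustrative sizes

# Generator: random noise -> synthetic sample.
generator = models.Sequential([
    layers.Dense(16, activation="relu", input_shape=(latent_dim,)),
    layers.Dense(data_dim),
])

# Discriminator: sample -> probability that it is real.
discriminator = models.Sequential([
    layers.Dense(16, activation="relu", input_shape=(data_dim,)),
    layers.Dense(1, activation="sigmoid"),
])
discriminator.compile(optimizer="adam", loss="binary_crossentropy")

# Combined model: trains the generator to fool a frozen discriminator.
discriminator.trainable = False
gan = models.Sequential([generator, discriminator])
gan.compile(optimizer="adam", loss="binary_crossentropy")

real_data = np.random.normal(2.0, 0.5, size=(1000, data_dim)).astype("float32")

for step in range(100):
    # 1) Train the discriminator on a batch of real and fake samples.
    noise = np.random.normal(size=(32, latent_dim)).astype("float32")
    fake = generator.predict(noise, verbose=0)
    real = real_data[np.random.randint(0, len(real_data), 32)]
    discriminator.train_on_batch(real, np.ones((32, 1)))
    discriminator.train_on_batch(fake, np.zeros((32, 1)))
    # 2) Train the generator (through the combined model) so its fakes are labelled "real".
    noise = np.random.normal(size=(32, latent_dim)).astype("float32")
    gan.train_on_batch(noise, np.ones((32, 1)))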

1.6.5 Radial Basis Function Networks (RBFNs)

Radial Basis Function Networks (RBFNs) are a type of artificial neural network that uses
radial basis functions as activation functions. Unlike traditional feedforward neural
networks, which typically use sigmoid or ReLU activation functions, RBFNs have a
different architecture and activation mechanism.

The key components and concepts of RBFNs are as follows:

● Radial Basis Functions (RBFs): The core of RBFNs lies in the radial basis
functions, which are mathematical functions that depend only on the distance from a
reference point (or center). The most commonly used RBF is the Gaussian function, which
is defined as φ(x) = exp(−‖x − c‖² / (2σ²)), where c is the center and σ controls the width of
the function.
● Architecture: RBFNs typically consist of three layers:
● Input Layer: The input layer receives the input features.
● Hidden Layer: The hidden layer contains radial basis functions, each centered at a
specific point in the input space. The output of each RBF neuron in the hidden layer is
computed based on the distance between the input vector and the center of the RBF.
● Output Layer: The output layer combines the outputs of the hidden layer neurons to
produce the final output. This layer often performs a linear combination of the outputs from
the hidden layer neurons.
● Training: Training RBFNs involves two main steps:
○ Centroid Selection: The centers of the radial basis functions need to be
determined. This can be done using clustering algorithms such as k-means clustering,
where the centers are initialized randomly and then adjusted iteratively to minimize
the overall distance between data points and their assigned cluster centers.
○ Weight Adjustment: Once the centers are fixed, the weights between the
hidden and output layers are adjusted using techniques like least squares regression or
gradient descent to minimize the error between the network's output and the desired
output.
● Advantages:
○ RBFNs are particularly effective for function approximation tasks,
interpolation, and pattern recognition.
○ They have a simple and interpretable architecture.
○ RBFNs can provide smooth and continuous outputs due to the use of radial
basis functions.
● Disadvantages:
○ Determining the optimal number and positions of radial basis functions can be
challenging.
○ RBFNs may suffer from overfitting, especially when the number of radial basis
functions is too large relative to the size of the training data.
● Applications: RBFNs have been used in various fields, including:
○ Function approximation and interpolation.
○ Time series prediction.
○ Classification and pattern recognition.
○ Financial forecasting.

Overall, RBFNs offer an alternative approach to traditional neural networks, particularly


suitable for tasks where smooth interpolation or function approximation is desired.
However, they require careful tuning of hyperparameters and may not always generalize
well to unseen data.
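A minimal sketch of the two RBFN training steps, centroid selection with k-means followed by least-squares weight fitting, assuming NumPy and scikit-learn are available (the toy data and Gaussian width are illustrative):

import numpy as np
from sklearn.cluster import KMeans

# Toy one-dimensional function-approximation task (illustrative data only).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)

# Step 1: centroid selection with k-means clustering.
n_centers, sigma = 10, 1.0
centers = KMeans(n_clusters=n_centers, n_init=10, random_state=0).fit(X).cluster_centers_

def rbf_features(X):
    # Gaussian RBF hidden layer: phi_j(x) = exp(-||x - c_j||^2 / (2 * sigma^2)).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2 * sigma ** 2))

# Step 2: weight adjustment between hidden and output layer by least squares.
Phi = rbf_features(X)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)

y_pred = rbf_features(X) @ w
print("training MSE:", np.mean((y - y_pred) ** 2))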

Chapter 2 LITERATURE REVIEW
1) CRY—an improved crop yield prediction model using bee hive clustering
approach for agricultural data sets
Agricultural researchers across the world insist on the need for an efficient mechanism to
predict and improve crop growth. The need for integrated crop growth control with an
accurate predictive yield management methodology is strongly felt among the farming
community. The complexity of predicting crop yield is largely due to multi-dimensional
variable metrics and the unavailability of a predictive modeling approach, which leads to
loss in crop yield. This research paper suggests a crop yield prediction model (CRY) which
works on an adaptive cluster approach over a dynamically updated historical crop data set
to predict the crop yield and improve decision making in precision agriculture. CRY uses a
bee hive modeling approach to analyze and classify crops based on crop growth pattern and
yield. The CRY-classified dataset was tested using Clementine against existing crop domain
knowledge. The results show the performance of CRY compared with other clustering
approaches.

2) An intelligent system based on kernel methods for crop yield


prediction

This paper presents work on developing a software system for predicting crop yield from
climate and plantation data. At the core of this system is a method for unsupervised
partitioning of data for finding spatio-temporal patterns in climate data using kernel
methods which offer strength to deal with complex data. For this purpose, a robust
weighted kernel k-means algorithm incorporating spatial constraints is presented. The
algorithm can effectively handle noise, outliers and auto-correlation in the spatial data,
for effective and efficient data analysis, and thus can be used for predicting oil-palm
yield by analyzing various factors affecting the yield.

3) Fuzzy Logic based Crop Yield Prediction using Temperature and Rainfall parameters
predicted through ARMA, SARIMA, and ARMAX models
Agriculture plays a significant role in the economy of India. This makes crop yield
prediction an important task to help boost India's growth. Crops are sensitive to various
weather phenomena such as temperature and rainfall. Therefore, it becomes crucial to
include these features when predicting the yield of a crop. Weather forecasting is a
complicated process. In this work, three methods are used to forecast: ARMA (Auto
Regressive Moving Average), SARIMA (Seasonal Auto Regressive Integrated Moving
Average) and ARMAX (ARMA with exogenous variables). The performance of the three
is compared and the best model is used to predict rainfall and temperature which are in
turn used to predict the crop yield based on a fuzzy logic model.
4) Crop Yield Prediction Using Data Analytics and Hybrid Approach
Agricultural data is being produced constantly and in enormous volumes. As a result,
agricultural data has entered the era of big data. Smart technologies contribute to data
collection using electronic devices. In our project we are going to analyse and mine this
agricultural data to get useful results using technologies like data analytics and machine
learning, and these results will be given to farmers for better crop yield.
5) A study on various data mining techniques for crop yield prediction
India is a country where agriculture and agriculture-related industries are the major
source of living for the people. Agriculture is a major source of the country's economy. It is
also one of the countries which suffer from major natural calamities like drought or flood,
which damage the crop. This leads to huge financial losses for the farmers, sometimes even
leading to suicide. Predicting the crop yield well in advance, prior to its harvest, can help
farmers and government organizations to make appropriate plans for storing, selling, fixing
the minimum support price, importing/exporting, etc. Predicting a crop well in advance
requires a systematic study of huge data coming from various variables like soil quality,
pH, EC, N, P, K, etc. As crop prediction deals with large databases, this prediction system is
a perfect candidate for the application of data mining. Through data mining we extract
knowledge from huge volumes of data. This paper presents a study of the various data
mining techniques used for predicting crop yield. The success of any crop yield prediction
system heavily relies on how accurately the features have been extracted and how
appropriately classifiers have been employed. This paper summarizes the results obtained
by various algorithms used by various authors for crop yield prediction, with their accuracy
and recommendations.

6) Crop Yield Prediction Using CNN and RNN model


Evaluating nutrition, health, and development initiatives, as well as guiding government
policy across many sectors, all rely on accurate measurements of food security. This is
also true for directing food and economic relief, providing early famine warning, and
supporting global monitoring networks. The wide variety of methods and instruments
available for gauging food security makes this vital endeavor more difficult. As a result,
we've compiled a review of food security assessment methods, including the
aforementioned difficulties of nomenclature, measurement, and validity. We start with a
discussion of the changing meaning of food security and use it as a framework for a look
at the state of the art in terms of existing measuring techniques for gauging food security.
We evaluate the goals of these instruments, the areas of food security they measure, the
underlying conceptualizations of food security, and the methods employed to validate
them. We detail methods for measuring things like 1) the availability of food in a
country, 2) the effectiveness of food distribution networks, 3) the ease with which
individuals may obtain and consume food, and 4) how much food is wasted. We also outline
a variety of unresolved measurement issues that could be investigated in subsequent
research.

7) Syngenta Crop challenge Using CNN and RNN model.


The training data included three sets: crop genotype, yield performance, and environment
(weather and soil). The genotype dataset contained genetic information for all
experimental hybrids, each having 19,465 genetic markers. The yield performance
dataset contained the observed yield, check yield (average yield across all hybrids of the
same location), and yield difference of 148,452 samples for different hybrids planted in
different years and locations. Yield difference is the difference between yield and check
yield, and indicates the relative performance of a hybrid against other hybrids at the same
location (Marko et al., 2017). The environment dataset contained 8 soil variables and 72
weather variables (6 weather variables measured for 12 months of each year). The soil
variables included percentage of clay, silt and sand, available water capacity (AWC), soil
pH, organic matter (OM), cation-exchange capacity (CEC), and soil saturated hydraulic
conductivity (KSAT). The weather data provided in the 2018 Syngenta Crop Challenge
were normalized and anonymized. Based on the pattern of the data, we hypothesized that
they included day length, precipitation, solar radiation, vapor pressure, maximum
temperature, and minimum temperature. Part of the challenge was to predict the 2017
weather variables and use them for yield prediction of the same year.

The goal of the 2018 Syngenta Crop Challenge was to predict the performance of corn hybrids
in 2017, but the ground truth response variables for 2017 were not released after the
competition. In this paper, we used the 2001–2015 data and part of the 2016 data as the
training dataset (containing 142,952 samples) and the remaining part of the 2016 data as
the validation dataset (containing 5,510 samples). All validation samples were unique
combinations of hybrids and locations, which did not have any overlap with training
data.

2.3 Model Architecture


Due to the nonlinearity and complexity of the features, it is important to build a deep
learning framework for yield prediction. Inspired by the success of CNN and RNN, a
CNN-LSTM network was proposed in the study, which mainly consists of
2-Dimensional Convolutional neural networks (Conv2D) and LSTM networks [55].
CNN can learn the relevant features from an image at different levels similar to a human
brain. An LSTM has the capability of bridging long time lags between inputs over
arbitrary time intervals. The use of LSTM improves the efficiency of depicting temporal
patterns at various frequencies, which is a desirable feature in the analysis of crop
growing cycles with different lengths. The architecture of the proposed CNN-LSTM is
shown in Figure 3. The inputs of the model are the tensors generated from the proposed
GEE-based framework. The output is the predicted soybean yield. Different from
traditional pure CNN or pure LSTM architectures, the proposed model mainly has two
components: The first is CNN used for feature extraction, and the second is LSTM,
which is used to learn the temporal features extracted by the CNN. The CNN starts with two
Conv2D layers with a kernel size of 1×2; the first Conv2D has 32 filters and the second
has 64. Feature maps are first followed by a batch normalization layer and
then followed by a 2-dimensional max-pooling layer with 1×2 kernel. This improves
generalization, robustness to small distortions, and also reduces dimensionality. Note that
batch normalization is employed after each convolutional layer, which is another method
to regularize a CNN. In addition to providing a regularizing effect, batch normalization
also gives CNN resistance to the vanishing gradient during training, which can decrease
training time and result in better performance [56]. By the TimeDistributed wrapper, the
two stacked Conv2D layers are applied to every temporal slice of the inputs for feature
extraction. Then, each output is flattened and batch normalized successively before they
are fed into an LSTM layer. There is only one LSTM layer in the LSTM part. The number
of LSTM neurons is set to 256, which is followed by a dense layer with 64
neurons. After that, all temporal output is flattened into a long vector, which is sent into a
Dropout layer with 0.5 dropout probability; the dropout layer can randomly turn off a
percentage of neurons during training, which can help prevent groups of neurons from all
overfitting to themselves. Finally, a one neuron dense layer is used to output predicted
yield.
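A sketch of how the described TimeDistributed Conv2D + LSTM stack could be assembled in Keras is shown below; the input tensor dimensions are placeholders, since the actual shapes of the GEE-derived tensors are not reproduced here:

from tensorflow.keras import layers, models
from tensorflow.keras.metrics import RootMeanSquaredError

# Placeholder input: 10 temporal slices, each a 1x16 grid with 8 channels.
inputs = layers.Input(shape=(10, 1, 16, 8))

# TimeDistributed applies the stacked Conv2D feature extractor to every temporal slice.
x = layers.TimeDistributed(layers.Conv2D(32, (1, 2), activation="relu"))(inputs)
x = layers.TimeDistributed(layers.BatchNormalization())(x)
x = layers.TimeDistributed(layers.Conv2D(64, (1, 2), activation="relu"))(x)
x = layers.TimeDistributed(layers.BatchNormalization())(x)
x = layers.TimeDistributed(layers.MaxPooling2D((1, 2)))(x)
x = layers.TimeDistributed(layers.Flatten())(x)
x = layers.BatchNormalization()(x)

# One LSTM layer (256 units) learns the temporal patterns of the extracted features.
x = layers.LSTM(256, return_sequences=True)(x)
x = layers.Dense(64, activation="relu")(x)
x = layers.Flatten()(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1)(x)  # single-neuron output: the predicted yield

model = models.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse", metrics=[RootMeanSquaredError()])
model.summary()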

2.4 OBJECTIVES
This project aims at predicting the crop yield under particular weather conditions and
thereby recommending suitable crops for that field. It involves the following steps:
● Collect the weather data, crop yield data, soil type data and the rainfall data and
merge these datasets in a structured form and clean the data. Data cleaning is done to
remove inaccurate, incomplete and unreasonable data, which increases the quality of the
data and hence the overall productivity.
● Perform Exploratory Data Analysis (EDA) that helps in analyzing the complete
dataset and summarizing the main characteristics. It is used to discover patterns, spot
anomalies and to get graphical representations of various attributes. Most importantly,
it tells us the importance of each attribute, the dependence of each attribute on the
class attribute and other crucial information.
● Divide the analysed crop data into training and testing sets and train the model
using the training data to predict the crop yield for given inputs.
● Compare various algorithms by passing the analysed dataset through them and
calculating the error rate and accuracy for each; choose the algorithm with the highest
accuracy and lowest error rate (a minimal sketch of this step follows the list).
● Implement a system in the form of a mobile application and integrate the algorithm
at the back end.
● Test the implemented system to check for accuracy and failures.
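A minimal sketch of the train/test split and algorithm-comparison steps listed above, using scikit-learn on synthetic data with hypothetical column names (the actual dataset, features and candidate algorithms may differ):

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

# Hypothetical merged dataset: weather features plus a crop yield column.
rng = np.random.default_rng(42)
df = pd.DataFrame({
    "min_temp": rng.uniform(5, 25, 500),
    "max_temp": rng.uniform(25, 45, 500),
    "humidity": rng.uniform(20, 100, 500),
    "wind_speed": rng.uniform(0, 15, 500),
    "pressure": rng.uniform(980, 1030, 500),
})
df["crop_yield"] = 2 * df["humidity"] - df["max_temp"] + rng.normal(0, 5, 500)

# Split the cleaned, analysed data into training and testing sets.
X = df.drop(columns=["crop_yield"])
y = df["crop_yield"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Compare candidate algorithms by RMSE on the held-out test set.
for name, model in [("Linear Regression", LinearRegression()),
                    ("Random Forest", RandomForestRegressor(random_state=42))]:
    model.fit(X_train, y_train)
    rmse = np.sqrt(mean_squared_error(y_test, model.predict(X_test)))
    print(f"{name}: RMSE = {rmse:.2f}")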

2.5 Model Data Points:


• Identifying Trends: Look for common methodologies or data types to see what's
popular in the field or perhaps overly relied on.
• Spotting Gaps: Focus on the 'Research Gaps' column to identify where there are
consistent issues or shortcomings across different studies. This can indicate where new
research could be most valuable.
• Guiding Research: Use the identified gaps to propose new research, extending existing
findings to different crops, regions, or data types.


CHAPTER 3
Tools & Technology

​3.1 SYSTEM REQUIREMENTS:

• 3.1.1 HARDWARE REQUIREMENTS:


• System : Intel i9 4900K
• Hard Disk : 2 TB NVMe M.2 SSD
• Floppy Drive : 1.44 MB
• Monitor : LG 49WL95C 49-inch Curved 32:9 UltraWide Monitor
• Mouse : Logitech MX Master 3
• RAM : 24 GB 6000 MHz DDR5

• 3.1.2 SOFTWARE REQUIREMENTS:

□ Operating system : Windows 11.


□ Coding Language : Python
□ Database : MySQL

​3.2 SOFTWARE ENVIRONMENT

​3.2.1 Python:

Python is a high-level, interpreted, interactive and object-oriented scripting language.


Python is designed to be highly readable. It uses English keywords frequently, whereas
other languages use punctuation, and it has fewer syntactical constructions than other
languages.

• Python is Interpreted −Python is processed at runtime by the interpreter. You


do not need to compile your program before executing it. This is similar to PERL and
PHP.

• Python is Interactive −You can sit at a Python prompt and interact with the
interpreter directly to write your programs.

• Python is Object-Oriented −Python supports Object-Oriented style or


technique of programming that encapsulates code within objects.

• Python is a Beginner's Language −Python is a great language for the


beginner-level programmers and supports the development of a wide range of
applications from simple text processing to WWW browsers to games.

3.2.2 History of Python


● Python was developed by Guido van Rossum in the late eighties and early
nineties at the National Research Institute for Mathematics and Computer
Science in the Netherlands.
● Python is derived from many other languages, including ABC, Modula-3, C,
C++, Algol-68, Small Talk, and Unix shell and other scripting languages.
● Python is copyrighted. Like Perl, Python source code is now available under the
GNU General Public License (GPL).
● Python is now maintained by a core development team at the institute, although
Guido van Rossum still holds a vital role in directing its progress.

3.2.3 Python Features


Python's features include−
● Easy-to-learn −Python has few keywords, simple structure, and a clearly defined
syntax. This allows the student to pick up the language quickly.
● Easy-to-read −Python code is more clearly defined and visible to the eyes.
● Easy-to-maintain −Python's source code is fairly easy-to-maintain.
● A broad standard library −Python's bulk of the library is very portable and
cross- platform compatible on UNIX, Windows, and Macintosh.
● Interactive Mode −Python has support for an interactive mode which allows
interactive testing and debugging of snippets of code.
● Portable −Python can run on a wide variety of hardware platforms and has the
same interface on all platforms.
● Extendable −You can add low-level modules to the Python interpreter. These
modules enable programmers to add to or customize their tools to be more
efficient.
● Databases −Python provides interfaces to all major commercial databases.
● GUI Programming −Python supports GUI applications that can be created and
ported to many system calls, libraries and windows systems, such as Windows
MFC, Macintosh, and the X Window system of Unix.
● Scalable −Python provides a better structure and support for large programs than
shell scripting.

● Apart from the above-mentioned features, Python has a big list of good features; a
few are listed below−
● It supports functional and structured programming methods as well as OOP.
● It can be used as a scripting language or can be compiled to byte-code for
building large applications.

3.2.4 Script Mode Programming


Invoking the interpreter with a script parameter begins execution of the script and
continues until the script is finished. When the script is finished, the interpreter is no
longer active.

Let us write a simple Python program in a script. Python files have the extension .py.
Type the following source code in a test.py file:
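A minimal test.py could contain the following; this is the standard first-program example, shown here as an assumption since the original listing is not reproduced in this report:

# test.py: a minimal first Python script
print("Hello, Python!")

The script is then run from the command prompt as: python test.py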

3.3 Flask Framework:


Flask is a web application framework written in Python. Armin Ronacher, who leads
an international group of Python enthusiasts named Pocco, develops it. Flask is based
on Werkzeug WSGI toolkit and Jinja2 template engine. Both are Pocco projects.

The HTTP protocol is the foundation of data communication on the World Wide Web.
Different methods of data retrieval from a specified URL are defined in this protocol.

The following table summarizes the different HTTP methods:

Sr. No.   Method    Description
1         GET       Sends data in unencrypted form to the server. The most common method.
2         HEAD      Same as GET, but without the response body.
3         POST      Used to send HTML form data to the server. Data received by the POST method is not cached by the server.
4         PUT       Replaces all current representations of the target resource with the uploaded content.
5         DELETE    Removes all current representations of the target resource given by a URL.
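As a minimal sketch of how these HTTP methods are handled in Flask, the route below accepts GET and POST requests; the endpoint name and form field names are hypothetical and are not taken from the project's actual code:

from flask import Flask, request, jsonify

app = Flask(__name__)

# Hypothetical endpoint: GET describes the interface, POST would accept the
# weather parameters and return a predicted yield (the model call is omitted here).
@app.route("/predict", methods=["GET", "POST"])
def predict():
    if request.method == "POST":
        fields = ("min_temp", "max_temp", "humidity", "wind_speed", "pressure")
        features = {name: request.form.get(name) for name in fields}
        return jsonify({"received": features, "predicted_yield": None})
    return "Send weather parameters via POST to get a prediction."

if __name__ == "__main__":
    app.run(debug=True)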

3.4 Tools Used


3.4.1 Visual Studio Code (VS Code):

Visual Studio Code, also commonly referred to as VS Code, is a source-code editor
made by Microsoft with the Electron Framework, for Windows, Linux and macOS.
Features include support for debugging, syntax highlighting, intelligent code
completion, snippets, code refactoring, and embedded Git. Users can change the
theme, keyboard shortcuts, preferences, and install extensions that add additional
functionality.

In the Stack Overflow 2021 Developer Survey, Visual Studio Code was ranked the
most popular developer environment tool among 82,000 respondents, with 70%
reporting that they use it.

Fig 6 VSCODE

3.4.2 Features:
● Visual Studio Code is a source-code editor that can be used with a variety of
programming languages, including C# , Java, JavaScript, Go, Node.js, Python,
C++, C, Rust and Fortran. It is based on the Electron framework, which is used to
develop Node.js web applications that run on the Blink layout engine. Visual
Studio Code employs the same editor component (codenamed "Monaco") used in
Azure DevOps (formerly called Visual Studio Online and Visual Studio Team
Services).
● Out of the box, Visual Studio Code includes basic support for most common
programming languages. This basic support includes syntax highlighting, bracket
matching, code folding, and configurable snippets. Visual Studio Code also ships
with IntelliSense for JavaScript, TypeScript, JSON, CSS and HTML, as well as
debugging support for Node.js. Support for additional languages can be provided
by freely available extensions on the VS Code Marketplace.
● Instead of a project system, it allows users to open one or more directories, which
can then be saved in workspaces for future use. This allows it to operate as a
language-agnostic code editor for any language. It supports many programming
languages and a set of features that differs per language. Unwanted files and
folders can be excluded from the project tree via the settings. Many Visual Studio
Code features are not exposed through menus or the user interface but can be
accessed via the command palette.
● Visual Studio Code can be extended via extensions available through a central
repository. This includes additions to the editor and language support. A notable
feature is the ability to create extensions that add support for new languages,
themes, debuggers, time travel debuggers, perform static code analysis, and add
code linters using the Language Server Protocol.
● Source control is a built-in feature of Visual Studio Code. It has a dedicated tab inside the menu bar where users can access version control settings and view changes made to the current project. To use the feature, Visual Studio Code must be linked to a supported version control system (Git, Apache Subversion, Perforce, etc.). This allows users to create repositories as well as to make push and pull requests directly from the Visual Studio Code program.

● Visual Studio Code includes multiple extensions for FTP, allowing the software to be used as a free alternative for web development. Code can be synced between the editor and the server, without downloading any extra software.

Chapter 4
METHODOLOGY

​4.1 SYSTEM DESIGN


4.1.1 SYSTEM ARCHITECTURE:

Fig. 11 SYSTEM ARCHITECTURE

4.2 DATA FLOW DIAGRAM:


● The DFD is also called a bubble chart. It is a simple graphical formalism that can be used to represent a system in terms of the input data to the system, the various processing carried out on this data, and the output data generated by the system.
● The data flow diagram (DFD) is one of the most important modeling tools. It is used to model the system components. These components are the system process, the data used by the process, the external entities that interact with the system and the information flows in the system.

● DFD shows how the information moves through the system and how it is modified by a series of transformations. It is a graphical technique that depicts information flow and the transformations that are applied as data moves from input to output.
● A DFD may be used to represent a system at any level of abstraction, and it may be partitioned into levels that represent increasing information flow and functional detail.

Fig. 12 DFD Diagram

4.3 UML DIAGRAMS


● UML stands for Unified Modelling Language. UML is a standardized general-purpose modelling language in the field of object-oriented software engineering. The standard is managed, and was created by, the Object Management Group.
● The goal is for UML to become a common language for creating models of object-oriented computer software. In its current form UML comprises two major components: a meta-model and a notation. In the future, some form of method or process may also be added to, or associated with, UML.
● The Unified Modelling Language is a standard language for specifying, visualizing, constructing and documenting the artefacts of software systems, as well as for business modelling and other non-software systems.
● The UML represents a collection of best engineering practices that have proven successful in the modelling of large and complex systems.
● The UML is a very important part of developing object-oriented software and the software development process. The UML uses mostly graphical notations to express the design of software projects.

4.4 GOALS:
The primary goals in the design of the UML are as follows:
● Provide users a ready-to-use, expressive visual modeling language so that they can develop and exchange meaningful models.
● Provide extendibility and specialization mechanisms to extend the core concepts.
● Be independent of particular programming languages and development processes.
● Provide a formal basis for understanding the modeling language.
● Encourage the growth of the OO tools market.
● Support higher-level development concepts such as collaborations, frameworks, patterns and components.
● Integrate best practices.

4.5 USE CASE DIAGRAM


A use case diagram in the Unified Modeling Language (UML) is a type of behavioral diagram defined by and created from a use-case analysis. Its purpose is to present a graphical overview of the functionality provided by a system in terms of actors, their goals (represented as use cases), and any dependencies between those use cases. The main purpose of a use case diagram is to show what system functions are performed for which actor, and the roles of the actors in the system can be depicted.

Fig. 13 Use Case Diagram


4.6 CLASS DIAGRAM:
In software engineering, a class diagram in the Unified Modelling Language (UML)
is a type of static structure diagram that describes the structure of a system by
showing the system's classes, their attributes, operations (or methods), and the
relationships among the classes. It explains which class contains information.

Fig. 14 Class Diagram

4.7 SEQUENCE DIAGRAM:


A sequence diagram in Unified Modelling Language (UML) is a kind of interaction
diagram that shows how processes operate with one another and in what order. It is a
construct of a Message Sequence Chart. Sequence diagrams are sometimes called event
diagrams, event scenarios, and timing diagrams.

Sequence Diagrams show elements as they interact over time and they are organized
according to object (horizontally) and time (vertically):

Fig. 15 Sequence Diagram

4.8 COLLABORATION DIAGRAM:


A collaboration diagram, also known as a communication diagram, is a type of
interaction diagram in Unified Modeling Language (UML) used to visualize the
interactions and relationships between objects in a system or a scenario. It illustrates
how objects communicate with each other to accomplish a specific task or scenario.

Fig. 16 Collaboration Diagram

Chapter 5 IMPLEMENTATION AND RESULT

5.1 MODULES:

● Data Collection
● Dataset
● Data Preparation
● Model Selection
● Analyze and Prediction
● Accuracy on test set
● Saving the Trained Model

5.2 MODULES DESCRIPTION:

5.2.1 Data Collection:


● This is the first real step towards the development of a machine learning model: collecting data. This is a critical step that determines how good the model will be; the more and better data we get, the better our model will perform.
● There are several techniques to collect the data, such as web scraping and manual intervention.
● The dataset is attached with this document; the file name is crop_modified.csv.

5.2.2 Dataset:
● The dataset consists of 2,232 individual records. There are 9 columns in the dataset, which are described below.
● Id: unique id.
● State_Name: India is a union of States and Union Territories; for the purposes of administration, India is divided into 29 States.
● District_Name: district in India.
● Crop_Year: the year in which the crop was harvested.
● Season: winter (December to February), summer or pre-monsoon (March to May), monsoon or rainy season (June to September), and post-monsoon or autumn (October to November).
● Crop: crop name.
● Area: the area harvested.
● Production: production amount.
● cat_crop: crop category name.
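A short sketch of how this dataset could be loaded and inspected with pandas is given below; the file name crop_modified.csv is taken from the data collection step above, while the exact column spellings may differ in the actual file.

import pandas as pd

# Load the crop dataset described above
df = pd.read_csv("crop_modified.csv")

print(df.shape)               # roughly 2232 rows and 9 columns, as described above
print(df.columns.tolist())    # Id, State_Name, District_Name, Crop_Year, Season, Crop, Area, Production, cat_crop
print(df["Season"].unique())  # season labels (spellings may differ in the file)
print(df.head())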

5.2.3 Data Preparation:

We will transform the data by getting rid of missing values and removing some columns. First, we create a list of the column names that we want to keep or retain.

Next, we drop or remove all columns except the columns that we want to retain.

Finally, we drop or remove the rows that have missing values from the dataset.
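A minimal sketch of this preparation step, assuming the column names listed in the dataset description above, could be:

import pandas as pd

df = pd.read_csv("crop_modified.csv")

# Columns we want to keep or retain (assumed names from the dataset description)
keep_columns = ["State_Name", "District_Name", "Crop_Year",
                "Season", "Crop", "Area", "Production"]

# Drop every column that is not in the keep list
df = df[keep_columns]

# Drop the rows that have missing values
df = df.dropna().reset_index(drop=True)

print(df.shape)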

5.2.4 Model Selection:


The genotype data were coded in {−1, 0, 1} values, respectively representing the aa, aA, and AA alleles. Approximately 37% of the genotype data had missing values. To address this issue, we used a two-step approach to preprocess the genotype data before they could be used by the neural network model. First, we used a 97% call rate to discard genetic markers whose non-missing values were below this call rate. Then we also discarded genetic markers whose least frequent allele's frequency was below 1%, since these markers were less heterozygous and therefore less informative. As a result, we reduced the number of genetic markers from 19,465 to 627. To impute the missing data in the remaining part of the genotype data, we tried multiple imputation techniques, including mean, median, and most frequent (Allison, 2001), and found that the median approach led to the most accurate predictions. The yield and environment datasets were complete and did not have missing data.
Once the model is trained, we need to test it. For that we will pass test_x to the predict method.
We trained two deep neural networks, one for yield and the other for check yield, and then used the difference of their outputs as the prediction for yield difference. These models are illustrated in Figure 3. This model structure was found to be more effective than using one single neural network for yield difference, because the genotype and environment effects are more directly related to the yield and check yield than to their difference.
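A rough sketch of the two-step preprocessing described above is shown below, using synthetic data and scikit-learn's SimpleImputer; the thresholds follow the text, but the data and coding details are purely illustrative.

import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# geno: markers coded {-1, 0, 1} with NaN for missing values (synthetic, illustrative data)
rng = np.random.default_rng(0)
geno = pd.DataFrame(rng.choice([-1.0, 0.0, 1.0, np.nan], size=(500, 1000),
                               p=[0.33, 0.33, 0.32, 0.02]))

# Step 1: keep markers whose call rate (fraction of non-missing values) is at least 97%
call_rate = geno.notna().mean(axis=0)
geno = geno.loc[:, call_rate >= 0.97]

# Step 2: discard markers whose least frequent allele's frequency is below 1%
p = (geno + 1).mean(axis=0) / 2.0   # estimated frequency of the 'A' allele
maf = np.minimum(p, 1 - p)
geno = geno.loc[:, maf >= 0.01]

# Step 3: impute the remaining missing values with the column median
imputer = SimpleImputer(strategy="median")
geno_imputed = imputer.fit_transform(geno)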

5.2.5 Convolutional Neural Networks (CNN)


● Data Preparation: Ensure your dataset includes images of crops or related visual
data. Each image should be associated with a label indicating the type of crop or
some relevant information.
● Data Preprocessing: Preprocess your image data, which may include resizing
images to a standard size, normalizing pixel values, and augmenting the dataset
with techniques like rotation, flipping, or zooming to increase data diversity.
● Model Architecture: Define your CNN architecture using libraries like
TensorFlow or Keras. Typically, a CNN consists of convolutional layers, pooling
layers, and fully connected layers. You can design a suitable architecture based
on the complexity of your task and the characteristics of your data.
● Training: Train your CNN model on the prepared dataset. Use techniques like
batch training and validation to monitor the model's performance and prevent
overfitting. You can train the model using GPUs to speed up the process if
available.
● Evaluation: Evaluate the trained model on a separate test dataset to assess its
performance metrics such as accuracy, precision, recall, and F1-score. Adjust the
model hyperparameters or architecture as needed to improve performance.
● Deployment: Once satisfied with the model's performance, deploy it to make
predictions on new, unseen data. You can integrate the trained CNN model into
your existing application or create a separate inference pipeline for making
predictions.
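A compact Keras sketch of such a CNN (the image size, number of classes, and layer sizes are illustrative assumptions, not values taken from this project) is given below.

import tensorflow as tf
from tensorflow.keras import layers, models

num_classes = 10   # assumed number of crop / disease classes

model = models.Sequential([
    layers.Input(shape=(128, 128, 3)),        # assumed 128x128 RGB input images
    layers.Rescaling(1.0 / 255),              # normalize pixel values
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(num_classes, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()

# model.fit(train_images, train_labels, validation_split=0.2, epochs=10, batch_size=32)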

5.2.6 Feedforward Neural Networks (FFNN)


Using feedforward neural networks (FFNN) involves several key steps, from data preparation to model evaluation. Here's a comprehensive guide on how to use an FFNN in this project:

● Data Collection: Gather the dataset relevant to your problem domain. Ensure the data is appropriately labeled.
● Data Preprocessing: Clean the data by handling missing values, outliers, and
formatting inconsistencies. Normalize or scale the features to ensure uniformity
in data distribution.
● Train-Test Split: Divide the dataset into training and testing sets. The training set
is used to train the model, while the testing set evaluates its performance on
unseen data.
● Define Architecture: Design the FFNN architecture, including the number of
layers, the number of neurons in each layer, and the activation functions.
● Compile Model: Compile the model by specifying the loss function, optimizer,
and evaluation metrics.
● Fit Model: Train the FFNN model using the training dataset. Adjust the
hyperparameters like batch size and number of epochs to optimize performance.
● Monitor Training: Monitor the training process using metrics such as loss and
accuracy. Utilize techniques like early stopping to prevent overfitting.
● Evaluate Performance: Evaluate the trained model on the test dataset to assess
its generalization performance. Measure metrics like accuracy, precision, recall,
and F1-score.
● Visualize Results: Visualize the model's performance using plots like confusion
matrices or ROC curves.
● Hyperparameter Tuning: Experiment with different hyperparameters to
fine-tune the model's performance. Use techniques like grid search or random
search for optimization.
● Regularization: Apply regularization techniques like L1/L2 regularization or
dropout to prevent overfitting.
● Deploy Model: Deploy the trained FFNN model in a production environment to
make predictions on new data.
● Integration: Integrate the model into your application or system for real-time
inference.
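Following the steps listed above, a minimal feedforward network for tabular crop data could be sketched as follows; the placeholder arrays, feature count, layer sizes and the regression objective are assumptions made for illustration only.

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Placeholder arrays standing in for the encoded crop features and the production target
X = np.random.rand(500, 6).astype("float32")
y = np.random.rand(500).astype("float32")

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Scale the features for uniformity in data distribution
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Define the FFNN architecture
model = models.Sequential([
    layers.Input(shape=(X_train.shape[1],)),
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.2),        # regularization to reduce overfitting
    layers.Dense(32, activation="relu"),
    layers.Dense(1),            # single output for the predicted yield
])

# Compile with a loss, optimizer and evaluation metric
model.compile(optimizer="adam", loss="mse", metrics=["mae"])

# Fit with early stopping to monitor training and prevent overfitting
history = model.fit(X_train, y_train, validation_split=0.2, epochs=50, batch_size=32,
                    callbacks=[tf.keras.callbacks.EarlyStopping(patience=5)], verbose=0)

print(model.evaluate(X_test, y_test, verbose=0))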

5.2.7 Analyze and Prediction:
In the actual dataset, we chose only 6 features:
● State_Name: India is a union of States and Union Territories; for the purposes of administration, India is divided into 29 States.
● District_Name: district in India.
● Crop_Year: the year in which the crop was harvested.
● Season: winter (December to February), pre-monsoon or summer (March to May), monsoon or rainy season (June to September), and post-monsoon or autumn (October to November).
● Crop: crop name.
● Area: the area harvested.

Accuracy on test set:
We got an accuracy of 80.1% on the test set.
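Continuing the FFNN sketch from Section 5.2.6, the snippet below shows how test-set predictions and summary metrics could be obtained and the trained model saved; the variable and file names are assumptions, and the 80.1% figure above comes from this project's experiments, not from this snippet.

from sklearn.metrics import r2_score, mean_absolute_error

# Predict on the held-out test set
y_pred = model.predict(X_test).ravel()

# For a regression target, metrics such as R^2 and MAE are reported;
# for a classification-style accuracy, predictions would be compared against class labels.
print("R^2:", r2_score(y_test, y_pred))
print("MAE:", mean_absolute_error(y_test, y_pred))

# Saving the trained model for later use, e.g. by the Flask application (assumed file name)
model.save("crop_yield_model.h5")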

5.3 INPUT DESIGN AND OUTPUT DESIGN

5.3.1 INPUT DESIGN

The input design is the link between the information system and the user. It comprises the specifications and procedures for data preparation, that is, the steps necessary to put transaction data into a usable form for processing. This can be achieved by having the computer read data from a written or printed document, or by having people key the data directly into the system. The design of input focuses on controlling the amount of input required, controlling errors, avoiding delay, avoiding extra steps and keeping the process simple. The input is designed in such a way that it provides security and ease of use while retaining privacy.
Input Design considered the following things:

□ What data should be given as input?

□ How the data should be arranged or coded?

□ The dialog to guide the operating personnel in providing input.

□ Methods for preparing input validations and steps to follow when errors occur.

5.3.2 OBJECTIVES
Input design is the process of converting a user-oriented description of the input into a computer-based system. This design is important to avoid errors in the data input process and to show the correct direction to the management for getting correct information from the computerized system.

1. It is achieved by creating user-friendly screens for data entry that can handle large volumes of data. The goal of designing input is to make data entry easier and free from errors. The data entry screen is designed in such a way that all data manipulations can be performed. It also provides record-viewing facilities.

2. When the data is entered, it will be checked for validity. Data can be entered with the help of screens. Appropriate messages are provided as and when needed so that the user is not left confused at any instant. Thus the objective of input design is to create an input layout that is easy to follow.

5.4 OUTPUT DESIGN

A quality output is one which meets the requirements of the end user and presents the information clearly. In any system, the results of processing are communicated to the users and to other systems through outputs. In output design it is determined how the information is to be displayed for immediate need and also as hard copy output. It is the most important and direct source of information to the user. Efficient and intelligent output design improves the system's relationship with the user and helps in decision-making.

1. Designing computer output should proceed in an organized, well thought out manner; the right output must be developed while ensuring that each output element is designed so that people will find the system easy to use and effective. When analysts design computer output, they should identify the specific output that is needed to meet the requirements.

2. Select methods for presenting information.


3. Create document, report, or other formats that contain information produced.
5.5 SYSTEM STUDY
5.5.1 FEASIBILITY STUDY

The feasibility of the project is analyzed in this phase and a business proposal is put forth with a very general plan for the project and some cost estimates. During system analysis, the feasibility study of the proposed system is carried out. This is to ensure that the proposed system is not a burden to the company. For feasibility analysis, some understanding of the major requirements for the system is essential. The three key considerations involved in the feasibility analysis are:
● ECONOMICAL FEASIBILITY
● TECHNICAL FEASIBILITY
● SOCIAL FEASIBILITY

5.5.2 ECONOMICAL FEASIBILITY


This study is carried out to check the economic impact that the system will have on the organization. The amount of funds that the company can pour into the research and development of the system is limited. The expenditures must be justified. Thus the developed system is well within the budget, and this was achieved because most of the technologies used are freely available. Only the customized products had to be purchased.

5.5.3 TECHNICAL FEASIBILITY


This study is carried out to check the technical feasibility, that is, the technical requirements of the system. Any system developed must not place a high demand on the available technical resources, as this would lead to high demands being placed on the client. The developed system must have modest requirements, as only minimal or no changes are required for implementing this system.

5.6 SOCIAL FEASIBILITY


This aspect of the study checks the level of acceptance of the system by the user. This includes the process of training the user to use the system efficiently. The user must not feel threatened by the system, but must instead accept it as a necessity. The level of acceptance by the users solely depends on the methods that are employed to educate the user about the system and to make him familiar with it. His level of confidence must be raised so that he is also able to make constructive criticism, which is welcomed, as he is the final user of the system.

5.7 SYSTEM TESTING


The purpose of testing is to discover errors. Testing is the process of trying to discover every conceivable fault or weakness in a work product. It provides a way to check the functionality of components, sub-assemblies, assemblies and/or a finished product. It is the process of exercising software with the intent of ensuring that the software system meets its requirements and user expectations and does not fail in an unacceptable manner. There are various types of tests, and each test type addresses a specific testing requirement.

5.8 TYPES OF TESTS

5.8.1 Unit testing


Unit testing involves the design of test cases that validate that the internal program logic is functioning properly, and that program inputs produce valid outputs. All decision branches and internal code flow should be validated. It is the testing of individual software units of the application; it is done after the completion of an individual unit and before integration. This is structural testing that relies on knowledge of the unit's construction and is invasive. Unit tests perform basic tests at component level and test a specific business process, application, and/or system configuration. Unit tests ensure that each unique path of a business process performs accurately to the documented specifications and contains clearly defined inputs and expected results.
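As an illustration, a unit test for the data preparation step could be written with pytest; the helper function and its behaviour are assumed for the example and are not taken from the project code.

import pandas as pd

def prepare(df, keep_columns):
    # Retain the selected columns and drop rows with missing values (as in Section 5.2.3)
    return df[keep_columns].dropna().reset_index(drop=True)

def test_prepare_drops_unwanted_columns_and_missing_rows():
    df = pd.DataFrame({
        "Crop": ["Rice", "Wheat", None],
        "Area": [10.0, None, 5.0],
        "Production": [30.0, 12.0, 8.0],
        "Unused": [1, 2, 3],
    })
    out = prepare(df, ["Crop", "Area", "Production"])
    assert list(out.columns) == ["Crop", "Area", "Production"]  # unwanted column removed
    assert len(out) == 1                                        # rows with missing values dropped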


5.8.2 Integration testing


Integration tests are designed to test integrated software components to determine if they run as one program. Testing is event driven and is more concerned with the basic outcome of screens or fields. Integration tests demonstrate that although the components were individually satisfactory, as shown by successful unit testing, the combination of components is correct and consistent. Integration testing is specifically aimed at exposing the problems that arise from the combination of components.

5.8.3 Functional test

Functional tests provide systematic demonstrations that the functions tested are available as specified by the business and technical requirements, system documentation, and user manuals.

Functional testing is centered on the following items:

● Valid Input: identified classes of valid input must be accepted.
● Invalid Input: identified classes of invalid input must be rejected.
● Functions: identified functions must be exercised.
● Output: identified classes of application outputs must be exercised.
● Systems/Procedures: interfacing systems or procedures must be invoked.

Organization and preparation of functional tests is focused on requirements, key functions, or special test cases. In addition, systematic coverage pertaining to identifying business process flows, data fields, predefined processes, and successive processes must be considered for testing. Before functional testing is complete, additional tests are identified and the effective value of current tests is determined.

5.8.4 System Test

System testing ensures that the entire integrated software system meets requirements.
It tests a configuration to ensure known and predictable results. An example of system
testing is the configuration-oriented system integration test. System testing is based on
process descriptions and flows, emphasizing pre-driven process links and integration
points.



5.8.5 White Box Testing

White Box Testing is testing in which the software tester has knowledge of the inner workings, structure and language of the software, or at least its purpose. It is used to test areas that cannot be reached from a black box level.

5.8.6 Black Box Testing

Black Box Testing is testing the software without any knowledge of the inner workings, structure or language of the module being tested. Black box tests, as most other kinds of tests, must be written from a definitive source document, such as a specification or requirements document. It is testing in which the software under test is treated as a black box: you cannot "see" into it. The test provides inputs and responds to outputs without considering how the software works.
5.8.7 Test strategy and approach

Field testing will be performed manually and functional tests will be written in detail.

5.8.8 Test objectives


● All field entries must work properly.

● Pages must be activated from the identified link.

● The entry screen, messages and responses must not be delayed.

5.8.9 Features to be tested


● Verify that the entries are of the correct format

● No duplicate entries should be allowed

● All links should take the user to the correct page

5.8.10 Integration Testing:


Software integration testing is the incremental integration testing of two or more integrated software components on a single platform to produce failures caused by interface defects.
The task of the integration test is to check that components or software applications, e.g. components in a software system or, one step up, software applications at the company level, interact without error.

5.8.11 Test Results:


All the test cases mentioned above passed successfully. No defects encountered.

5.8.12 Acceptance Testing

User Acceptance Testing is a critical phase of any project and requires significant
participation by the end user. It also ensures that the system meets the functional
requirements.

5.9 RESULTS

Fig. 17 Front Page

Fig. 18 Upload Sheet

Fig. 19 Pre-processing

Fig. 20 Preprocessed Dataset

Fig. 21 RNN Prediction Accuracy

Fig. 22 LSTM Accuracy

Fig. 23 Feed Forward Neural Network Prediction

Fig. 24 Accuracy Comparison Graph

Fig. 25 Test Data for Disease Detection

Fig. 26 Disease Detection

Fig. 27 Top 6 State Rainfall and Crop Yield

Fig. 28 Dataset for Crop Yield Prediction
Chapter 6

6.1 CONCLUSION AND LIMITATIONS

6.2 CONCLUSION:
When we applied stacked regression, the results improved considerably compared to when those models were applied individually. The output shown in the figures is currently a web application; our future work would be to build a mobile application that farmers can use and to convert the whole system into their regional languages.

6.3 LIMITATIONS:
6.3.1 Data-Related Issues:
● Availability and Access: Access to high-quality, granular, and comprehensive
agricultural data is often limited. Data privacy issues, proprietary restrictions, and
logistical challenges in data collection can impede access.
● Quality and Granularity: The accuracy and resolution of the data (e.g., weather
data, soil quality data, historical yield data) can significantly impact model
outcomes. Poor quality or coarse data can lead to inaccurate predictions.
● Temporal Coverage: Longitudinal data covering many growing seasons is
crucial for capturing variability in crop yields due to changes in climate, pest
infestations, and other factors. Often, such long-term data sets are unavailable.

6.4 Model Complexity and Selection:


● Overfitting vs. Underfitting: Striking a balance between a model that is overly
complex and one that is too simple is challenging. Overfitting leads to great
performance on training data but poor generalization to new data. Underfitting
occurs when the model cannot capture underlying patterns effectively.
● Choice of Algorithms: The selection of inappropriate modeling techniques for
the data and the task can limit the effectiveness of predictions. Each algorithm
has its strengths and weaknesses depending on the nature of the data and the specific prediction task.

6.5 Scalability and Generalization:


● Scalability Issues: Models that perform well in controlled test conditions or
small datasets may not scale effectively to larger, more diverse datasets.
● Generalization Across Locations and Crops: Models trained on data from
specific regions or specific types of crops may not generalize well to other
regions or crop types due to different environmental and genetic factors.

6.6 Environmental and Biological Factors:


● Complex Interactions: Environmental factors such as climate change, weather
variability, soil health, and pest dynamics interact in complex ways that are
difficult to model accurately.
● Unpredictable Events: Extreme weather events like droughts, floods, and
unexpected pest outbreaks can dramatically affect crop yields and are challenging
to predict with high accuracy.

6.7 Technological and Practical Implementation:


● Integration with Existing Systems: Integrating predictive models into existing
agricultural management systems can be challenging due to compatibility and
operational issues.
● Cost of Implementation: The cost of deploying advanced machine learning
models (including the necessary sensors and data processing infrastructure) can
be prohibitive for small-scale farmers or in developing countries.

6.8 Ethical and Social Considerations:


● Bias in AI: Machine learning models can perpetuate or even exacerbate biases if
the training data is biased. This can lead to unfair or ineffective predictions for
certain groups or regions.
● Impact on Labor: Automating predictions and associated agricultural decisions could have socioeconomic impacts, including job displacement.

Discussing these limitations not only highlights the challenges faced but also paves the way for suggesting potential improvements and future research directions. Addressing these limitations thoroughly provides a balanced view of what machine learning can and cannot currently achieve in the field of crop yield prediction.

Chapter 7
7.1 FUTURE SCOPE
The future scope of crop yield prediction using machine learning is crucial for understanding the potential evolution and impact of this technology in agriculture. This chapter outlines the opportunities for advancement and expansion, along with the broader impacts on agriculture, economics, and society. Some key areas are highlighted below:

7.1.1 Advancements in Machine Learning Techniques


● Computational Learning: As computing power increases and more
sophisticated algorithms are developed, computational learning could provide
significant improvements in modeling complex interactions within agricultural
data.
● Hybrid Models: Combining traditional agricultural models with machine
learning approaches could enhance the accuracy and relevance of predictions by
leveraging the strengths of both methodologies.
● Reinforcement Learning: This could be used to not only predict yields but also
to optimize agricultural practices in real-time, adapting to changing conditions
and learning from new data.

7.2 Enhanced Data Collection and Integration


● Remote Sensing Technology: Advances in satellite imagery, drones, and IoT
devices could improve data collection, providing real-time, high-resolution data
on crop health, soil conditions, and environmental factors.
● Big Data: Leveraging big data technologies to integrate diverse data sources,
including weather patterns, market demands, and genetic information on crops,
can lead to more robust predictive models.

7.3 Broader Application Scope


● Diverse Geographic and Crop Types: Expanding the application of machine learning models to a wider range of geographic areas and different types of crops, including those less commonly studied, can help improve food security globally.
● Precision Agriculture: Machine learning can play a crucial role in precision
agriculture, enabling farmers to apply the exact amount of resources needed (like
water, nutrients, and pesticides), thus optimizing input usage and minimizing
environmental impact.

7.4 Addressing Global Challenges


● Climate Change: Machine learning models can help predict and mitigate the
impacts of climate change on agriculture by modeling complex scenarios
involving weather extremes and crop adaptability.
● Food Security: Improved yield predictions can directly contribute to better food
supply planning and management, crucial for addressing food security issues,
especially in vulnerable regions.

7.5 Economic Impact and Commercial Viability


● Cost Reduction: As technology advances, the cost of relevant technologies (e.g.,
sensors, AI computation) is likely to decrease, making these innovations more
accessible to a broader range of farmers, including smallholders and those in
developing countries.
● New Markets and Business Models: Machine learning in agriculture could lead
to new business models, such as yield-as-a-service, where companies offer yield
predictions as a service to farmers, or data-driven agricultural consulting.

7.6 Societal and Ethical Implications


● Inclusivity: Ensuring that the benefits of machine learning in agriculture are
accessible to all farmers, including those in low-income countries, is crucial.
● Data Privacy and Security: Developing secure systems to protect farmers' data
and ensuring that the use of such technologies adheres to ethical guidelines is
essential.

7.7 Regulatory and Policy Development
● Standards and Guidelines: Establishing international standards and guidelines
for the use of AI in agriculture to ensure safety, reliability, and fairness.
● Supportive Policies: Encouraging policies that support research and deployment
of AI technologies in agriculture, including subsidies, grants, and educational
programs to train farmers and agronomists in high-tech agricultural techniques.

REFERENCES

[1] "data.gov.in." [Online]. Available: https://data.gov.in/

[2] Ananthara, M. G., Arunkumar, T., & Hemavathy, R. (2013, February). CRY—an improved crop yield prediction model using bee hive clustering approach for agricultural data sets. In 2013 International Conference on Pattern Recognition, Informatics and Mobile Engineering (pp. 473-478). IEEE.

[3] Awan, A. M., & Sap, M. N. M. (2006, April). An intelligent system based on kernel methods for crop yield prediction. In Pacific-Asia Conference on Knowledge Discovery and Data Mining (pp. 841-846). Springer, Berlin, Heidelberg.

[4] Bang, S., Bishnoi, R., Chauhan, A. S., Dixit, A. K., & Chawla, I. (2019, August). Fuzzy Logic based Crop Yield Prediction using Temperature and Rainfall parameters predicted through ARMA, SARIMA, and ARMAX models. In 2019 Twelfth International Conference on Contemporary Computing (IC3) (pp. 1-6). IEEE.

[5] Bhosale, S. V., Thombare, R. A., Dhemey, P. G., & Chaudhari, A. N. (2018, August). Crop Yield Prediction Using Data Analytics and Hybrid Approach. In 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA) (pp. 1-5). IEEE.

[6] Gandge, Y. (2017, December). A study on various data mining techniques for crop yield prediction. In 2017 International Conference on Electrical, Electronics, Communication, Computer, and Optimization Techniques (ICEECCOT) (pp. 420-423). IEEE.

[7] Gandhi, N., Petkar, O., & Armstrong, L. J. (2016, July). Rice crop yield prediction using artificial neural networks. In 2016 IEEE Technological Innovations in ICT for Agriculture and Rural Development (TIAR) (pp. 105-110). IEEE.

[8] Gandhi, N., Armstrong, L. J., Petkar, O., & Tripathy, A. K. (2016, July). Rice crop yield prediction in India using support vector machines. In 2016 13th International Joint Conference on Computer Science and Software Engineering (JCSSE) (pp. 1-5). IEEE.

[9] Gandhi, N., Armstrong, L. J., & Petkar, O. (2016, July). Proposed decision support system (DSS) for Indian rice crop yield prediction. In 2016 IEEE Technological Innovations in ICT for Agriculture and Rural Development (TIAR) (pp. 13-18). IEEE.

[10] Islam, T., Chisty, T. A., & Chakrabarty, A. (2018, December). A Deep Neural Network Approach for Crop Selection and Yield Prediction in Bangladesh. In 2018 IEEE Region 10 Humanitarian Technology Conference (R10-HTC) (pp. 1-6). IEEE.

[11] Jaikla, R., Auephanwiriyakul, S., & Jintrawet, A. (2008, May). Rice yield prediction using a support vector regression method. In 2008 5th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (Vol. 1, pp. 29-32). IEEE.

[12] Kadir, M. K. A., Ayob, M. Z., & Miniappan, N. (2014, August). Wheat yield prediction: Artificial neural network based approach. In 2014 4th International Conference on Engineering Technology and Technopreneuship (ICE2T) (pp. 161-165). IEEE.

[13] Manjula, A., & Narsimha, G. (2015, January). XCYPF: A flexible and extensible framework for agricultural Crop Yield Prediction. In 2015 IEEE 9th International Conference on Intelligent Systems and Control (ISCO) (pp. 1-5). IEEE.

[14] Mariappan, A. K., & Das, J. A. B. (2017, April). A paradigm for rice yield prediction in Tamilnadu. In 2017 IEEE Technological Innovations in ICT for Agriculture and Rural Development (TIAR) (pp. 18-21). IEEE.

[15] Paul, M., Vishwakarma, S. K., & Verma, A. (2015, December). Analysis of soil behaviour and prediction of crop yield using data mining approach. In 2015 International Conference on Computational Intelligence and Communication Networks (CICN) (pp. 766-771). IEEE.

[16] Shah, A., Dubey, A., Hemnani, V., Gala, D., & Kalbande, D. R. (2018). Smart Farming System: Crop Yield Prediction Using Regression Techniques. In Proceedings of International Conference on Wireless Communication (pp. 49-56). Springer, Singapore.

[17] Ahamed, A. M. S., Mahmood, N. T., Hossain, N., Kabir, M. T., Das, K., Rahman, F., & Rahman, R. M. (2015, June). Applying data mining techniques to predict annual yield of major crops and recommend planting different crops in different districts in Bangladesh. In 2015 IEEE/ACIS 16th International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD) (pp. 1-6). IEEE.

[18] Shastry, A., Sanjay, H. A., & Hegde, M. (2015, June). A parameter based ANFIS model for crop yield prediction. In 2015 IEEE International Advance Computing Conference (IACC) (pp. 253-257). IEEE.

[19] Sujatha, R., & Isakki, P. (2016, January). A study on crop yield forecasting using classification techniques. In 2016 International Conference on Computing Technologies and Intelligent Data Engineering (ICCTIDE'16) (pp. 1-4). IEEE.

[20] Suresh, A., Kumar, P. G., & Ramalatha, M. (2018, October). Prediction of major crop yields of Tamilnadu using K-means and Modified KNN. In 2018 3rd International Conference on Communication and Electronics Systems (ICCES) (pp. 88-93). IEEE.

[21] Veenadhari, S., Misra, B., & Singh, C. D. (2014, January). Machine learning approach for forecasting crop yield based on climatic parameters. In 2014 International Conference on Computer Communication and Informatics (pp. 1-5). IEEE.
