Crop Yield Prediction Using Regression Model

This research paper discusses the use of regression models, particularly linear regression, for predicting crop yield based on various environmental factors such as temperature and rainfall. The study utilizes machine learning techniques to analyze agricultural data and aims to assist farmers in making informed decisions to optimize crop production. The findings indicate that the model achieved a 75% accuracy in predicting rice crop yield in India, demonstrating the effectiveness of data-driven approaches in agriculture.


Crop Yield Prediction using Regression Model

Article in International Journal of Innovative Technology and Exploring Engineering · August 2020
DOI: 10.35940/ijitee.J7491.0891020



International Journal of Innovative Technology and Exploring Engineering (IJITEE)
ISSN: 2278-3075 (Online), Volume-9 Issue-10, August 2020

Crop Yield Prediction using Regression Model


Shikha Ujjainia, Pratima Gautam, S. Veenadhari

Abstract: This research examines how crop production depends on various physical conditions, and it focuses on predicting crop yield with a regression model. Machine learning is an emerging research area in agriculture, particularly in crop yield analysis and prediction. Some data are too complex for people to decode by hand; machine learning strategies can be applied in such cases to surface the valuable underlying patterns automatically. Because machine learning exposes knowledge and patterns that would otherwise go unseen, it can support complex decision-making and the prediction of future events. As early in the growing season as possible, a farmer wants to know how much yield to expect. As in many other fields, the amount of agricultural data is increasing daily. This paper aims to predict crop yield from a collected agricultural dataset. A regression analysis model is used to test the accuracy and effectiveness of predictions of rice crop yield in India. Linear regression is used to establish a relationship between environmental variables such as temperature and rainfall and the crop yield. It is important to estimate the likely production rate of a crop, and farmers will benefit from the result of this prediction. Because farmers' finances are tied to yield production, the research is intended to help them avoid losses. The accuracy of the prediction obtained with the regression model is also reported in this paper.

Keywords: Machine learning, Regression model, Linear regression, Yield prediction.

Revised Manuscript Received on August 30, 2020.
* Correspondence Author
Shikha Ujjainia*, Department of Computer Science and Application, Rabindranath Tagore University, Bhopal, India. E-mail: [email protected]
Pratima Gautam, Dean (CSIT), Rabindranath Tagore University, Bhopal, India. E-mail: [email protected]
S. Veenadhari, Associate Professor (CSE), Rabindranath Tagore University, Bhopal, India. E-mail: [email protected]
© The Authors. Published by Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

I. INTRODUCTION
Agriculture and its related sectors are undoubtedly the largest livelihood provider in India, especially in rural areas. The main challenge in agriculture is to raise grain productivity per unit of land. Emerging technologies can help the farmer predict crop yield, or forecast the next year's production, under changing agro-climatic conditions. Crop productivity is strongly influenced by weather conditions, so accurate yield prediction is a major problem that must be resolved. The Internet of Things (IoT) is useful when real-time data must be collected: sensors such as temperature sensors, humidity sensors, soil moisture sensors, and location sensors are placed in the field to collect the data. This data is then used in predictions such as crop yield prediction, soil fertility measurement, and insect detection by means of machine learning algorithms. The core enabling technologies of machine learning are big data and high-performance computing, which create opportunities for data-intensive science in the multi-disciplinary agro-technology sector.
Among various definitions, ML is described as the scientific field that enables machines to learn without being explicitly programmed [1] [2]. It refers to the capacity of a machine to predict an outcome without being expressly programmed for it. There is a very large number of ML tools to choose from. An application expert needs to make a reasonable choice of a particular ML technique to deploy for his or her particular problem. We advocate that the specialist down-select from the many ML options depending on the type and amount of available data and on the problem formulation [3]. From the perspective of crop yield prediction, we identify the relationship between environmental variables and crop yield production. Furthermore, the data preprocessing step plays a crucial role in successful decision-making. Machine learning methods that are widely used for prediction include boosting methods (for example RGF, GBDT, and AdaBoost), regression trees (for example ID3, C4.5, and the M5-prime regression tree), linear regression, random forest, support vector machine, k-nearest neighbors, and artificial neural networks. Among these prediction methods, boosting techniques (for example GBDT and RGF) are still untouched in crop yield prediction [4]. In this research, the linear regression technique is studied and the corresponding analysis is presented.

Fig. 1. Machine learning process

Prediction or forecasting with machine learning algorithms follows the basic steps presented in Fig. 1. In the data collection step, data can be obtained from various sources. Next, the data is passed through preprocessing methods, in which categorical and null values are handled. An appropriate algorithm is then selected to build the model. The data is further divided into training and testing sets to test the accuracy of the model. Finally, the results are presented through flow charts, CSV or Excel files, or other visualization tools.
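The preprocessing step, in which null and categorical values are handled, can be sketched as follows. This is a minimal illustration in plain Python and not the authors' code: the `season` column, the field names, and the sample values are hypothetical, and mean imputation and one-hot encoding are assumed here as common choices that the paper does not specify.

```python
from statistics import mean

# Hypothetical records; field names and values are illustrative only.
rows = [
    {"season": "kharif", "rainfall_mm": 1100.0, "temperature_c": 27.5},
    {"season": "rabi", "rainfall_mm": None, "temperature_c": 22.0},
    {"season": "kharif", "rainfall_mm": 900.0, "temperature_c": None},
]


def fill_missing(records, column):
    """Replace None in a numeric column with the mean of the observed values."""
    observed = [r[column] for r in records if r[column] is not None]
    fill = mean(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in records]


def one_hot(records, column):
    """Replace a categorical column with one 0/1 indicator column per category."""
    categories = sorted({r[column] for r in records})
    encoded = []
    for r in records:
        new = {k: v for k, v in r.items() if k != column}
        for c in categories:
            new[f"{column}_{c}"] = 1 if r[column] == c else 0
        encoded.append(new)
    return encoded


def preprocess(records):
    """Handle null values first, then encode the categorical column."""
    for col in ("rainfall_mm", "temperature_c"):
        records = fill_missing(records, col)
    return one_hot(records, "season")


clean = preprocess(rows)
for record in clean:
    print(record)
```

Each missing value is replaced by the mean of the values observed in the same column, and `season` becomes two indicator columns, so every remaining field is numeric and can be passed to a regression model.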

Retrieval Number: J74910891020/2020©BEIESP
DOI: 10.35940/ijitee.J7491.0891020
Journal Website: www.ijitee.org
Published By: Blue Eyes Intelligence Engineering and Sciences Publication

II. OBJECTIVES
The main objective of this research is to predict crop yield with the help of linear regression. This will help farmers anticipate the future level of production and give them ideas for avoiding losses. The specific objectives are:
i. To determine the relation between the production level of the yield and temperature, including the various effects of temperature and how it controls crop yield.
ii. To find the effect of rain on farming and how it affects crop yield prediction as well as production.
iii. To find the relation between the cultivated area and the production of the crops.
iv. To determine the effect of the groundwater level on production.
v. To assess the efficiency of the prediction model and how it can help the process of farming, by examining the differences between the predicted values and the values actually observed in real life.

III. LITERATURE REVIEW
In 2011, Zaefizadeh and a group of other researchers studied data mining approaches and discussed how crop yield prediction is done with data mining technologies. Forty genotypes were planted in Ardabil. Artificial neural networks and multiple linear regression were applied to predict grain yield [5]. The artificial neural network (ANN) used one hidden layer with fifteen neurons [6]. Zaefizadeh found that multiple linear regression outperformed artificial neural networks.
Research by Sanchez in 2014 compared linear and nonlinear techniques for crop yield prediction. By performing a complete algorithm and percentage-split validation, the techniques were compared and the most useful attribute subset for each technique was found. Performance was calculated on test datasets consisting of unseen data. Sanchez assessed several methods of data-driven crop yield prediction, and the work can be extended to a much larger number of crop datasets and techniques. Zhang, in 2010, presented a linear regression model in which ordinary least squares estimation was used to predict crop yield. According to this research, precipitation, in addition to temperature, contributed to the yield of corn. In 2009, Zaw and Naing discussed a polynomial regression model in the context of crop yield prediction [7]. That research was based on Myanmar and predicted the rainfall of that region.

IV. METHOD USED
The objective of this research is to show the impact of weather parameters and soil parameters on yield production in order to improve crop yields, which will benefit farmers. The linear regression model is built in Python.
A. Data collection
For the study, the statistical data is collected from Kaggle.com. The dataset consists of historical data for rice.
The attributes considered are as follows:
• Area (hectare)
• Temperature (degree Celsius)
• Rainfall (mm)
• Groundwater level (m)
• Soil pH
• Potassium (kg/hectare)
• Magnesium (kg/hectare)
• Sodium (kg/hectare)

B. Data preprocessing and feature extraction
Preprocessing refers to the transformations applied to the data before it is fed to the algorithm. Data preprocessing is a technique used to convert raw data into a clean dataset. When data is gathered from different sources, it is collected in a raw format that is not suitable for analysis. Data preprocessing is therefore required to obtain good results from the applied machine learning model.
Feature extraction is a broad procedure in which one attempts to construct a transformation of the input space onto a low-dimensional subspace that preserves most of the relevant information [8] [9]. Feature extraction and selection methods are used separately or in combination to improve performance, for example estimated accuracy, visualization, and comprehensibility of the learned knowledge [10]. In general, features can be categorized as relevant, irrelevant, or redundant. In the feature selection process, a subset of the available features is chosen for the learning algorithm. The best subset is the one with the fewest dimensions that contribute most to learning accuracy [11] [9].

C. Regression Analysis
Regression analysis is a predictive modeling technique that analyzes the relationship between a dependent (target) variable and one or more independent (predictor) variables. It includes several models, such as linear, multiple linear, and non-linear regression. The most common models are simple linear regression and multiple linear regression. Non-linear regression analysis is commonly used for more complicated datasets in which the dependent and independent variables show a nonlinear relationship.

D. Linear Regression
Linear regression is a procedure used to analyze a response variable Y that changes with the value of an explanatory variable X. Estimating the value of the response variable from a given value of the explanatory variable is referred to as prediction. The relationship between the two variables, the dependent variable (Y) and the independent variable (X), is described by a best-fit straight line, commonly called the regression line [12]. The regression equation is shown below,


Y = a + (b*X) + e

Where,
• Y – Dependent variable
• X – Independent variable
• a – Intercept
• b – Slope
• e – Residual (error)
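As a rough illustration of how the intercept a and slope b in the equation above can be estimated, the sketch below fits a line by ordinary least squares. The least-squares criterion is an assumption (the paper does not state how its model was fitted), and the temperature and yield values are made up for the example.

```python
def fit_simple_linear(x, y):
    """Return (a, b) minimising the sum of squared residuals of Y = a + b*X + e."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Slope: covariance of X and Y divided by the variance of X.
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    a = y_mean - b * x_mean
    return a, b


# Illustrative values only (not taken from the paper's dataset).
temperature = [20.0, 22.0, 24.0, 26.0, 28.0]   # X, degree Celsius
rice_yield = [2.0, 2.5, 2.9, 3.6, 4.0]         # Y, arbitrary yield units

a, b = fit_simple_linear(temperature, rice_yield)

# Residuals e = Y - (a + b*X); for a least-squares fit they sum to zero.
residuals = [yi - (a + b * xi) for xi, yi in zip(temperature, rice_yield)]

print(f"intercept a = {a:.3f}, slope b = {b:.3f}")
```

For these five points the fit gives a slope of 0.255 and an intercept of -3.12, so each additional degree is associated with about 0.255 more yield units within the sampled range.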

Linear regression is very sensitive to outliers, which can greatly affect the regression line and the predicted values. One main reason for selecting linear regression is that the parameters obtained are continuous in nature, and linear regression works best with continuous variables.

Fig. 2. Relation between temperature and yield production.

If there is more than one independent input variable, multiple linear regression can be used. The numerical representation of multiple linear regression is:

Y = a + (b*X1) + (c*X2) + (d*X3) + e

Where,
• Y - Dependent variable
• X1, X2, X3 - Independent variables
• a - Intercept
• b, c, d – Slopes
• e – Residual (error)
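The coefficients a, b, c, and d of the multiple regression equation can be estimated in the same least-squares sense by solving the normal equations. The sketch below is one possible way to do this in plain Python, under the assumption of a least-squares fit; the helper names and the sample values for X1, X2, and X3 are hypothetical, and Y is generated without noise so that the known coefficients are recovered.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_multiple_linear(X, y):
    """Return [a, b, c, d, ...] for Y = a + b*X1 + c*X2 + d*X3 + ... by least squares."""
    Z = [[1.0] + list(row) for row in X]  # leading 1 carries the intercept a
    k = len(Z[0])
    # Normal equations: (Z^T Z) coef = Z^T y
    ZtZ = [[sum(z[i] * z[j] for z in Z) for j in range(k)] for i in range(k)]
    Zty = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(k)]
    return solve(ZtZ, Zty)


# Illustrative inputs: each row is (X1, X2, X3).
X = [[1, 2, 0], [2, 1, 1], [3, 4, 2], [4, 3, 5], [5, 6, 3], [6, 5, 7]]
# Y generated from known coefficients a=1.0, b=0.5, c=-0.25, d=2.0 with e = 0.
y = [1.0 + 0.5 * x1 - 0.25 * x2 + 2.0 * x3 for x1, x2, x3 in X]

coefficients = fit_multiple_linear(X, y)
print([round(c, 6) for c in coefficients])
```

Solving the normal equations directly is adequate for a handful of well-scaled inputs; with many or strongly correlated predictors, a library routine based on a more stable decomposition would be the safer choice.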
E. Crop prediction using regression method
Weather data (temperature, rainfall) and crop data are taken as the input parameters, and crop yield production as the output parameter.
Step 1: Collect the data and transform the raw data into usable information. If the raw data is not in a form the model can work with, it must be converted to a suitably designed format before it is applied to the model, in order to obtain suitable results.
Step 2: Divide the dataset into two groups, a training dataset and a testing dataset. The training dataset is the subset used to train the model, whereas the test dataset is used to test the trained model. The training set holds the larger share of the data so that the model can learn from most instances; about 70% of the samples are placed in the training set. The remaining test dataset is used to check how the model is performing.
Step 3: Apply the linear regression algorithm to the training dataset.
Step 4: Calculate model performance by evaluating R2 and RMSE (Root Mean Squared Error).
Step 5: Apply the trained model to the test dataset and again calculate R2 and RMSE to measure the performance of the model. The model with high accuracy and R2 values and a low RMSE value is considered the best model for crop yield prediction.
The following figures relate rice crop production to the various weather and crop parameters. Each independent variable is plotted against yield production:

Fig. 3. Relation between rain and yield production.
Fig. 4. Relation between area and yield production.
Fig. 5. Relation between groundwater level and yield production.

The ratio of 75% and 25% is fixed for the training and testing datasets respectively.
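Steps 2 to 5 can be sketched end to end as follows. This is a simplified, single-variable illustration in plain Python and not the authors' implementation: the rainfall and yield values are synthetic, the random seed and helper names are arbitrary, and R2 is computed as the coefficient of determination, 1 - SS_res/SS_tot, which is an assumed (though common) definition. The 75%/25% split follows the ratio stated above.

```python
import math
import random


def fit_line(x, y):
    """Least-squares intercept a and slope b for y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((t - p) ** 2 for t, p in zip(actual, predicted))
    ss_tot = sum((t - mean_actual) ** 2 for t in actual)
    return 1.0 - ss_res / ss_tot


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(actual, predicted)) / len(actual))


# Step 1: synthetic data standing in for the collected dataset (assumption).
rng = random.Random(42)
rainfall = [800.0 + 5.0 * i for i in range(100)]                        # mm
crop_yield = [0.5 + 0.003 * r + rng.uniform(-0.1, 0.1) for r in rainfall]

# Step 2: shuffle, then split 75% training / 25% testing.
indices = list(range(len(rainfall)))
rng.shuffle(indices)
cut = int(0.75 * len(indices))
train_idx, test_idx = indices[:cut], indices[cut:]
x_train = [rainfall[i] for i in train_idx]
y_train = [crop_yield[i] for i in train_idx]
x_test = [rainfall[i] for i in test_idx]
y_test = [crop_yield[i] for i in test_idx]

# Step 3: fit the regression line on the training data only.
a, b = fit_line(x_train, y_train)


def predict(xs):
    return [a + b * x for x in xs]


# Step 4: performance on the training data.
r2_train = r2_score(y_train, predict(x_train))
rmse_train = rmse(y_train, predict(x_train))

# Step 5: performance on the held-out test data.
r2_test = r2_score(y_test, predict(x_test))
rmse_test = rmse(y_test, predict(x_test))

print(f"train: R2={r2_train:.3f} RMSE={rmse_train:.3f}")
print(f"test:  R2={r2_test:.3f} RMSE={rmse_test:.3f}")
```

Comparing the training and test metrics is the point of the split: a test R2 far below the training R2 would indicate that the model does not generalise to unseen data.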


This model achieved an R2 of 0.75, that is, it explains about 75% of the variation in yield (reported here as 75% accuracy). R2 is the square of the correlation between the predicted target values and the actual target values 'y', and it ranges from 0 to 1. An R2 of exactly 1 means that the dependent variable is predicted exactly from the independent variables, which never happens in practice, whereas an R2 of 0 means that the dependent variable cannot be predicted from the independent variables. So it is always good for the model to have an R2 close to 1.

Fig. 6. Actual versus predicted values of the model.

F. Findings
Several important factors that are associated with yield production and that drive the quality and quantity of crops were measured in this research. The driving factors and the prediction methods were also discussed. Some of the important findings of this research are:
i. The yield production level depends on atmospheric temperature. Crop farming is affected by temperature fluctuations at different growth stages. Temperature can control the production level, and how it affects crop production was also examined in this research. Temperature was found to be one of the main driving factors in crop yield prediction.
ii. Crop production also depends on the amount of rain. Water is essential for farming, and crop yield prediction must account for its contribution. Rain is important, but both the amount of rain and its timing matter for a good level of crop production. This research measured how rainfall influences yield production.
iii. Yield production is tied to the area in which the crops are planted. The production level changes in direct proportion to the area: a larger area can increase the production level of crops. This research clarified this factor and also found that production cost and maintenance effort increase along with the area.
iv. The groundwater level is also a driving factor in yield production. If the groundwater level goes down, the crops will not grow properly and the production level will be affected. This research shows that a sufficient groundwater level is needed for good crop yield.
v. The relation between the production quantity predicted from the driving factors and the actual production in real-life conditions was determined. The predicted and actual figures did not differ enough to render the research irrelevant; rather, the predictions were close to the real outcomes. The main finding of this research is that this prediction procedure is very helpful for farming and for avoiding losses to farmers.

V. CONCLUSION
Food plays a vital role in everyone's survival. Farmers face many difficulties due to various unpredictable factors. To overcome the unpredictability of crop production and other agriculture-related problems, prediction models are used. Here, the regression model is used as a tool to predict crop yield.
In this work, linear regression analysis is used to establish a relationship between the independent parameters explained above and their effects on rice yield, with the aim of increasing crop productivity through accurate model predictions. The dependency of crop production on parameters such as temperature and rain is also determined in this paper, and the crop yield prediction is based on that result. This research has shown the accuracy of the crop yield prediction by comparing the predicted values with the actual production quantity. The regression model offers farmers a lasting way to address the problem they face.

RECOMMENDATIONS
Several parts of this research need further study to learn more about crop yield prediction through regression analysis. Future analysis can be done with other prediction methods. Support vector regression is one useful method that can be applied in the same scenarios to gather further information for crop yield prediction. Other prediction techniques, such as fuzzy logic and neural networks, can also be used in future research. With these methods, the yield of various crops can be predicted. The correlation between the different predictor variables can also be measured. Since the choice of variables is an important part of the prediction process, it can also be examined in future research.

REFERENCES
1. A. L. Samuel, “Some Studies in Machine Learning Using the Game of Checkers I,” D. N. L. Levy (ed.), Computer Games I. New York, 1959.
2. K. Liakos, P. Busato, M. Dimitrios, S. Pearson, and D. Bochtis, “Machine Learning in Agriculture: A Review,” Sensors, vol. 18, no. 8, pp. 1-29, August 2018.
3. A. Singh, B. Ganapathysubramanian, A. K. Singh, and S. Sarkar, “Machine Learning for High-Throughput Stress Phenotyping in Plants,” Trends in Plant Science, vol. 21, no. 2, pp. 110-124, February 2016.
4. R. Kumar, M. P. Singh, P. Kumar, and J. P. Singh, “Crop Selection Method to Maximize Crop Yield Rate using Machine Learning Technique,” International Conference on Smart Technologies and Management for Computing, Communication, Controls, Energy, and Materials (ICSTM), pp. 138-145, May 2015.


5. D. J. Olive, “Multiple linear regression,” in Linear Regression, pp. 17-83. Springer, Cham, 2017.
6. M. Van Gerven and S. Bohte, “Artificial neural networks as models of neural information processing,” Frontiers in Computational Neuroscience, vol. 11, p. 114, 2017.
7. F. Schönbrodt, “Testing fit patterns with polynomial regression models,” 2016.
8. N. Chumerin and M. Van Hulle, “Comparison of Two Feature Extraction Methods Based on Maximization of Mutual Information,” Proc. IEEE Signal Processing Society Workshop on Machine Learning for Signal Processing, pp. 343-348, 2006.
9. S. Khalid, T. Khalil, and S. Nasreen, “A survey of feature selection and feature extraction techniques in machine learning,” Science and Information Conference, London, pp. 372-378, August 2014.
10. H. Motoda and H. Liu, “Feature selection, extraction and construction,” Sixth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), pp. 67-72, 2002.
11. L. Ladha and T. Deepa, “Feature Selection Methods And Algorithms,” International Journal on Computer Science and Engineering (IJCSE), vol. 3, no. 5, pp. 1787-1797, May 2011.
12. P. Surya and I. L. Aroquiaraj, “Crop Yield Prediction In Agriculture Using Data Mining Predictive Analytic Techniques,” International Journal of Research and Analytical Reviews (IJRAR), vol. 5, no. 4, pp. 783-787, December 2018.

AUTHORS PROFILE
Shikha Ujjainia received a B.Sc. degree in Computer Science from Career College, Bhopal, in 2011 and a Master of Computer Application degree from Samrat Ashok Technological Institute (SATI), Vidisha, in 2014. She is currently pursuing a Ph.D. at Rabindranath Tagore University, Bhopal, Madhya Pradesh. Her research interests include machine learning, data analytics, and data mining. As a research scholar, she has published two research papers in reputed journals. She is actively involved in online workshops, seminars, and conferences on IoT, machine learning, and big data, and has taken part in similar programs organized by the university.

Dr. Pratima Gautam holds a PhD in Computer Application from the Department of Computer Application, Maulana Azad National Institute of Technology. She is currently working as a Professor in the Department of Computer Science & Information Technology, Rabindranath Tagore University, Bhopal, India, and has more than sixteen years of teaching experience. She has authored several research articles in international journals of repute and has presented papers at several international and national conferences. She has been invited as an expert to various national conferences as a paper reviewer and technical program committee member. She has delivered lectures at various institutions and has participated in various training programs and workshops. Her research interests include data mining, machine learning, soft computing, and image processing. She is affiliated with international societies such as IEEE and the ACM Digital Library. She has guided several undergraduate and postgraduate students in their projects and is currently supervising six Ph.D. research scholars. She is also actively involved in institutional activities such as organizing conferences and workshops.

Dr. S. Veenadhari completed her doctoral programme at Mahatma Gandhi Chitrakoot Gramodaya Vishwavidyalaya in 2015, a Master of Computer Applications at Nagarjuna University, Andhra Pradesh, in 1998, and a Master of Technology in Computer Science and Engineering at Makhan Lal Chaturvedi University, Bhopal, in 2007. She has over 15 years of academic and research experience, with 45 research papers published in reputed international and national journals. Six students are pursuing their doctoral programmes and two students have completed their doctoral degrees under her supervision. Her research interests include machine learning, big data analytics, and cloud computing. Her expertise is in agricultural data analytics through machine learning. She has published several book chapters, technical bulletins, project reports, working papers, and training and teaching reference manuals. She has delivered a number of invited talks on national platforms of repute. She has served as an expert member on various committees in different institutions and is a member of professional societies such as IE, CSI, and ISTE.
