Predictive Model for Agriculture

Submitted in partial fulfilment of the requirements for the award of the degree of

Bachelor of Computer Applications

Submitted to:
Mr. Kanhaiya Lal

Submitted by:
Sunveen Kaur - 04324402023
Jiya Basra - 03824402023
Batch 2023-26
BCA E1

Institute of Innovation in Technology & Management


New Delhi - 110058

ACKNOWLEDGEMENT

We would like to express our heartfelt gratitude to all those who have helped us in the successful
completion of our minor project titled “Predictive Model for Agriculture.”

First and foremost, we would like to extend our sincere thanks to our respected guide and
mentor, Mr. Kanhaiya Lal, for his valuable guidance, constant encouragement, and continuous
support throughout the development of this project. His insights into Machine Learning,
Data Analysis, and Web Application Development have been invaluable in helping us
successfully design and implement this system.

We are also deeply thankful to the Department of Computer Applications, Institute of Innovation
in Technology & Management, for providing us with the necessary infrastructure, technical
resources, and an excellent academic environment that enabled us to complete this project
efficiently.

Finally, we would like to express our gratitude to our family and friends for their constant
support, patience, and motivation, which have been the driving force behind the successful
completion of this project.

– Sunveen Kaur and Jiya Basra

(04324402023 & 03824402023)

BCA E1

SELF CERTIFICATE

This is to certify that the dissertation/project report entitled “Predictive Model for Agriculture”
is an authentic work carried out by me in partial fulfilment of the requirements for the award of
the degree of Bachelor of Computer Applications (BCA) under the guidance of Mr. Kanhaiya Lal.

The matter embodied in this project report is based on my original work and has not been
submitted earlier for the award of any degree or diploma to the best of my knowledge and belief.

Signature of the student

Sunveen Kaur

04324402023

SELF CERTIFICATE

This is to certify that the dissertation/project report entitled “Predictive Model for Agriculture”
is an authentic work carried out by me in partial fulfilment of the requirements for the award of
the degree of Bachelor of Computer Applications (BCA) under the guidance of Mr. Kanhaiya Lal.

The matter embodied in this project report is based on my original work and has not been
submitted earlier for the award of any degree or diploma to the best of my knowledge and belief.

Signature of the student

Jiya Basra

03824402023

GUIDE CERTIFICATE

This is to certify that this project entitled “Predictive Model for Agriculture”, submitted in partial
fulfilment of the degree of Bachelor of Computer Applications to the “GURU GOBIND SINGH
INDRAPRASTHA UNIVERSITY” through INSTITUTE OF INNOVATION IN TECHNOLOGY AND
MANAGEMENT by Ms. Sunveen Kaur and Ms. Jiya Basra (Roll Nos. 04324402023 and 03824402023),
is an authentic work carried out by them under my guidance.

The matter embodied in this project work has not been submitted earlier for the award of any degree,
to the best of my knowledge and belief.

Signature of the students Signature of the Guide

SYNOPSIS

1. Name / Title of the Project
“Predictive Model for Agriculture”

2. Statement about the Problem


Agriculture is a key pillar of India’s economy, contributing significantly to national income and
food security. However, the sector faces challenges such as irregular rainfall, unpredictable
weather patterns, poor soil fertility, and lack of data-driven decision support. Farmers often rely
on traditional practices and personal experience, which are not sufficient to ensure consistent
yield in today’s dynamic environment.

Existing agricultural systems lack intelligent tools that can analyze environmental and soil
factors to accurately predict crop productivity. This absence of predictive systems results in
inefficient resource management, uncertainty in yield, and financial instability for farmers.

Hence, there is a growing need for a machine learning–based predictive system that can process
parameters like rainfall, humidity, nitrogen, phosphorus, potassium, and soil pH to estimate the
expected crop yield. Such a model can provide valuable insights that support better planning,
fertilizer optimization, and sustainable farming practices.

3. Why is the Particular Topic Chosen?


The topic “Predictive Model for Agriculture” was chosen because it addresses a real-world
problem faced by millions of farmers — uncertainty in crop yield and lack of technological tools
to predict outcomes.

With the growing accessibility of data and rapid progress in machine learning algorithms, it is
now possible to forecast yield accurately using historical and environmental data. This project
leverages supervised learning and feature selection methods to simplify prediction and deliver
actionable insights through a user-friendly interface.

Developing this model allows students and researchers to apply data science concepts in
agriculture, contributing to precision farming and sustainable agricultural innovation. The system
also serves as a cost-effective solution that can be expanded for large-scale deployment in
agricultural monitoring systems.

4. Objective and Scope of the Project


Objectives:

• To design and develop a machine learning model that predicts crop yield based on
environmental and soil parameters.

• To apply feature selection techniques for identifying the most influential factors affecting
yield.

• To develop a Flask-based web interface that allows users to input data and obtain
predictions instantly.

• To evaluate model performance using metrics like MAE, RMSE, and R² for improved
accuracy.

• To demonstrate the real-world application of AI in supporting farmers and agricultural
planners.

Scope:

• The model focuses on predicting yield using variables such as rainfall, humidity,
nitrogen, phosphorus, potassium, and soil pH.

• The application provides visualization of feature importance to enhance interpretability.

• The system can be extended to handle multiple crops and integrate with IoT devices for
real-time data collection.

• Future enhancements include cloud deployment and recommendation modules for
fertilizer or irrigation planning.

• The platform is designed to be accessible via web browsers for both farmers and
agricultural institutions.

5. Methodology (Summary of the Project)

The development of this project is structured in multiple stages, as summarized below:

1. Requirement Gathering:
Understanding the parameters affecting crop yield and the needs of farmers. Defining
inputs such as rainfall, humidity, nitrogen, phosphorus, potassium, and soil pH.

2. Data Collection & Preprocessing:
Using synthetic and real agricultural data. Cleaning, normalizing, and transforming the
dataset for compatibility with machine learning models.

3. Model Development:
Implementing the Random Forest Regressor algorithm using Python’s scikit-learn library.
Applying SelectKBest feature selection to identify significant variables influencing yield.

4. Model Evaluation:
Evaluating the model using performance metrics such as MAE, RMSE, and R² score to
measure prediction accuracy.

5. Web Interface Design:
Developing a Flask-based interface where users input parameters and receive real-time
predictions displayed clearly on the screen.

6. Testing & Deployment:
Testing the model and web app for accuracy, stability, and usability. Deploying it locally
with the option for future cloud hosting.

6. Hardware & Software to be Used

Hardware Requirements:
• Processor: Intel i3 or above

• RAM: 4GB or higher

• Storage: Minimum 500MB free space

• Device: Desktop / Laptop with Internet access

Software Requirements:

• Frontend: HTML5, CSS3

• Backend: Python (Flask Framework)

• Machine Learning Libraries: scikit-learn, pandas, numpy, joblib

• Database: CSV file / local storage

• IDE: VS Code / PyCharm

• Browser: Google Chrome / Edge

7. Testing Technologies Used

• Unit Testing:
Individual functions such as data input, prediction, and retraining are tested to ensure
proper functionality.

• Integration Testing:
Verifies that the Flask web app, machine learning model, and data flow between frontend
and backend are working together smoothly.

• System Testing:
Tests the complete web application for correctness of prediction results and interface
usability.

• User Acceptance Testing (UAT):
Confirms that the system is user-friendly, stable, and delivers expected outputs.

• Cross-browser Testing:
Ensures that the web app works properly on major browsers (Chrome, Edge, Firefox).

8. Contribution of the Project

This project contributes to the digital transformation of agriculture by introducing AI-driven
decision support for yield forecasting. It helps farmers make data-informed choices, improves
resource utilization, and promotes sustainability.

Students and developers gain practical experience in data preprocessing, model building, and
web deployment, bridging the gap between academic learning and real-world applications.

The system sets the foundation for integrating smart agriculture technologies, encouraging future
innovations in precision farming, climate forecasting, and sustainable food production.

INDEX

S.NO TITLE

1. Chapter 1 – Introduction
• Objectives and Scope of the project
• Theoretical Background
• Definition of problem

2. Chapter 2 – Software Requirement Specification
• SDLC Model Used
• PERT Chart
• Functional and Non-Functional Requirements
• Hardware and Software Used

3. Chapter 3 – System Design
• Block Diagram
• Database Design
• DFD Level 0 and 1
• ERD
• Use Case

4. Chapter 4 – Software Development
• Data Collection
• Data Preprocessing
• I/O Screens

5. Chapter 5 – Testing
• Types of Testing
• Levels of Testing
• Form Testing
• Evaluation Criteria

6. Chapter 6 – Conclusion and Future Work
• Coding & Screenshots of the project
• References

Chapter 1
Introduction

1.1 Objectives and Scope of the Project

Agriculture is one of the most vital sectors of the Indian economy, providing employment and food
security to a large population. However, farmers often face challenges related to climate change,
soil fertility, and lack of data-driven decision-making. To overcome these challenges, there is a
need for intelligent systems that can assist in predicting crop yield and optimizing agricultural
resources.

The main objective of this project, “Predictive Model for Agriculture,” is to design and develop a
machine learning–based system that can predict crop yield based on soil and environmental
parameters such as rainfall, humidity, nitrogen, phosphorus, potassium, and soil pH.
By analyzing these parameters, the system aims to guide farmers in making informed decisions
related to fertilizer management, irrigation planning, and crop selection.

Objectives of the Project:

1. To develop a predictive model using Machine Learning (Random Forest Algorithm) for
crop yield estimation.

2. To implement feature selection techniques (SelectKBest) to identify the most significant
parameters influencing yield.

3. To create a Flask-based web application that provides an interactive and user-friendly
interface for data input and prediction display.

4. To improve agricultural planning by using data-driven predictions rather than traditional
intuition-based methods.

5. To demonstrate how Artificial Intelligence can support sustainable farming practices and
efficient resource utilization.

Scope of the Project:

The scope of this project includes:

• Developing a web-based system that predicts crop yield using agricultural parameters.
• Providing feature importance visualization to show which inputs have the highest impact
on yield.

• Allowing farmers, students, and researchers to use the system for educational or practical
purposes.

• The system can be enhanced with IoT integration for real-time data collection and cloud
deployment for scalability.

• Future developments can include fertilizer recommendations, crop disease prediction, and
weather forecasting integration.

In conclusion, the scope of the project is to bridge the gap between technology and agriculture,
creating a smart and accessible tool that benefits both rural farmers and agricultural analysts.

1.2 Theoretical Background

Machine Learning (ML) is a branch of Artificial Intelligence that enables systems to learn from
data and improve performance without explicit programming. In the context of agriculture, ML
models analyze patterns within environmental and soil data to predict crop yield and other vital
outcomes.

Machine Learning in Agriculture

The use of ML in agriculture has grown significantly due to its ability to handle large datasets and
uncover hidden patterns. Algorithms like Linear Regression, Random Forest, and Neural Networks
have been applied to predict yield, optimize irrigation schedules, and detect plant diseases.
The Random Forest Regressor, used in this project, is an ensemble learning algorithm that builds
multiple decision trees and averages their predictions to improve accuracy and prevent overfitting.
It is ideal for agricultural applications due to its robustness and ability to manage complex data
relationships.
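
The averaging idea described above can be sketched as follows. This is a minimal illustration using scikit-learn's RandomForestRegressor on randomly generated stand-in data; the data, the column order, and the hyperparameters are assumptions for the example and are not the project's actual dataset or settings. The sketch checks that the forest's prediction is the mean of the predictions of its individual trees.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Stand-in data (assumed, not the project's dataset): 200 samples, 6 columns
# standing for rainfall, humidity, nitrogen, phosphorus, potassium, soil pH.
rng = np.random.default_rng(42)
X = rng.uniform(0.0, 1.0, size=(200, 6))
y = 3.0 * X[:, 2] + 2.0 * X[:, 0] + rng.normal(0.0, 0.1, size=200)

# n_estimators is the number of decision trees in the ensemble.
forest = RandomForestRegressor(n_estimators=50, random_state=0)
forest.fit(X, y)

sample = X[:5]
forest_pred = forest.predict(sample)

# Each fitted tree is available in forest.estimators_; averaging their
# individual predictions reproduces the forest's output.
tree_preds = np.array([tree.predict(sample) for tree in forest.estimators_])
mean_of_trees = tree_preds.mean(axis=0)

print(np.allclose(forest_pred, mean_of_trees))
```

Because each tree is trained on a different bootstrap sample, individual trees disagree; the average is what smooths out their individual errors.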

Feature Selection

Feature selection is a process of identifying the most relevant variables that influence the output.
In this project, the SelectKBest method is used to find which environmental and soil parameters
have the greatest impact on crop yield.
This step improves model performance and interpretability by focusing only on the key features
such as rainfall, nitrogen, and soil pH.
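
A minimal sketch of this step is shown below, using scikit-learn's SelectKBest with the f_regression scoring function. The data is synthetic and deliberately constructed so that only three columns influence the target; the column names, the coefficients, and the choice of k = 3 are assumptions made for illustration.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression

feature_names = ["rainfall", "humidity", "nitrogen", "phosphorus", "potassium", "ph"]

# Stand-in data (assumed): the target depends only on nitrogen, rainfall and pH.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(500, 6))
y = 4.0 * X[:, 2] + 3.0 * X[:, 0] + 2.0 * X[:, 5] + rng.normal(0.0, 0.1, size=500)

# Score every feature against the target with a univariate F-test
# and keep the k highest-scoring ones.
selector = SelectKBest(score_func=f_regression, k=3)
X_selected = selector.fit_transform(X, y)

# get_support() returns a boolean mask over the original columns.
selected = [name for name, keep in zip(feature_names, selector.get_support()) if keep]
print(selected)
```

Note that f_regression scores each feature on its own, so it measures linear association with the target and does not account for interactions between features.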

Web-Based Implementation

The predictive system is implemented using Python’s Flask framework, which connects the trained
ML model to a front-end web interface.
Users can enter their data into a web form, and the system instantly returns the predicted yield.
This approach makes the model accessible, interactive, and easy to use for farmers and agricultural
institutions.

Overall, the theoretical background of this project combines the fields of data science, machine
learning, and web development to create a practical, AI-driven agricultural tool.

1.3 Definition of Problem

Agriculture today faces increasing pressure due to unpredictable weather, limited resources, and
growing demand. Traditional methods of yield prediction rely on manual observation or outdated
data, which are often inaccurate and unreliable.
Farmers lack accessible, intelligent tools to help them estimate potential yield based on soil and
environmental conditions.

The problem this project addresses is the absence of a smart, automated system that can analyze
multiple agricultural parameters and provide accurate yield predictions. Without such a system:

• Farmers are unable to plan resources effectively.

• There is no quantitative way to estimate productivity before harvesting.

• Agricultural decisions often depend on assumptions rather than data.

The Predictive Model for Agriculture solves this problem by using supervised machine learning
techniques to predict yield, identify the most influential factors, and present results through a
simple web interface.
This approach empowers farmers with actionable insights, reduces uncertainty, and promotes
sustainable farming through the use of technology.

Chapter 2

Software Requirement Specification

2.1 SDLC Model Used

Software Development Life Cycle (SDLC) provides a systematic process to design, develop, and
deliver software of high quality within a defined time and budget.

For this project, the Waterfall Model was selected as the most appropriate SDLC approach.

The Waterfall Model is a linear and sequential model where each phase must be completed
before the next begins. It ensures proper documentation, disciplined development, and clear
deliverables, which makes it well suited to academic projects with well-defined requirements
like this one.

Phases of the Waterfall Model

1. Requirement Analysis:

o Identify what the system needs to accomplish.

o Understand user expectations such as predicting crop yield, providing an easy-to-use
web interface, and ensuring fast, accurate output.

2. System Design:

o Create the architectural design of the predictive model.

o Decide how the user interface, model, and backend interact.

o Choose libraries (scikit-learn, pandas, numpy) and framework (Flask).

3. Implementation:

o Develop the machine learning model (Random Forest Regressor).

o Integrate it with the Flask web server and HTML interface.

4. Testing:

o Perform unit, integration, and system testing to verify accuracy and usability.

o Ensure predictions are valid and interface functions correctly.

5. Deployment:

o Host the Flask application on a local server.

o Prepare the model for cloud deployment on Heroku or AWS for scalability.

6. Maintenance:

o Update the model periodically with new datasets.

o Maintain the web interface for usability and efficiency.

Advantages of Using Waterfall Model

• Easy to understand and manage.

• Phases are clearly defined with fixed deliverables.

• Ensures good documentation at every stage.

• Ideal for small to medium-scale projects like this one.

The Waterfall model ensures that each stage of the Predictive Model for Agriculture is developed
systematically, minimizing confusion and ensuring smooth progress from requirement to
deployment.

2.2 PERT CHART

The Program Evaluation and Review Technique (PERT) is used to plan, schedule, and control
project activities.
It provides a graphical representation of tasks and their dependencies, ensuring that the project is
completed efficiently within the given timeframe.

The PERT chart helps in identifying critical tasks, sequencing work, and estimating total time
required for development.

Task  Activity                    Duration (Days)  Description

1     Requirement Analysis        3                Understanding system goals and user needs.
2     System Design               4                Designing model architecture and database schema.
3     Model Development           7                Implement Random Forest algorithm and feature selection.
4     Web Interface Design        4                Build user interface using HTML, CSS, Flask.
5     Integration                 3                Connect frontend with backend model.
6     Testing                     5                Perform unit, integration, and system testing.
7     Documentation & Deployment  4                Prepare documentation and deploy the app.

The PERT chart helped in efficient scheduling and timely completion of each development
stage.
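
As a rough worked example, the durations listed in the PERT table above can be turned into a simple schedule. The sketch assumes that the tasks run strictly one after another, in line with the Waterfall model; the report does not state the task dependencies, so any overlap between tasks would shorten the total.

```python
# Task durations in days, taken from the PERT table.
tasks = [
    ("Requirement Analysis", 3),
    ("System Design", 4),
    ("Model Development", 7),
    ("Web Interface Design", 4),
    ("Integration", 3),
    ("Testing", 5),
    ("Documentation & Deployment", 4),
]

# Assumption: every task starts only when the previous one has finished.
schedule = []
start = 0
for name, duration in tasks:
    finish = start + duration
    schedule.append((name, start, finish))
    start = finish

total_days = schedule[-1][2]
for name, begin, end in schedule:
    print(f"{name:<28} day {begin:>2} to day {end:>2}")
print("Total duration:", total_days, "days")
```

Under the sequential assumption every task lies on the critical path, so the total is simply the sum of the durations.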

2.3 FUNCTIONAL REQUIREMENTS

Functional requirements describe what the system is expected to do.
They define the key operations and functionalities that ensure the system performs as
intended.

1. User Input:
Users can enter agricultural parameters such as rainfall (mm), humidity (%), nitrogen,
phosphorus, potassium (kg/ha), and soil pH into the system.

2. Data Validation:
The system checks that all entered values are numeric and within valid ranges before
processing.

3. Crop Yield Prediction:
The trained Random Forest Regressor model processes the inputs and predicts the
expected crop yield (in quintals per hectare).

4. Feature Importance Display:
The system highlights which features (e.g., rainfall, nitrogen, soil pH) most strongly
influenced the prediction result.

5. Retraining Model:
Users can retrain the machine learning model using new datasets to improve accuracy.

6. Result Visualization:
The prediction result is displayed clearly on the web page, supported by small graphs or
bars indicating the importance of each feature.

7. Error Handling:
If any field is left blank or contains invalid input, the system will display an appropriate
error message.

8. Web Interface:
A clean, interactive web interface is created using Flask, HTML, and CSS, allowing
users to easily input data and view predictions.

9. Local Data Saving (Optional):
Predicted results can be saved locally for future analysis or academic demonstration
purposes.
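
The data validation and error handling requirements (items 2 and 7) could be implemented along the following lines. This is a sketch only: the function name validate_inputs is hypothetical, and the numeric limits are assumed for illustration because the report does not list exact ranges (only the 0 to 100 humidity range and the 0 to 14 pH scale follow from the units themselves).

```python
# Assumed valid ranges for illustration; the report does not list exact limits.
VALID_RANGES = {
    "rainfall": (0.0, 5000.0),   # mm
    "humidity": (0.0, 100.0),    # %
    "nitrogen": (0.0, 500.0),    # kg/ha
    "phosphorus": (0.0, 500.0),  # kg/ha
    "potassium": (0.0, 500.0),   # kg/ha
    "ph": (0.0, 14.0),           # pH scale
}


def validate_inputs(form):
    """Return (values, errors) for a dict mapping field names to raw strings."""
    values, errors = {}, {}
    for field, (low, high) in VALID_RANGES.items():
        raw = (form.get(field) or "").strip()
        if not raw:
            errors[field] = "This field is required."
            continue
        try:
            number = float(raw)
        except ValueError:
            errors[field] = "Enter a numeric value."
            continue
        # NaN fails this comparison too, so it is rejected as out of range.
        if not low <= number <= high:
            errors[field] = f"Value must be between {low} and {high}."
            continue
        values[field] = number
    return values, errors


values, errors = validate_inputs({"rainfall": "120", "humidity": "abc", "ph": "6.5"})
print(values)
print(errors)
```

Returning the errors per field, instead of raising on the first problem, lets the web page show every message at once next to the field it belongs to.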

2.4 NON-FUNCTIONAL REQUIREMENTS

Non-functional requirements describe how the system performs rather than what it does.
They define quality attributes such as speed, usability, reliability, and scalability.

1. Performance:
The system should provide fast and accurate predictions (within 3–5 seconds per
request).

2. Usability:
The application must be simple, clean, and user-friendly, suitable for both students and
agricultural professionals.

3. Reliability:
The prediction results should remain consistent for the same input data, ensuring model
stability.

4. Scalability:
The system should support larger datasets and new crop types in the future without major
code changes.

5. Security:
Input data is handled securely; no sensitive information is stored or transmitted
externally.

6. Maintainability:
The project’s code structure is modular and well-documented, allowing easy updates,
debugging, and retraining.

7. Portability:
The web application should run smoothly on different systems (Windows 10/11) and
browsers (Chrome, Edge, Firefox).

8. Availability:
When hosted locally or on a cloud platform, the system should be available for use
anytime with minimal downtime.

9. Accuracy:
The predictive model must maintain an accuracy level (measured as the R² score on test
data) of at least 85–90%.

10. Aesthetic Quality:
The interface design should be visually appealing with a consistent layout, professional
colors, and easy readability.

2.5 Hardware And Software Used

Hardware Requirements

Component       Specification

Processor       Intel Core i3 or above
RAM             Minimum 4 GB
Hard Disk       Minimum 500 GB
Input Devices   Keyboard, Mouse
Output Devices  Monitor / Display

Software Requirements

Software               Description

Operating System       Windows 10 or 11
Programming Language   Python 3.10 or above
Framework              Flask (for backend web server)
Libraries              scikit-learn, pandas, numpy, joblib
Frontend Technologies  HTML5, CSS3
Database               CSV file
Browser                Google Chrome / Edge
Deployment Platform    localhost

Chapter 3

System Design

3.1 Block Diagram
The block diagram shows the overall flow of the system. The user enters inputs, the backend
processes them, the ML model predicts the yield, and the result is displayed.

3.2 Database Design


This table is used to store user inputs and predicted results for analysis or future training.

3.3 DFD Level 0
DFD Level 0 shows the whole system as a single process interacting with the user.

DFD Level 1
DFD Level 1 breaks the system into smaller sub-processes: validation, preprocessing, prediction,
and output generation.

3.4 ER Diagram (ERD)


The ERD shows the relationship between user inputs and prediction records.

3.5 Use Case Diagram
The use case diagram shows how the user interacts with the system and what tasks they can
perform.

Chapter 4

Software Development

4.1 Data Collection

Data collection plays a vital role in developing an accurate predictive model.
For the Predictive Model for Agriculture, data was collected from multiple sources to represent
different soil and environmental conditions that influence crop yield.

The system uses datasets containing parameters such as rainfall, humidity, nitrogen, phosphorus,
potassium, soil pH, and crop yield.
Since real agricultural data is not easily accessible for small-scale academic projects, both
synthetic data generation and public datasets were used.

4.1.1 Sources of Data

1. Synthetic Dataset:

o Generated using Python libraries such as NumPy and Pandas.

o Simulated realistic agricultural conditions for multiple regions.

o Data included soil composition, rainfall, and weather conditions.

2. Public Datasets:

o Kaggle: Crop Yield Prediction Dataset

o ICAR (Indian Council of Agricultural Research): Soil data references.

o FAO (Food and Agriculture Organization): Yield benchmarks.

o Weather data from OpenWeatherMap API for rainfall and humidity patterns.
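
A synthetic dataset of the kind described in item 1 could be generated roughly as follows. The value ranges, the number of rows, and the rule that links the inputs to the yield are all hypothetical and chosen only to make the example concrete; they are not the values used in the project.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=7)
n_rows = 1000

# Assumed value ranges, chosen only for illustration.
data = pd.DataFrame({
    "rainfall": rng.uniform(200, 1500, n_rows),   # mm
    "humidity": rng.uniform(30, 90, n_rows),      # %
    "nitrogen": rng.uniform(10, 150, n_rows),     # kg/ha
    "phosphorus": rng.uniform(5, 80, n_rows),     # kg/ha
    "potassium": rng.uniform(5, 100, n_rows),     # kg/ha
    "ph": rng.uniform(4.5, 8.5, n_rows),
})

# Hypothetical yield rule: more nitrogen and rainfall help, a pH far from 6.5
# hurts, and Gaussian noise stands in for everything that is not modelled.
data["yield"] = (
    0.02 * data["nitrogen"]
    + 0.002 * data["rainfall"]
    - 0.3 * (data["ph"] - 6.5) ** 2
    + rng.normal(0, 0.2, n_rows)
).clip(lower=0)

print(data.head())
```

Fixing the seed makes the generated data reproducible, which keeps later training and evaluation runs comparable.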

4.1.2 Data Description

Each record represents environmental and soil conditions for a given crop cycle and its resulting
yield.

4.2 Data Pre-Processing

Data pre-processing prepares the collected data for model training.
Raw data often contains missing values, outliers, or inconsistent scales, which must be cleaned
before feeding into a machine learning model.

4.2.1 Steps in Data Pre-Processing

1. Data Cleaning:

o Removed missing or null entries using Pandas functions.

o Duplicates and incorrect values were filtered out.

2. Handling Missing Values:

o Replaced missing values with mean or median of the respective column.

3. Feature Scaling:

o Normalized the data using MinMaxScaler() from scikit-learn.

o Ensures that large variations (e.g., rainfall vs. pH) don’t dominate the model.

4. Feature Selection:

o Used SelectKBest (with f_regression) to identify most influential factors.

o The most impactful features identified were Nitrogen, Rainfall, and pH.

5. Splitting the Dataset:

o The dataset was divided into 80% training and 20% testing sets using
train_test_split().
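
The pre-processing steps listed above can be sketched as one short script. The data here is randomly generated stand-in data with a few missing values and duplicate rows added on purpose, and k = 3 is an assumed setting. One deliberate choice in the sketch is that the data is split first and the scaler and selector are fitted on the training split only, a common precaution against information leaking from the test set; this ordering may differ from the one used in the project.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

features = ["rainfall", "humidity", "nitrogen", "phosphorus", "potassium", "ph"]

# Stand-in data (assumed) with a few missing values and duplicate rows.
rng = np.random.default_rng(1)
df = pd.DataFrame(rng.uniform(0, 100, size=(300, 6)), columns=features)
df["yield"] = 0.05 * df["nitrogen"] + 0.03 * df["rainfall"] + rng.normal(0, 0.5, 300)
df.loc[[10, 25, 40], "humidity"] = np.nan
df = pd.concat([df, df.iloc[:5]], ignore_index=True)

# Steps 1-2: drop duplicate rows, then fill remaining gaps with the column median.
df = df.drop_duplicates().reset_index(drop=True)
df[features] = df[features].fillna(df[features].median())

# Step 5: 80/20 split, done before fitting any transformer.
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["yield"], test_size=0.2, random_state=42
)

# Step 3: scale every feature to [0, 1] using statistics from the training split.
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Step 4: keep the three features with the highest F-scores.
selector = SelectKBest(score_func=f_regression, k=3)
X_train_sel = selector.fit_transform(X_train_scaled, y_train)
X_test_sel = selector.transform(X_test_scaled)

print(X_train_sel.shape, X_test_sel.shape)
```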

4.3 Input and Output Screens (I/O Screens)

Input and Output screens define how the user interacts with the system.
This system provides a simple, interactive, and clean web interface created using Flask, HTML,
and CSS.

4.3.1 Input Screen

Description:
The input screen allows users to provide necessary agricultural parameters for prediction.
Each field has validation checks to ensure only valid numeric values are entered.

Input Fields:

• Rainfall (mm)

• Humidity (%)

• Nitrogen (kg/ha)

• Phosphorus (kg/ha)

• Potassium (kg/ha)

• Soil pH

HTML FORM EXAMPLE:

<form action="/predict" method="post">
  <!-- step="any" allows decimal values; required blocks blank submissions -->
  <label>Rainfall (mm):</label><input type="number" name="rainfall" step="any" min="0" required><br>
  <label>Humidity (%):</label><input type="number" name="humidity" step="any" min="0" max="100" required><br>
  <label>Nitrogen (kg/ha):</label><input type="number" name="nitrogen" step="any" min="0" required><br>
  <label>Phosphorus (kg/ha):</label><input type="number" name="phosphorus" step="any" min="0" required><br>
  <label>Potassium (kg/ha):</label><input type="number" name="potassium" step="any" min="0" required><br>
  <label>Soil pH:</label><input type="number" name="ph" step="0.1" min="0" max="14" required><br>
  <button type="submit">Predict Yield</button>
</form>

4.3.2 Output Screen

Description:
Once the user submits inputs, Flask processes the data and displays the prediction.
The output includes the predicted crop yield and a bar graph showing feature importance
values.

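BACKEND CODE EXAMPLE:

from flask import Flask, render_template, request
import joblib
import numpy as np

app = Flask(__name__)
model = joblib.load('model.pkl')  # trained model; the file name is assumed here

@app.route('/predict', methods=['POST'])
def predict():
    # Read the six form fields and convert them to floats
    rainfall = float(request.form['rainfall'])
    humidity = float(request.form['humidity'])
    nitrogen = float(request.form['nitrogen'])
    phosphorus = float(request.form['phosphorus'])
    potassium = float(request.form['potassium'])
    ph = float(request.form['ph'])
    # The model expects a 2-D array: one row, six columns
    features = np.array([[rainfall, humidity, nitrogen, phosphorus, potassium, ph]])
    prediction = model.predict(features)
    # predict() returns an array, so pass the single value to the template
    return render_template('result.html', yield_value=round(float(prediction[0]), 2))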

RESULT EXAMPLE:

Predicted Crop Yield: 3.56 Quintals/ha

Top Influential Features:

• Nitrogen: 0.40

• Rainfall: 0.32

• pH: 0.18

Chapter 5
Testing

5.1 Types of Testing

Testing ensures that the system performs as intended and meets both functional and non-functional
requirements.
Different types of testing were performed to verify the correct operation of both the web
application and the machine learning model used in the Predictive Model for Agriculture.

5.1.1 Unit Testing

• Each individual component or module of the system was tested separately.

• Example: Testing input validation functions, ensuring numeric values are entered, and
verifying Flask route functions.

• Tools: Python’s unittest and manual form testing.

• Result: All individual functions worked correctly.
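
A unit test of the kind described above might look like the sketch below, written with Python's unittest module. The helper parse_numeric is hypothetical and stands in for the project's own input validation function, which is not shown in this chapter.

```python
import unittest


def parse_numeric(raw):
    """Hypothetical helper: convert a form string to float or raise ValueError."""
    text = (raw or "").strip()
    if not text:
        raise ValueError("value is required")
    return float(text)


class ParseNumericTests(unittest.TestCase):
    def test_accepts_integer_and_decimal_strings(self):
        self.assertEqual(parse_numeric("120"), 120.0)
        self.assertAlmostEqual(parse_numeric(" 6.5 "), 6.5)

    def test_rejects_blank_input(self):
        with self.assertRaises(ValueError):
            parse_numeric("")

    def test_rejects_non_numeric_text(self):
        with self.assertRaises(ValueError):
            parse_numeric("abc")


# Run the test case programmatically and keep the result object.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParseNumericTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("All tests passed:", result.wasSuccessful())
```

In a project layout the same test class would normally live in its own file and be run with `python -m unittest`.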

5.1.2 Integration Testing

• Ensures proper communication between different modules like:

o Frontend (HTML/CSS)

o Flask backend

o Trained ML model (Random Forest)

• Example: Submitting values through the input form and checking if the ML model
processes and returns predictions correctly.

• Result: Data flow between modules was consistent and accurate.

5.1.3 System Testing

• Conducted on the complete system as a whole.

• Verified:

o Prediction accuracy

o Interface functionality

o Response time and reliability

• Result: The system performed smoothly and delivered accurate results under different
input scenarios.

5.1.4 Regression Testing

• Conducted after integrating the feature importance chart and retraining module.

• Ensured that previous working functions (form submission, output display) continued
working without errors.

• Result: System stability maintained.

5.1.5 User Acceptance Testing (UAT)

• Conducted by faculty and students to check system usability and clarity.

• Verified whether non-technical users could understand and use the web app easily.

• Result: Users successfully performed predictions and understood the output with minimal
guidance.

5.1.6 Performance Testing

• Measured system efficiency, including time to predict yield and response time of Flask
routes.
• Results: The model responded in under 3 seconds, and the web interface performed
smoothly without lag.

5.2 Levels of Testing

5.2.1 Component Level Testing

Each functional block, such as the Input Module, Prediction Module, and Validation Module, was
tested independently using test datasets.
This ensured that every function operated correctly before integration.

5.2.2 Integration Level Testing

After individual modules were validated, integration testing ensured smooth data exchange among
modules.
Data passed correctly from the HTML form to Flask backend and then to the ML model.

5.2.3 System Level Testing

The system was tested end-to-end, from input to output.
Inputs were provided through the web interface, predictions were generated by the model, and
results were displayed instantly.
No data mismatch or logical errors were found.

5.2.4 Acceptance Level Testing

This level verified that the final product met all specified user requirements:

• Inputs accepted correctly.

• Predictions displayed accurately.

• User interface simple and visually clear.

The system met all acceptance criteria successfully.

5.3 Form Testing

Form testing was carried out to validate both input and output forms of the web application.
It ensures that user inputs are valid and outputs are displayed properly.

5.3.1 Input Form Testing

• Each field (rainfall, humidity, nitrogen, phosphorus, potassium, pH) was tested for:

o Proper numeric input validation

o Detection of missing or blank fields

o Error handling messages

• When invalid data was entered, the form displayed messages prompting the user to correct
the input.

Test ID Form Field Expected Result Actual Result Status

T01 Rainfall (mm) Accept numeric input Working

T02 Humidity (%) Accept numeric input Working

T03 Soil pH Accept float values Working

T04 Empty Field Show validation error Working

T05 Submit Button Redirect to output Working

5.3.2 Output Form Testing

• Verified that prediction results were correctly displayed on the webpage.

• The output included:

o Predicted crop yield (in quintals/ha)

o Feature importance graph

o Explanation of top influencing features

• Layout checked for consistency across different browsers (Chrome, Edge).

Test ID   Form Section    Expected Result           Actual Result   Status

T06       Yield Display   Show correct prediction   Working
T07       Feature Graph   Display after prediction  Working
T08       Refresh Page    Reset all fields          Working

5.4 Evaluation Criteria (Machine Learning Model)

To evaluate the accuracy and reliability of the Random Forest Regression model, various metrics
were used.
These evaluation criteria determine how well the system predicts yield and how close the predicted
results are to the actual values.

5.4.1 Confusion Matrix

Since the model performs regression, a confusion matrix (which applies to classification) is not
applicable here.
Instead, the distribution of prediction errors was analyzed by comparing predicted and actual
yield values to see how widely the errors are spread.
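
One simple way to summarise such an error distribution is to compute the residuals (actual minus predicted) and report their mean, spread, and largest magnitude. The sketch below is an assumption about how this could be done, and the numbers are made up for illustration; they are not results from the project.

```python
from statistics import mean, pstdev


def residual_summary(actual, predicted):
    """Summarise the spread of prediction errors (actual - predicted)."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    return {
        "mean": mean(residuals),            # close to 0 means no systematic bias
        "std": pstdev(residuals),           # spread of the errors
        "max_abs": max(abs(r) for r in residuals),  # worst single prediction
    }


# Illustrative values only.
actual = [2.0, 3.0, 4.0, 5.0]
predicted = [2.5, 2.5, 4.5, 4.5]
summary = residual_summary(actual, predicted)

if __name__ == "__main__":
    print(summary)
```

A residual mean far from zero would suggest the model systematically over- or under-predicts, which the aggregate metrics in the following subsections cannot show on their own.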

5.4.2 R² Score (Coefficient of Determination)

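Measures the proportion of variance in the actual yield values that the regression model explains.

Result:

R² = 0.91 → The model explains 91% of the variation in the yield data, indicating a strong fit.
R² is a goodness-of-fit measure and should not be read as a classification-style accuracy.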

5.4.3 Mean Absolute Error (MAE)

Represents the average absolute difference between predicted and actual values, expressed in the
same units as the yield variable.

Result:

MAE = 0.12 → Indicates a low average prediction error.

5.4.4 Root Mean Square Error (RMSE)

Represents the square root of the mean of squared errors — penalizing larger errors more heavily.

Result:

RMSE = 0.18 → A small typical error magnitude between predicted and actual values.
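
The three regression metrics above follow directly from their definitions. The sketch below computes them in plain Python on made-up values (not the project's data) so the formulas are visible; in scikit-learn the corresponding functions are r2_score, mean_absolute_error, and mean_squared_error in sklearn.metrics.

```python
import math


def regression_metrics(actual, predicted):
    """Compute R², MAE, and RMSE from paired actual and predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mean_actual = sum(actual) / n
    ss_res = sum(e ** 2 for e in errors)                    # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)    # total sum of squares
    return {
        "r2": 1 - ss_res / ss_tot,                # R² = 1 - SS_res / SS_tot
        "mae": sum(abs(e) for e in errors) / n,   # mean of |error|
        "rmse": math.sqrt(ss_res / n),            # sqrt of mean squared error
    }


# Illustrative values only: three perfect predictions and one large miss.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 2.0]
metrics = regression_metrics(actual, predicted)

if __name__ == "__main__":
    print(metrics)
```

In this toy example the single large miss gives an RMSE of 1.0 against an MAE of 0.5, which illustrates why RMSE is described as penalizing larger errors more heavily.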

5.4.5 Precision, Recall, and F1-Score

These are standard evaluation metrics for classification problems:

• Precision → How many predicted positives are correct.

• Recall → How many actual positives are captured.

• F1-Score → Balance between precision and recall.

These metrics are defined for classification and do not apply directly to a regression model.
For this project, the consistency of the predictions was examined instead.

The model achieved high consistency, with a prediction variance under 10%.

Chapter-6
CONCLUSION and FUTURE WORK

6.1 Conclusion

The project “Predictive Model for Agriculture” was developed with the objective of assisting
farmers and agricultural researchers in predicting crop yield using soil and environmental data.
This system applies the concepts of Supervised Machine Learning, specifically the Random
Forest Regression algorithm, to analyze the relationships between multiple agricultural
parameters such as rainfall, humidity, nitrogen, phosphorus, potassium, and soil pH.

The model was successfully implemented using Python, Flask, and HTML/CSS.
The web-based interface allows users to input key agricultural factors and instantly receive the
predicted yield along with the feature importance visualization.
This enables better understanding of which variables most strongly influence productivity, such as
nutrient composition or environmental factors.

Through systematic development phases — including requirement analysis, data collection, pre-
processing, model training, and evaluation — the project achieved its goal of creating a practical,
data-driven prediction system that bridges the gap between technology and agriculture.

6.1.1 Key Achievements

1. Machine Learning Integration:
Successfully implemented a Random Forest-based regression model that delivers yield
predictions with an R² score of 0.91 (about 91% of the variance in yield explained).

2. Feature Selection and Visualization:
Implemented SelectKBest to identify the most important factors influencing yield.
Displayed feature importance visually on the web interface for transparency.

3. Web Application Development:
Designed a Flask-based user interface to make the system accessible to users with
minimal technical knowledge.

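4. Performance Optimization:
Applied normalization and feature scaling during pre-processing to keep all inputs on
comparable ranges. Tree-based models such as Random Forest are largely insensitive to
feature scaling, so this step mainly keeps the pipeline consistent.

5. Error Analysis:
Achieved low Mean Absolute Error (MAE = 0.12) and Root Mean Square Error (RMSE
= 0.18), indicating reliable predictions.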

6. Usability and Accessibility:
Created a lightweight, user-friendly, and responsive web interface accessible through
local or hosted environments.
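
The feature selection and feature importance steps named in the list above can be sketched with scikit-learn as follows. This is a minimal illustration under assumptions, not the project's training script: the data is synthetic, the feature names are taken from the input form, and the choice of f_regression as the scoring function and k=3 are assumptions, since the report does not state them.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectKBest, f_regression

FEATURES = ["rainfall", "humidity", "nitrogen", "phosphorus", "potassium", "ph"]

# Synthetic data for illustration only; yield depends on rainfall and nitrogen.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, len(FEATURES)))
y = 3.0 * X[:, 0] + 2.0 * X[:, 2] + rng.normal(0.0, 0.1, size=200)

# Keep the k features with the strongest univariate relationship to the target.
selector = SelectKBest(score_func=f_regression, k=3)
X_selected = selector.fit_transform(X, y)
selected = [name for name, keep in zip(FEATURES, selector.get_support()) if keep]

# Train the regressor on the selected features and read its importances.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_selected, y)
importances = dict(zip(selected, model.feature_importances_))

if __name__ == "__main__":
    for name, score in sorted(importances.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {score:.3f}")
```

The importances dictionary is the kind of data a bar chart on the results page could be drawn from; the values sum to 1 across the selected features.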

6.1.2 System Effectiveness

The system effectively analyzes complex agricultural data and provides accurate crop yield
predictions.
It demonstrates the capability of data science and machine learning in solving real-world
problems in the agricultural sector.

The project helps farmers:

• Estimate potential crop yield before harvest.

• Plan resource utilization (fertilizers, irrigation, and sowing schedules).

• Make data-driven farming decisions for higher productivity.

It also aids agricultural research institutions by providing a foundational model that can be
expanded for large-scale prediction and forecasting.

6.1.3 Practical Applications

The developed system can be applied in multiple areas:

1. Yield Prediction:
Predicting the amount of yield expected from given environmental and soil conditions.

2. Crop Selection Assistance:
With crop-specific training data, suggesting crops suitable for specific regions based on
soil parameters.

3. Fertilizer Management:
Helping determine the right fertilizer composition based on feature influence results.

4. Governmental and Research Use:
Assisting in agricultural planning, subsidy distribution, and predictive analytics for crop
insurance schemes.

6.1.4 Advantages of the System

1. High Accuracy:
Machine learning-based approach provides reliable yield estimates.

2. Cost-Effective Solution:
Utilizes open-source tools and does not require expensive sensors or manual data
collection.

3. User-Friendly Interface:
Farmers can easily input data without technical expertise.

4. Data-Driven Insight:
Feature importance charts help interpret key yield factors scientifically.

5. Scalable and Flexible:
Can be extended to new crops, regions, and datasets with minimal changes.

6.1.5 Limitations

Despite its success, the project has a few limitations:

1. Static Dataset:
The current system uses a pre-collected dataset and lacks real-time data integration.

2. No Crop-Specific Modelling:
The system is trained on generic yield data and does not differentiate between crop types.

3. Limited Attributes:
Environmental factors like temperature, wind speed, sunlight, and pest impact are not yet
included.

4. Deployment Restriction:
The project currently runs on localhost; cloud deployment and large-scale access are yet to be
implemented.

5. Lack of Real-Time APIs:
Integration of live weather or soil sensor data APIs is pending.

6.2 Future Work

The Predictive Model for Agriculture has great potential for future improvements and can be
expanded into a full-fledged intelligent agricultural system.
Below are some of the major areas where the project can be extended and enhanced in future
iterations.

6.2.1 Multi-Crop Prediction

Currently, the system predicts yield based on generalized data.
Future versions can include crop-specific models such as for wheat, rice, maize, etc., allowing
users to select a crop type before prediction.

6.2.2 Integration with Real-Time Data Sources

The model can be integrated with IoT devices and live weather APIs to collect real-time
environmental parameters.
This would help provide live yield predictions and adaptive recommendations.

6.2.3 Advanced Algorithms

The current version uses Random Forest Regressor, which performs efficiently, but future
enhancements can explore more sophisticated algorithms such as:

• Gradient Boosting Machines (GBM)

• XGBoost / LightGBM

• Artificial Neural Networks (ANNs)

These could further improve predictive accuracy and generalization ability.
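
A fair way to judge whether an alternative algorithm actually improves on Random Forest is to score each candidate with the same cross-validation folds. The sketch below compares scikit-learn's RandomForestRegressor and GradientBoostingRegressor on synthetic data; the data, the hyperparameters, and the five-fold setup are assumptions for illustration, and XGBoost, LightGBM, and neural networks are left out because they require additional libraries.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Synthetic data for illustration only (six input features, one target).
rng = np.random.default_rng(42)
X = rng.uniform(0.0, 1.0, size=(150, 6))
y = 3.0 * X[:, 0] + 2.0 * X[:, 2] ** 2 + rng.normal(0.0, 0.1, size=150)

candidates = {
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
}

# Mean R² over 5 folds for each candidate model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    for name, model in candidates.items()
}

if __name__ == "__main__":
    for name, score in scores.items():
        print(f"{name}: mean R² = {score:.3f}")
```

Which model scores higher depends on the dataset, so a comparison of this kind would need to be rerun on the project's own data before changing algorithms.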

6.2.4 Mobile Application Development

A mobile-based version of this system can make it easily accessible to rural farmers.
The app could include voice input in local languages and simplified prediction output for ease of
use.

6.2.5 Fertilizer and Crop Recommendation System

An additional module can be integrated to suggest:

• Optimal fertilizer ratios based on soil nutrients.

• Suitable crops based on region, rainfall, and nutrient conditions.

This would transform the model from a predictive system into a decision-support system (DSS)
for farmers.

6.2.6 Cloud Deployment and Scalability

Deploying the project on cloud platforms like AWS, Azure, or Google Cloud would ensure:

• Remote accessibility

• Continuous availability

• Better performance for large datasets

• Multi-user access

Additionally, continuous retraining with new data can improve the accuracy and adaptability of
the model.

6.2.7 Data Visualization Dashboards

Future work can also include creating interactive dashboards using tools like Power BI or
Tableau to visualize:

• Regional yield performance

• Environmental factor trends

• Year-over-year productivity analysis

This would make the project more useful for agricultural research institutions.

6.2.8 Integration with GIS Mapping

Integrating the system with Geographic Information Systems (GIS) can provide spatial
visualization of yield predictions.
This will help in mapping soil health and crop productivity across different locations, assisting in
regional planning and sustainable agriculture.

6.2.9 AI-Driven Advisory System

An AI-powered chatbot can be integrated into the platform to provide personalized agricultural
advice to farmers based on their soil data and region.
For example, it could suggest irrigation schedules, pest management tips, or fertilizer usage.

6.2.10 Multilingual Support

To make the system more accessible in rural areas, future versions should include multilingual
support for regional languages such as Hindi, Punjabi, Tamil, and Telugu.
This would allow farmers from different states to use the platform comfortably.

CODING AND SCREENSHOTS

APP.PY

HTML CODE

INPUT:

OUTPUT

REFERENCES

1. Géron, A. (2019). Hands-On Machine Learning with Scikit-Learn & TensorFlow.
– Used to understand machine learning algorithms and model building.

2. Scikit-Learn Documentation (2024). https://scikit-learn.org
– Referred to for model training, feature selection, and evaluation functions.

3. Pandas Documentation (2024). https://pandas.pydata.org
– Used for loading, cleaning, and pre-processing the dataset.

4. Flask Official Documentation (2024). https://flask.palletsprojects.com
– Helped in creating the backend web application for prediction.

5. Kaggle – Crop Yield Prediction Dataset (2024). https://kaggle.com
– Used as a reference for dataset structure and agricultural parameters.

6. FAO – Food and Agriculture Organization Reports (2023).
– Provided general insights into crop yield factors and soil requirements.
