Minor Projects J
ACKNOWLEDGEMENT
We would like to express our heartfelt gratitude to all those who have helped us in the successful
completion of this project.
First and foremost, we would like to extend our sincere thanks to our respected guide and
mentor, [Guide’s Name], for his/her valuable guidance, constant encouragement, and continuous
support throughout the development of this project. His/her insights into Machine Learning,
Data Analysis, and Web Application Development have been invaluable to us.
We are also deeply thankful to the Department of Computer Applications, [Your College
Name], for providing us with the necessary infrastructure and technical resources.
Finally, we would like to express our gratitude to our family and friends for their constant
support, patience, and motivation, which have been the driving force behind the successful
completion of this project.
BCA E1
SELF CERTIFICATE
This is to certify that the dissertation/project report entitled “Predictive Model for Agriculture”
is an authentic work carried out by me in partial fulfilment of the requirements for the award of
the degree of Bachelor of Computer Applications (BCA) under the guidance of Mr. Kanhaiya
Lal.
The matter embodied in this project report is based on my original work and has not been
submitted earlier for the award of any degree or diploma to the best of my knowledge and belief.
Sunveen Kaur
04324402023
SELF CERTIFICATE
This is to certify that the dissertation/project report entitled “Predictive Model for Agriculture”
is an authentic work carried out by me in partial fulfilment of the requirements for the award of
the degree of Bachelor of Computer Applications (BCA) under the guidance of Mr. Kanhaiya
Lal.
The matter embodied in this project report is based on my original work and has not been
submitted earlier for the award of any degree or diploma to the best of my knowledge and belief.
Jiya Basra
03824402023
GUIDE CERTIFICATE
This is to certify that this project entitled “Predictive Model for Agriculture”, submitted in partial
fulfilment of the requirements for the degree of Bachelor of Computer Applications to the “GURU GOBIND SINGH
TECHNOLOGY AND MANAGEMENT and done by Ms. Sunveen Kaur and Ms. Jiya Basra, Roll Nos.
04324402023 and 03824402023, is an authentic work carried out by them under my guidance.
The matter embodied in this project work has not been submitted earlier for the award of any degree.
SYNOPSIS
1. Name / Title of the Project
“Predictive Model for Agriculture”
Existing agricultural systems lack intelligent tools that can analyze environmental and soil
factors to accurately predict crop productivity. This absence of predictive systems results in
inefficient resource management, uncertainty in yield, and financial instability for farmers.
Hence, there is a growing need for a machine learning–based predictive system that can process
parameters like rainfall, humidity, nitrogen, phosphorus, potassium, and soil pH to estimate the
expected crop yield. Such a model can provide valuable insights that support better planning,
fertilizer optimization, and sustainable farming practices.
With the growing accessibility of data and rapid progress in machine learning algorithms, it is
now possible to forecast yield accurately using historical and environmental data. This project
leverages supervised learning and feature selection methods to simplify prediction and deliver
actionable insights through a user-friendly interface.
Developing this model allows students and researchers to apply data science concepts in
agriculture, contributing to precision farming and sustainable agricultural innovation. The system
also serves as a cost-effective solution that can be expanded for large-scale deployment in
agricultural monitoring systems.
Objectives:
• To design and develop a machine learning model that predicts crop yield based on
environmental and soil parameters.
• To apply feature selection techniques for identifying the most influential factors affecting
yield.
• To develop a Flask-based web interface that allows users to input data and obtain
predictions instantly.
• To evaluate model performance using metrics like MAE, RMSE, and R² for improved
accuracy.
Scope:
• The model focuses on predicting yield using variables such as rainfall, humidity,
nitrogen, phosphorus, potassium, and soil pH.
• The system can be extended to handle multiple crops and integrate with IoT devices for
real-time data collection.
• The platform is designed to be accessible via web browsers for both farmers and
agricultural institutions.
1. Requirement Gathering:
Understanding the parameters affecting crop yield and the needs of farmers. Defining
inputs such as rainfall, humidity, nitrogen, phosphorus, potassium, and soil pH.
3. Model Development:
Implementing the Random Forest Regressor algorithm using Python’s scikit-learn library.
Applying SelectKBest feature selection to identify significant variables influencing yield.
4. Model Evaluation:
Evaluating the model using performance metrics such as MAE, RMSE, and R² score to
measure prediction accuracy.
Hardware Requirements:
• Processor: Intel i3 or above
Software Requirements:
• Unit Testing:
Individual functions such as data input, prediction, and retraining are tested to ensure
proper functionality.
• Integration Testing:
Verifies that the Flask web app, machine learning model, and data flow between frontend
and backend are working together smoothly.
• System Testing:
Tests the complete web application for correctness of prediction results and interface
usability.
Students and developers gain practical experience in data preprocessing, model building, and
web deployment, bridging the gap between academic learning and real-world applications.
The system sets the foundation for integrating smart agriculture technologies, encouraging future
innovations in precision farming, climate forecasting, and sustainable food production.
INDEX
Chapter 1
Introduction
1.1 Objectives and Scope of the Project
Agriculture is one of the most vital sectors of the Indian economy, providing employment and food
security to a large population. However, farmers often face challenges related to climate change,
soil fertility, and lack of data-driven decision-making. To overcome these challenges, there is a
need for intelligent systems that can assist in predicting crop yield and optimizing agricultural
resources.
The main objective of this project, “Predictive Model for Agriculture,” is to design and develop a
machine learning–based system that can predict crop yield based on soil and environmental
parameters such as rainfall, humidity, nitrogen, phosphorus, potassium, and soil pH.
By analyzing these parameters, the system aims to guide farmers in making informed decisions
related to fertilizer management, irrigation planning, and crop selection.
1. To develop a predictive model using Machine Learning (Random Forest Algorithm) for
crop yield estimation.
5. To demonstrate how Artificial Intelligence can support sustainable farming practices and
efficient resource utilization.
• Developing a web-based system that predicts crop yield using agricultural parameters.
• Providing feature importance visualization to show which inputs have the highest impact
on yield.
• Allowing farmers, students, and researchers to use the system for educational or practical
purposes.
• The system can be enhanced with IoT integration for real-time data collection and cloud
deployment for scalability.
• Future developments can include fertilizer recommendations, crop disease prediction, and
weather forecasting integration.
In conclusion, the scope of the project is to bridge the gap between technology and agriculture,
creating a smart and accessible tool that benefits both rural farmers and agricultural analysts.
Machine Learning (ML) is a branch of Artificial Intelligence that enables systems to learn from
data and improve performance without explicit programming. In the context of agriculture, ML
models analyze patterns within environmental and soil data to predict crop yield and other vital
outcomes.
The use of ML in agriculture has grown significantly due to its ability to handle large datasets and
uncover hidden patterns. Algorithms like Linear Regression, Random Forest, and Neural Networks
have been applied to predict yield, optimize irrigation schedules, and detect plant diseases.
The Random Forest Regressor, used in this project, is an ensemble learning algorithm that builds
multiple decision trees and averages their predictions to improve accuracy and reduce overfitting.
It is well suited to agricultural applications due to its robustness and ability to model complex,
non-linear relationships in the data.
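The ensemble idea described above (many trees, each trained on a bootstrap sample, with their predictions averaged) can be sketched in plain Python. This is a deliberately simplified illustration and not scikit-learn's RandomForestRegressor: each "tree" is a single-split stump, there is no random feature subsampling, and the function names and toy data are hypothetical.

```python
import random
from statistics import mean


def fit_stump(rows, targets):
    """Fit a one-split regression tree by minimising the squared error."""
    best = None
    for f in range(len(rows[0])):
        for threshold in sorted({r[f] for r in rows}):
            left = [t for r, t in zip(rows, targets) if r[f] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[f] > threshold]
            if not left or not right:
                continue  # a split must leave samples on both sides
            lm, rm = mean(left), mean(right)
            sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, f, threshold, lm, rm)
    if best is None:  # no valid split (all rows identical): predict the mean
        m = mean(targets)
        return (0, float("inf"), m, m)
    return best[1:]


def predict_stump(stump, row):
    f, threshold, left_mean, right_mean = stump
    return left_mean if row[f] <= threshold else right_mean


def fit_forest(rows, targets, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample of the data."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        forest.append(fit_stump([rows[i] for i in idx], [targets[i] for i in idx]))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all trees in the ensemble."""
    return mean([predict_stump(s, row) for s in forest])


# Hypothetical toy data: yield jumps once nitrogen reaches 50.
rows = [[n] for n in range(0, 100, 10)]
targets = [1.0 if r[0] < 50 else 5.0 for r in rows]
forest = fit_forest(rows, targets)
low = predict_forest(forest, [10])
high = predict_forest(forest, [80])
```

Because each stump sees a slightly different sample, individual trees may place the split differently; averaging smooths out those differences, which is the mechanism behind the reduced overfitting mentioned above.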
Feature Selection
Feature selection is a process of identifying the most relevant variables that influence the output.
In this project, the SelectKBest method is used to find which environmental and soil parameters
have the greatest impact on crop yield.
This step improves model performance and interpretability by focusing only on the key features
such as rainfall, nitrogen, and soil pH.
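As a rough illustration of univariate feature selection, the sketch below ranks features by the absolute Pearson correlation with the target and keeps the top k. It is a stand-in for SelectKBest, not the library call itself, and it assumes a correlation-based score (the report does not state which score function was used). The toy values and the helper names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(vx * vy)


def select_k_best(columns, target, k):
    """Return the k feature names with the largest |correlation|, plus all scores."""
    scores = {name: abs(pearson(vals, target)) for name, vals in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Hypothetical toy data, not taken from the project dataset.
columns = {
    "rainfall": [100, 150, 200, 250, 300],
    "nitrogen": [20, 40, 60, 80, 100],
    "humidity": [60, 55, 65, 58, 62],
}
crop_yield = [2.1, 3.9, 6.2, 7.8, 10.1]
selected, scores = select_k_best(columns, crop_yield, k=2)
```

A univariate score like this only captures linear, one-feature-at-a-time relationships, so it is best read as a quick screening step rather than a full account of feature influence.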
Web-Based Implementation
The predictive system is implemented using Python’s Flask framework, which connects the trained
ML model to a front-end web interface.
Users can enter their data into a web form, and the system instantly returns the predicted yield.
This approach makes the model accessible, interactive, and easy to use for farmers and agricultural
institutions.
Overall, the theoretical background of this project combines the fields of data science, machine
learning, and web development to create a practical, AI-driven agricultural tool.
Agriculture today faces increasing pressure due to unpredictable weather, limited resources, and
growing demand. Traditional methods of yield prediction rely on manual observation or outdated
data, which are often inaccurate and unreliable.
Farmers lack accessible, intelligent tools to help them estimate potential yield based on soil and
environmental conditions.
The problem this project addresses is the absence of a smart, automated system that can analyze
multiple agricultural parameters and provide accurate yield predictions. Without such a system:
The Predictive Model for Agriculture solves this problem by using supervised machine learning
techniques to predict yield, identify the most influential factors, and present results through a
simple web interface.
This approach empowers farmers with actionable insights, reduces uncertainty, and promotes
sustainable farming through the use of technology.
Chapter 2
2.1 SDLC Model Used
Software Development Life Cycle (SDLC) provides a systematic process to design, develop, and
test software.
For this project, the Waterfall Model was selected as the most appropriate SDLC approach.
The Waterfall Model is a linear and sequential model where each phase must be completed
before the next begins. It ensures proper documentation, disciplined development, and clear
deliverables, which makes it well suited to academic projects with well-defined requirements like this one.
1. Requirement Analysis:
2. System Design:
3. Implementation:
4. Testing:
o Perform unit, integration, and system testing to verify accuracy and usability.
5. Deployment:
o Prepare the model for cloud deployment on Heroku or AWS for scalability.
6. Maintenance:
The Waterfall model ensures that each stage of the Predictive Model for Agriculture is developed
in sequence, from requirement analysis through to deployment.
The Program Evaluation and Review Technique (PERT) is used to plan, schedule, and control
project activities.
It provides a graphical representation of tasks and their dependencies, ensuring that the project is
completed on schedule.
The PERT chart helps in identifying critical tasks, sequencing work, and estimating the total time
required.
1    Requirement Analysis    3    Understanding system goals and user needs.
The PERT chart helped in efficient scheduling and timely completion of each development
stage.
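For reference, PERT commonly estimates each activity's expected duration from three estimates using (optimistic + 4 × most likely + pessimistic) / 6. The sketch below applies that formula to a strictly sequential set of phases, where the total is simply the sum. All three-point estimates are hypothetical and are not taken from the project's PERT chart; the time unit is left unspecified.

```python
def pert_expected(optimistic, most_likely, pessimistic):
    """Expected duration of one activity using the standard PERT weighting."""
    return (optimistic + 4 * most_likely + pessimistic) / 6


# Hypothetical (optimistic, most likely, pessimistic) estimates per phase.
phases = [
    ("Requirement Analysis", 2, 3, 4),
    ("System Design", 3, 5, 7),
    ("Implementation", 6, 10, 20),
    ("Testing", 2, 4, 6),
]

expected = {name: pert_expected(o, m, p) for name, o, m, p in phases}

# In a sequential (Waterfall) plan every phase lies on the critical path,
# so the expected project duration is the sum of the phase durations.
total_expected = sum(expected.values())
```

Note how the pessimistic estimate for "Implementation" pulls its expected duration above the most likely value, which is the main reason to use three estimates instead of one.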
1. User Input:
Users can enter agricultural parameters such as rainfall (mm), humidity (%), nitrogen,
phosphorus, potassium (kg/ha), and soil pH into the system.
2. Data Validation:
The system checks that all entered values are numeric and within valid ranges before
processing.
5. Retraining Model:
Users can retrain the machine learning model using new datasets to improve accuracy.
6. Result Visualization:
The prediction result is displayed clearly on the web page, supported by small graphs or
bars indicating the importance of each feature.
7. Error Handling:
If any field is left blank or contains invalid input, the system will display an appropriate
error message.
8. Web Interface:
A clean, interactive web interface is created using Flask, HTML, and CSS, allowing
users to easily input data and view predictions.
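The data validation and error handling requirements above could be implemented along the following lines. This is a minimal sketch: the field names mirror the form inputs listed in this report, but the numeric ranges, the error messages, and the function name `validate_inputs` are assumptions made for illustration only.

```python
# Hypothetical valid ranges; the report does not state the actual limits.
VALID_RANGES = {
    "rainfall": (0.0, 5000.0),   # mm
    "humidity": (0.0, 100.0),    # %
    "nitrogen": (0.0, 500.0),    # kg/ha
    "phosphorus": (0.0, 500.0),  # kg/ha
    "potassium": (0.0, 500.0),   # kg/ha
    "ph": (0.0, 14.0),
}


def validate_inputs(form):
    """Return (values, errors) for a dict-like mapping of raw form strings."""
    values, errors = {}, {}
    for field, (low, high) in VALID_RANGES.items():
        raw = str(form.get(field, "")).strip()
        if not raw:
            errors[field] = "This field is required."
            continue
        try:
            number = float(raw)
        except ValueError:
            errors[field] = "Enter a numeric value."
            continue
        if not (low <= number <= high):  # this comparison also rejects NaN
            errors[field] = f"Value must be between {low} and {high}."
            continue
        values[field] = number
    return values, errors


ok_values, ok_errors = validate_inputs({
    "rainfall": "850", "humidity": "65", "nitrogen": "90",
    "phosphorus": "40", "potassium": "45", "ph": "6.5",
})
bad_values, bad_errors = validate_inputs({"rainfall": "abc", "humidity": "120", "ph": ""})
```

Returning all errors at once, keyed by field, lets the web page show a message next to each invalid input instead of stopping at the first problem.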
Non-functional requirements describe how the system performs rather than what it does.
They define quality attributes such as speed, usability, reliability, and scalability.
1. Performance:
The system should provide fast and accurate predictions (within 3–5 seconds per
request).
2. Usability:
The application must be simple, clean, and user-friendly, suitable for both students and
agricultural professionals.
3. Reliability:
The prediction results should remain consistent for the same input data, ensuring model
stability.
4. Scalability:
The system should support larger datasets and new crop types in the future without major
code changes.
5. Security:
Input data is handled securely; no sensitive information is stored or transmitted
externally.
6. Maintainability:
The project’s code structure is modular and well-documented, allowing easy updates,
debugging, and retraining.
7. Portability:
The web application should run smoothly on different systems (Windows 10/11) and
browsers (Chrome, Edge, Firefox).
8. Availability:
When hosted locally or on a cloud platform, the system should be available for use
anytime with minimal downtime.
9. Accuracy:
The predictive model must maintain an accuracy level of at least 85–90% for most test
cases.
Hardware Requirements
Component Specification
RAM Minimum 4 GB
Software Requirements
Software Description
Chapter 3
System Design
3.1 Block Diagram
The block diagram shows the overall flow of the system. The user enters inputs, the backend
processes them, the ML model predicts the yield, and the result is displayed.
3.3 DFD Level 0
DFD Level 0 shows the whole system as a single process interacting with the user.
DFD Level 1
DFD Level 1 breaks the system into smaller sub-processes: validation, preprocessing, prediction,
and output generation.
3.6 Use Case Diagram
The use case diagram shows how the user interacts with the system and what tasks they can
perform.
Chapter 4
Software Development
4.1 Data Collection
The system uses datasets containing parameters such as rainfall, humidity, nitrogen, phosphorus,
potassium, soil pH, and crop yield.
Since real agricultural data is not easily accessible for small-scale academic projects, both
synthetic data generation and public datasets were used.
1. Synthetic Dataset:
2. Public Datasets:
o Weather data from OpenWeatherMap API for rainfall and humidity patterns.
Each record represents environmental and soil conditions for a given crop cycle and its resulting
yield.
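One possible way to generate synthetic records of this shape is sketched below. The value ranges and, in particular, the rule that turns inputs into a yield are invented for illustration; they are not the generator used in the project and have no agronomic validity.

```python
import random


def make_synthetic_records(n, seed=42):
    """Generate n synthetic records with six input fields and a yield value."""
    rng = random.Random(seed)  # fixed seed so the dataset is reproducible
    records = []
    for _ in range(n):
        rec = {
            "rainfall": rng.uniform(200, 1200),   # mm (assumed range)
            "humidity": rng.uniform(30, 90),      # % (assumed range)
            "nitrogen": rng.uniform(0, 140),      # kg/ha (assumed range)
            "phosphorus": rng.uniform(5, 145),    # kg/ha (assumed range)
            "potassium": rng.uniform(5, 205),     # kg/ha (assumed range)
            "ph": rng.uniform(4.5, 8.5),          # assumed range
        }
        # Hypothetical rule: yield rises with nitrogen and rainfall, falls as
        # pH moves away from 6.5, and includes Gaussian noise.
        raw_yield = (
            0.5
            + 0.02 * rec["nitrogen"]
            + 0.003 * rec["rainfall"]
            - 0.4 * abs(rec["ph"] - 6.5)
            + rng.gauss(0, 0.2)
        )
        rec["yield"] = max(0.0, raw_yield)  # yield cannot be negative
        records.append(rec)
    return records


records = make_synthetic_records(50, seed=1)
```

A model trained on data like this can only rediscover the rule that generated it, so results on synthetic data should be read as a check of the pipeline, not as evidence about real crops.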
1. Data Cleaning:
3. Feature Scaling:
o Normalized the data using MinMaxScaler() from scikit-learn.
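o Puts features with very different ranges (e.g., rainfall vs. pH) on a common 0 to 1 scale so that
no feature dominates scale-sensitive steps. (Random Forest itself is largely insensitive to
feature scaling.)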
4. Feature Selection:
o The most impactful features identified were Nitrogen, Rainfall, and pH.
o The dataset was divided into 80% training and 20% testing sets using
train_test_split().
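To make the scaling and splitting steps concrete, the sketch below re-implements both in plain Python. It illustrates the arithmetic only: min-max scaling maps each value to (x - min) / (max - min), and the split shuffles the rows before holding out 20%. The helper names are hypothetical, and the project itself uses scikit-learn's MinMaxScaler and train_test_split.

```python
import random


def min_max_scale(column):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant column: map everything to 0
    return [(v - lo) / (hi - lo) for v in column]


def split_80_20(rows, test_fraction=0.2, seed=0):
    """Shuffle rows and return (train, test) lists."""
    rng = random.Random(seed)
    shuffled = rows[:]            # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


scaled_rainfall = min_max_scale([200, 700, 1200])
train, test = split_80_20(list(range(10)))
```

In practice the minimum and maximum are usually computed on the training set only and then reused for the test set, so that no information from the test data leaks into training.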
Input and Output screens define how the user interacts with the system.
This system provides a simple, interactive, and clean web interface created using Flask, HTML,
and CSS.
Description:
The input screen allows users to provide necessary agricultural parameters for prediction.
Each field has validation checks to ensure only valid numeric values are entered.
Input Fields:
• Rainfall (mm)
• Humidity (%)
• Nitrogen (kg/ha)
• Phosphorus (kg/ha)
• Potassium (kg/ha)
• Soil pH
</form>
Description:
Once the user submits inputs, Flask processes the data and displays the prediction.
The output includes the predicted crop yield and a bar graph showing feature importance
values.
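@app.route('/predict', methods=['POST'])
def predict():
    # Read each form field and convert it to a float.
    rainfall = float(request.form['rainfall'])
    humidity = float(request.form['humidity'])
    nitrogen = float(request.form['nitrogen'])
    phosphorus = float(request.form['phosphorus'])
    potassium = float(request.form['potassium'])
    ph = float(request.form['ph'])
    # Assumed feature order; it must match the column order used in training.
    features = [[rainfall, humidity, nitrogen, phosphorus, potassium, ph]]
    prediction = model.predict(features)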
RESULT EXAMPLE:
• Nitrogen: 0.40
• Rainfall: 0.32
• pH: 0.18
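As a small illustration of how such importance values can be turned into a bar display, the sketch below renders them as text bars sorted from most to least important. The function name and the bar width are hypothetical; the web page in this project shows a graphical bar chart instead.

```python
def importance_bars(importances, width=40):
    """Return one text line per feature, longest bar first."""
    lines = []
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    for name, score in ranked:
        bar = "#" * round(score * width)  # bar length proportional to the score
        lines.append(f"{name:<10} {score:.2f} {bar}")
    return lines


# Values from the result example above.
example = {"Nitrogen": 0.40, "Rainfall": 0.32, "pH": 0.18}
lines = importance_bars(example)
```

The three values listed sum to 0.90, so the remaining 0.10 would be spread across the features that are not shown in the example.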
Chapter-5
Testing
5.1 Types of Testing
Testing ensures that the system performs as intended and meets both functional and non-functional
requirements.
Different types of testing were performed to verify the correct operation of both the web
application and the machine learning model used in the Predictive Model for Agriculture.
• Example: Testing input validation functions, ensuring numeric values are entered, and
verifying Flask route functions.
o Frontend (HTML/CSS)
o Flask backend
• Example: Submitting values through the input form and checking if the ML model
processes and returns predictions correctly.
• Conducted on the complete system as a whole.
• Verified:
o Prediction accuracy
o Interface functionality
• Result: The system performed smoothly and delivered accurate results under different
input scenarios.
• Conducted after integrating the feature importance chart and retraining module.
• Ensured that previous working functions (form submission, output display) continued
working without errors.
• Verified whether non-technical users could understand and use the web app easily.
• Result: Users successfully performed predictions and understood the output with minimal
guidance.
• Measured system efficiency, including time to predict yield and response time of Flask
routes.
• Results: The model responded in under 3 seconds, and the web interface performed
smoothly without lag.
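Response time of this kind can be measured with a simple timing helper such as the one below. It is a sketch only: `fake_predict` is a hypothetical stand-in for the real model call, and the 3-second figure is the threshold quoted in this report.

```python
import time


def time_call(func, *args, repeats=5):
    """Call func(*args) several times and return (best, average) seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        durations.append(time.perf_counter() - start)
    return min(durations), sum(durations) / len(durations)


def fake_predict(features):
    """Hypothetical stand-in for model.predict; returns the mean of the inputs."""
    return [sum(features) / len(features)]


best, average = time_call(fake_predict, [850.0, 65.0, 90.0, 40.0, 45.0, 6.5])
within_target = average < 3.0  # threshold taken from the reported result
```

Repeating the call and reporting both the best and the average time makes the measurement less sensitive to one-off delays such as a cold start.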
Each functional block, such as the Input Module, Prediction Module, and Validation Module, was
tested independently using test datasets.
This ensured that every function operated correctly before integration.
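A unit test for one such function might look like the sketch below, written with Python's built-in unittest module. The helper `parse_ph` and its 0 to 14 range are hypothetical and stand in for whatever validation function the application actually uses.

```python
import io
import unittest


def parse_ph(raw):
    """Hypothetical helper: parse a soil pH string or raise ValueError."""
    value = float(raw)  # raises ValueError for non-numeric text
    if not 0.0 <= value <= 14.0:
        raise ValueError("pH must be between 0 and 14")
    return value


class ParsePhTests(unittest.TestCase):
    def test_valid_value(self):
        self.assertEqual(parse_ph("6.5"), 6.5)

    def test_non_numeric_value(self):
        with self.assertRaises(ValueError):
            parse_ph("acidic")

    def test_out_of_range_value(self):
        with self.assertRaises(ValueError):
            parse_ph("15")


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParsePhTests)
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite)


result = run_tests()
```

Each test covers one behaviour (a valid value, non-numeric text, an out-of-range value), so a failure points directly at the case that broke.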
After individual modules were validated, integration testing ensured smooth data exchange among
modules.
Data passed correctly from the HTML form to Flask backend and then to the ML model.
This level verified that the final product met all specified user requirements:
• User interface simple and visually clear.
The system met all acceptance criteria successfully.
Form testing was carried out to validate both input and output forms of the web application.
It ensures that user inputs are valid and outputs are displayed properly.
• Each field (rainfall, humidity, nitrogen, phosphorus, potassium, pH) was tested for:
• When invalid data was entered, the form displayed messages prompting the user to correct
the input.
5.3.2 Output Form Testing
To evaluate the accuracy and reliability of the Random Forest Regression model, various metrics
were used.
These evaluation criteria determine how well the system predicts yield and how close the predicted
results are to the actual values.
5.4.1 Confusion Matrix
Since the model performs regression, a confusion matrix (used in classification) is not applicable
here.
However, the equivalent concept of error distribution was analyzed — comparing predicted vs
actual yield values to measure variance.
Measures how well the regression model fits the actual data.
Result:
R² = 0.91 → The model explains 91% of the variance in the yield data, indicating a strong
fit.
Result:
Represents the square root of the mean of squared errors — penalizing larger errors more heavily.
Result:
RMSE = 0.18 → A small typical deviation between predicted and actual values.
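The three regression metrics used in this report can be computed directly from their definitions, as in the sketch below. The sample values are hypothetical and chosen only to make the arithmetic easy to follow; they are not the project's test data.

```python
from math import sqrt


def mae(y_true, y_pred):
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared error."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all actual values are equal")
    return 1 - ss_res / ss_tot


# Hypothetical actual and predicted yields.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.0, 7.5, 9.0]

mae_value = mae(actual, predicted)        # 0.25
rmse_value = rmse(actual, predicted)      # about 0.354
r2_value = r2_score(actual, predicted)    # 0.975
```

RMSE is never smaller than MAE for the same predictions, and the gap between them grows when a few errors are much larger than the rest; the reported pair (MAE = 0.12, RMSE = 0.18) is consistent with that relationship.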
Chapter-6
CONCLUSION and FUTURE WORK
6.1 Conclusion
The project “Predictive Model for Agriculture” was developed with the objective of assisting
farmers and agricultural researchers in predicting crop yield using soil and environmental data.
This system applies the concepts of Supervised Machine Learning, specifically the Random
Forest Regression algorithm, to analyze the relationships between multiple agricultural
parameters such as rainfall, humidity, nitrogen, phosphorus, potassium, and soil pH.
The model was successfully implemented using Python, Flask, and HTML/CSS.
The web-based interface allows users to input key agricultural factors and instantly receive the
predicted yield along with the feature importance visualization.
This enables better understanding of which variables most strongly influence productivity, such as
nutrient composition or environmental factors.
Through systematic development phases (requirement analysis, data collection, preprocessing,
model training, and evaluation), the project achieved its goal of creating a practical,
data-driven prediction system that bridges the gap between technology and agriculture.
4. Performance Optimization:
Used normalization and feature scaling to improve training accuracy and minimize model
bias.
5. Error Analysis:
Achieved a low Mean Absolute Error (MAE = 0.12) and Root Mean Square Error (RMSE
= 0.18), indicating reliability and precision.
The system effectively analyzes complex agricultural data and provides accurate crop yield
predictions.
It demonstrates the capability of data science and machine learning in solving real-world
problems in the agricultural sector.
It also aids agricultural research institutions by providing a foundational model that can be
expanded for large-scale prediction and forecasting.
1. Yield Prediction:
Predicting the amount of yield expected from given environmental and soil conditions.
3. Fertilizer Management:
Helping determine the right fertilizer composition based on feature influence results.
1. High Accuracy:
Machine learning-based approach provides reliable yield estimates.
2. Cost-Effective Solution:
Utilizes open-source tools and does not require expensive sensors or manual data
collection.
3. User-Friendly Interface:
Farmers can easily input data without technical expertise.
4. Data-Driven Insight:
Feature importance charts help interpret key yield factors scientifically.
6.1.5 Limitations
1. Static Dataset:
The current system uses a pre-collected dataset and lacks real-time data integration.
3. Limited Attributes:
Environmental factors like temperature, wind speed, sunlight, and pest impact are not yet
included.
4. Deployment Restriction:
The project currently runs on localhost; cloud deployment and large-scale access are yet to be
implemented.
The Predictive Model for Agriculture has great potential for future improvements and can be
expanded into a full-fledged intelligent agricultural system.
Below are some of the major areas where the project can be extended and enhanced in future
iterations.
The model can be integrated with IoT devices and live weather APIs to collect real-time
environmental parameters.
This would help provide live yield predictions and adaptive recommendations.
The current version uses Random Forest Regressor, which performs efficiently, but future
enhancements can explore more sophisticated algorithms such as:
• XGBoost / LightGBM
A mobile-based version of this system can make it easily accessible to rural farmers.
The app could include voice input in local languages and simplified prediction output for ease of
use.
This would transform the model from a predictive system into a decision-support system (DSS)
for farmers.
6.2.6 Cloud Deployment and Scalability
Deploying the project on cloud platforms like AWS, Azure, or Google Cloud would ensure:
• Remote accessibility
• Continuous availability
• Multi-user access
Additionally, continuous retraining with new data can improve the accuracy and adaptability of
the model.
Future work can also include creating interactive dashboards using tools like Power BI or
Tableau to visualize:
This would make the project more useful for agricultural research institutions.
Integrating the system with Geographic Information Systems (GIS) can provide spatial
visualization of yield predictions.
This will help in mapping soil health and crop productivity across different locations, assisting in
regional planning and sustainable agriculture.
6.2.9 AI-Driven Advisory System
An AI-powered chatbot can be integrated into the platform to provide personalized agricultural
advice to farmers based on their soil data and region.
For example, it could suggest irrigation schedules, pest management tips, or fertilizer usage.
To make the system more accessible in rural areas, future versions should include multilingual
support for regional languages such as Hindi, Punjabi, Tamil, and Telugu.
This would allow farmers from different states to use the platform comfortably.
CODING AND SCREENSHOTS
APP.PY
HTML CODE
INPUT:
OUTPUT
REFERENCES
1. Géron, A. (2019). Hands-On Machine Learning with Scikit-Learn & TensorFlow.
– Used to understand machine learning algorithms and model building.