Linear Regression Test

The document details a data preprocessing and analysis workflow for a dataset on food delivery times, including data import, cleaning, and visualization. It employs linear regression to model the relationship between delivery distance and delivery time, achieving a moderate R-squared value of approximately 0.65. The analysis includes handling missing values, encoding categorical variables, and evaluating model performance using metrics such as mean squared error and mean absolute error.

Uploaded by

thayu5105

Data preprocessing

In [11]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error,r2_score
f=pd.read_csv("Food_Delivery_Times.csv")
f

Out[11]:
     Order_ID  Distance_km  Weather  Traffic_Level  Time_of_Day  Vehicle_Type  Preparation_Time_min  Courier_Experience_yrs  Delivery_Time_min
0         522         7.93    Windy            Low    Afternoon       Scooter                    12                     1.0                 43
1         738        16.42    Clear         Medium      Evening          Bike                    20                     2.0                 84
2         741         9.52    Foggy            Low        Night       Scooter                    28                     1.0                 59
3         661         7.44    Rainy         Medium    Afternoon       Scooter                     5                     1.0                 37
4         412        19.03    Clear            Low      Morning          Bike                    16                     5.0                 68
...       ...          ...      ...            ...          ...           ...                   ...                     ...                ...
995       107         8.50    Clear           High      Evening           Car                    13                     3.0                 54
996       271        16.28    Rainy            Low      Morning       Scooter                     8                     9.0                 71
997       861        15.62    Snowy           High      Evening       Scooter                    26                     2.0                 81
998       436        14.17    Clear            Low    Afternoon          Bike                     8                     0.0                 55
999       103         6.63    Foggy            Low        Night       Scooter                    24                     3.0                 58

1000 rows × 9 columns

In [12]:
f.head()

Out[12]:
     Order_ID  Distance_km  Weather  Traffic_Level  Time_of_Day  Vehicle_Type  Preparation_Time_min  Courier_Experience_yrs  Delivery_Time_min
0         522         7.93    Windy            Low    Afternoon       Scooter                    12                     1.0                 43
1         738        16.42    Clear         Medium      Evening          Bike                    20                     2.0                 84
2         741         9.52    Foggy            Low        Night       Scooter                    28                     1.0                 59
3         661         7.44    Rainy         Medium    Afternoon       Scooter                     5                     1.0                 37
4         412        19.03    Clear            Low      Morning          Bike                    16                     5.0                 68

In [13]:
f.describe()

Out[13]:
          Order_ID  Distance_km  Preparation_Time_min  Courier_Experience_yrs  Delivery_Time_min
count  1000.000000  1000.000000           1000.000000              970.000000        1000.000000
mean    500.500000    10.059970             16.982000                4.579381          56.732000
std     288.819436     5.696656              7.204553                2.914394          22.070915
min       1.000000     0.590000              5.000000                0.000000           8.000000
25%     250.750000     5.105000             11.000000                2.000000          41.000000
50%     500.500000    10.190000             17.000000                5.000000          55.500000
75%     750.250000    15.017500             23.000000                7.000000          71.000000
max    1000.000000    19.990000             29.000000                9.000000         153.000000

In [25]:
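# Executed after the encoding (In [22]) and imputation (In [24]) cells shown further down,
# so every count is zero; f.info() (In [17]) shows the state before those steps.
f.isnull().sum()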

Out[25]: Order_ID 0
Distance_km 0
Weather 0
Traffic_Level 0
Time_of_Day 0
Vehicle_Type 0
Preparation_Time_min 0
Courier_Experience_yrs 0
Delivery_Time_min 0
dtype: int64

In [15]:
f.duplicated().sum()

Out[15]: 0

In [16]:
f.shape

Out[16]: (1000, 9)

In [17]:
f.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 9 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Order_ID 1000 non-null int64
1 Distance_km 1000 non-null float64
2 Weather 970 non-null object
3 Traffic_Level 970 non-null object
4 Time_of_Day 970 non-null object
5 Vehicle_Type 1000 non-null object
6 Preparation_Time_min 1000 non-null int64
7 Courier_Experience_yrs 970 non-null float64
8 Delivery_Time_min 1000 non-null int64
dtypes: float64(2), int64(3), object(4)
memory usage: 70.4+ KB
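
The `f.info()` output above reports 970 non-null entries for four columns, that is, 30 missing values in each. The following is a minimal sketch of one way to count and fill such gaps. It uses a small hypothetical frame named `demo` rather than the real CSV, and filling the categorical column with its mode is an assumption for illustration, not a step taken in this notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the delivery data (not the real file)
demo = pd.DataFrame({
    "Weather": ["Clear", np.nan, "Rainy", "Clear"],
    "Courier_Experience_yrs": [1.0, 2.0, np.nan, 3.0],
})

# Count missing values per column before any imputation
missing_before = demo.isnull().sum()

# Categorical column: fill with the most frequent label (assumed strategy)
demo["Weather"] = demo["Weather"].fillna(demo["Weather"].mode()[0])

# Numeric column: fill with the column mean, as the notebook does
demo["Courier_Experience_yrs"] = demo["Courier_Experience_yrs"].fillna(
    demo["Courier_Experience_yrs"].mean()
)

missing_after = demo.isnull().sum()
print(missing_before)
print(missing_after)
```

Counting before and after makes it easy to confirm that imputation actually ran before any downstream step that cannot handle missing values.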

In [22]:
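from sklearn.preprocessing import LabelEncoder

# Encode each categorical column as integer codes, using a fresh encoder per column.
# LabelEncoder assigns codes in sorted label order, so the codes carry no ordinal meaning.
# Weather, Traffic_Level and Time_of_Day still contain missing values here (see f.info());
# the encoder does not impute them.
for col in ["Weather", "Traffic_Level", "Time_of_Day", "Vehicle_Type"]:
    f[col] = LabelEncoder().fit_transform(f[col])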
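
`LabelEncoder` numbers classes in sorted order, so for `Traffic_Level` the labels would be coded alphabetically rather than by severity. The sketch below contrasts that with an explicit ordinal mapping. The ordering Low < Medium < High and the names `traffic` and `traffic_order` are assumptions made for illustration; the notebook itself uses `LabelEncoder`.

```python
import pandas as pd

# Hypothetical sample of the Traffic_Level column
traffic = pd.Series(["Low", "Medium", "High", "Low"])

# Codes in sorted label order, which is how LabelEncoder numbers its classes
alphabetical = {label: i for i, label in enumerate(sorted(traffic.unique()))}

# Assumed ordinal mapping that preserves the natural order of the levels
traffic_order = {"Low": 0, "Medium": 1, "High": 2}
traffic_encoded = traffic.map(traffic_order)

print(alphabetical)
print(traffic_encoded.tolist())
```

With the explicit mapping, a larger code always means heavier traffic, which makes a linear coefficient or a correlation on that column easier to interpret.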

In [24]:
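# Fill missing courier experience with the column mean.
# Assigning the result back avoids the chained inplace=True call, which newer pandas versions warn about.
f["Courier_Experience_yrs"] = f["Courier_Experience_yrs"].fillna(f["Courier_Experience_yrs"].mean())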

Visualization
In [40]:
sns.scatterplot(x="Distance_km", y="Delivery_Time_min", data=f)
plt.title("Scatter plot of distance vs delivery time")
plt.xlabel("DISTANCE (KM)")
plt.ylabel("DELIVERY TIME (MIN)")
plt.show()

In [26]:
correlation=f.corr()
print("correlation Matrix:")
print(correlation)

correlation Matrix:
                         Order_ID  Distance_km    Weather  Traffic_Level  \
Order_ID                 1.000000    -0.024483  -0.035785      -0.050845
Distance_km             -0.024483     1.000000   0.029756      -0.036602
Weather                 -0.035785     0.029756   1.000000      -0.031301
Traffic_Level           -0.050845    -0.036602  -0.031301       1.000000
Time_of_Day             -0.027034     0.009034   0.006595       0.022550
Vehicle_Type            -0.045030     0.003319  -0.019231       0.032593
Preparation_Time_min    -0.035100    -0.009037  -0.039429       0.004945
Courier_Experience_yrs   0.012933    -0.007713   0.037972      -0.037431
Delivery_Time_min       -0.036650     0.780998   0.110254      -0.087523

                        Time_of_Day  Vehicle_Type  Preparation_Time_min  \
Order_ID                  -0.027034     -0.045030             -0.035100
Distance_km                0.009034      0.003319             -0.009037
Weather                    0.006595     -0.019231             -0.039429
Traffic_Level              0.022550      0.032593              0.004945
Time_of_Day                1.000000     -0.054988              0.004867
Vehicle_Type              -0.054988      1.000000              0.020707
Preparation_Time_min       0.004867      0.020707              1.000000
Courier_Experience_yrs    -0.057384     -0.002504             -0.030353
Delivery_Time_min          0.025133     -0.006629              0.307350

                        Courier_Experience_yrs  Delivery_Time_min
Order_ID                              0.012933          -0.036650
Distance_km                          -0.007713           0.780998
Weather                               0.037972           0.110254
Traffic_Level                        -0.037431          -0.087523
Time_of_Day                          -0.057384           0.025133
Vehicle_Type                         -0.002504          -0.006629
Preparation_Time_min                 -0.030353           0.307350
Courier_Experience_yrs                1.000000          -0.089066
Delivery_Time_min                    -0.089066           1.000000
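
The correlation matrix is easier to read when only the target column is kept and sorted by strength. The sketch below does that with the `Delivery_Time_min` values copied from the output above; the names `corr_with_target` and `ranked` are illustrative. Correlations for the label-encoded columns should be read with caution, since their integer codes are arbitrary.

```python
# Correlations with Delivery_Time_min, copied from the matrix above
corr_with_target = {
    "Order_ID": -0.036650,
    "Distance_km": 0.780998,
    "Weather": 0.110254,
    "Traffic_Level": -0.087523,
    "Time_of_Day": 0.025133,
    "Vehicle_Type": -0.006629,
    "Preparation_Time_min": 0.307350,
    "Courier_Experience_yrs": -0.089066,
}

# Rank features by absolute correlation, strongest first
ranked = sorted(corr_with_target.items(), key=lambda kv: abs(kv[1]), reverse=True)
for name, r in ranked:
    print(f"{name:<24}{r:+.3f}")

# For a one-feature linear fit scored on the same data, R^2 equals r^2
r_distance = corr_with_target["Distance_km"]
print(f"r^2 for Distance_km: {r_distance ** 2:.3f}")  # 0.610
```

`Distance_km` stands out as the strongest single predictor, followed at some distance by `Preparation_Time_min`, which is consistent with using distance alone for a simple linear regression.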

In [28]:
x=f[["Distance_km"]]
y=f["Delivery_Time_min"]
x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.2,random_state=42)

In [29]:
model=LinearRegression()
model.fit(x_train,y_train)

Out[29]: LinearRegression()

In [30]:
print("\nModel coefficients:")
print(f"Intercept: {model.intercept_}")
print(f"Slope: {model.coef_[0]}")

Model coefficients:
Intercept: 26.585748176869693
Slope: 3.0164695806793005
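
The fitted line can be read as: about 26.6 minutes of baseline time plus about 3.0 minutes for each additional kilometre. As a quick worked example, the sketch below applies the intercept and slope printed above by hand; the function name `predict_delivery_time` is hypothetical and simply mirrors what `model.predict` computes for a single feature.

```python
# Coefficients copied from the fitted model's printed output
intercept = 26.585748176869693
slope = 3.0164695806793005


def predict_delivery_time(distance_km: float) -> float:
    """Return the predicted delivery time in minutes for a given distance."""
    return intercept + slope * distance_km


estimate_10km = predict_delivery_time(10.0)
print(f"Predicted time for 10 km: {estimate_10km:.2f} min")  # 56.75 min
```

A 10 km order is predicted at roughly 56.75 minutes, close to the dataset's mean delivery time of 56.73 minutes at a mean distance of about 10.06 km.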

In [31]:
y_pred=model.predict(x_test)

Evaluation metrics and interpretation

In [38]:
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

mse = mean_squared_error(y_test, y_pred)

r2 = r2_score(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
print("\nModel Evaluation:")
print(f"Mean Squared Error (MSE): {mse}")
print(f"R-squared (R2): {r2}")
print(f"Mean Absolute Error (MAE): {mae}")

Model Evaluation:
Mean Squared Error (MSE): 158.16196727280166
R-squared (R2): 0.6471386683659509
Mean Absolute Error (MAE): 9.580954148917813
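
MSE is expressed in squared minutes, which is hard to relate to the target. Taking its square root gives the root mean squared error (RMSE) in minutes, directly comparable with the MAE. The sketch below derives it from the values printed above; RMSE is an addition here and is not computed in the notebook.

```python
import math

# Metrics copied from the evaluation output above
mse = 158.16196727280166
mae = 9.580954148917813

# RMSE is the square root of MSE, in the same unit as the target (minutes)
rmse = math.sqrt(mse)
print(f"RMSE: {rmse:.2f} min")  # 12.58 min
print(f"MAE:  {mae:.2f} min")   # 9.58 min
```

RMSE is never smaller than MAE, and a noticeable gap between them, as here, suggests that a few orders have large errors.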

INTERPRETING R2
In [42]:
# Check thresholds from highest to lowest so that every branch is reachable.
if r2 == 1:
    print("THE MODEL PERFECTLY EXPLAINS THE VARIANCE IN DELIVERY TIME")
elif r2 > 0.8:
    print("THE MODEL EXPLAINS A STRONG PROPORTION OF THE VARIANCE IN DELIVERY TIME")
elif r2 > 0.5:
    print("THE MODEL EXPLAINS A MODERATE PROPORTION OF THE VARIANCE IN DELIVERY TIME")
else:
    print("THE MODEL EXPLAINS A WEAK PROPORTION OF THE VARIANCE IN DELIVERY TIME")

THE MODEL EXPLAINS A MODERATE PROPORTION OF THE VARIANCE IN DELIVERY TIME

In [33]:
plt.scatter(x, y, color="blue", label="Data Points")
plt.plot(x, model.predict(x), color="red", label="Regression Line")
plt.title("Simple Linear Regression: Distance vs Delivery Time")
plt.xlabel("Distance (km)")
plt.ylabel("Delivery Time (min)")
plt.legend()
plt.show()
