Exp_6-Model Development_sdk_ok
Setup
A model helps us understand the relationship between variables and how those variables are used to predict a result, here the price of a car.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.linear_model import LinearRegression
OUTPUT:
array([16236.50464347, 16236.50464347, 17058.23802179, 13771.3045085 ,
20345.17153508])
lm.intercept_
OUTPUT :
38423.305858157386
lm.coef_
OUTPUT :
array([-821.73337832])
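The intercept and coefficient above define the fitted line price = intercept + coefficient * x. As a small worked check, the sketch below plugs the two reported values into that formula. The highway-mpg inputs (27, 26, 30, 22) are an assumption inferred from the arithmetic, since the source does not show the input column; with them the formula reproduces the predicted values listed earlier.

```python
# Coefficients copied from the lm.intercept_ and lm.coef_ outputs above
INTERCEPT = 38423.305858157386
SLOPE = -821.73337832


def predict_price(highway_mpg):
    """Return the price predicted by the fitted line: intercept + slope * x."""
    return INTERCEPT + SLOPE * highway_mpg


# Input values are assumed (inferred from the predictions, not given in the source)
for mpg in (27, 26, 30, 22):
    print(mpg, round(predict_price(mpg), 2))
```

The negative slope means that each additional mile per gallon lowers the predicted price by roughly 822.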
lm.fit(Z, df['price'])
OUTPUT :
LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None,
normalize=False)
lm.intercept_
OUTPUT :
-15806.624626329198
lm.coef_
OUTPUT :
array([53.49574423, 4.70770099, 81
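With several predictors in Z, the model has one intercept and one coefficient per column. The sketch below illustrates this on synthetic data, not on the car dataset. It uses NumPy least squares in place of scikit-learn's LinearRegression, which minimizes the same squared error; the names design, true_coef, and true_intercept are hypothetical and exist only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for Z (three predictors); values are made up
Z = rng.normal(size=(200, 3))
true_coef = np.array([2.0, -1.0, 0.5])
true_intercept = 4.0
y = true_intercept + Z @ true_coef  # noise-free, so the fit should be exact

# Prepend a column of ones so the intercept is estimated with the slopes
design = np.column_stack([np.ones(len(Z)), Z])
solution, *_ = np.linalg.lstsq(design, y, rcond=None)
intercept, coef = solution[0], solution[1:]

print(intercept)  # plays the role of lm.intercept_
print(coef)       # plays the role of lm.coef_, one value per column of Z
```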
EXAMPLE NO : 01
Regression Plot
# Figure size in inches, shared by the plots in this section
width = 12
height = 10
plt.figure(figsize=(width, height))
sns.regplot(x="peak-rpm", y="price", data=df)
plt.ylim(0,)
OUTPUT :
(0, 47422.919330307624)
df[["peak-rpm","highway-mpg","price"]].corr()
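By default, DataFrame.corr() reports the Pearson correlation coefficient for each pair of columns. The following is a minimal pure-Python sketch of that coefficient for a single pair. The sample values are made up for illustration and are not taken from the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical sample: higher mileage paired with lower price
mpg = [27, 26, 30, 22, 25]
price = [16000, 17000, 13500, 20500, 18000]
r = pearson(mpg, price)
print(r)
```

A value near -1 or 1 indicates a strong linear relationship, while a value near 0 indicates a weak one.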
Residual Plot
width = 12
height = 10
plt.figure(figsize=(width, height))
sns.residplot(x=df['highway-mpg'], y=df['price'])
plt.show()
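A residual plot shows, for each observation, the difference between the actual value and the value predicted by a straight-line fit. If the residuals scatter randomly around zero, a linear model is reasonable; a curved pattern suggests a non-linear model may fit better. The sketch below computes the residuals by hand for a small made-up, deliberately curved sample. It illustrates the idea and is not a reproduction of what sns.residplot does internally.

```python
import numpy as np

# Hypothetical data with visible curvature (not from the car dataset)
x = np.array([20.0, 25.0, 30.0, 35.0, 40.0])
y = np.array([30000.0, 22000.0, 17000.0, 14000.0, 13000.0])

# Fit a straight line, then subtract its predictions from the observations
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (intercept + slope * x)

print(residuals)
```

Here the residuals are positive at both ends and negative in the middle, which is the kind of systematic pattern that argues against a purely linear fit.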
Multiple Linear Regression
EXAMPLE NO : 03
Y_hat = lm.predict(Z)
plt.figure(figsize=(width, height))
plt.show()
plt.close()
EXAMPLE NO : 04
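def PlotPolly(model, independent_variable, dependent_variable, Name):
    # Evaluate the fitted polynomial on an evenly spaced grid
    x_new = np.linspace(15, 55, 100)
    y_new = model(x_new)
    # Plot the observed points and the fitted curve
    plt.plot(independent_variable, dependent_variable, '.', x_new, y_new, '-')
    plt.xlabel(Name)
    plt.ylabel('Price of Cars')
    plt.show()
    plt.close()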
x = df['highway-mpg']
y = df['price']
f = np.polyfit(x, y, 3)
p = np.poly1d(f)
print(p)
PlotPolly(p, x, y, 'highway-mpg')
OUTPUT:
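The call np.polyfit(x, y, 3) returns the four coefficients of a least-squares cubic, highest power first, and np.poly1d wraps them in a callable polynomial. The sketch below shows the same two steps on synthetic data generated from a known cubic. The coefficients are made up, so the fit can be checked against them.

```python
import numpy as np

# Synthetic inputs over the same range used by PlotPolly
x = np.linspace(15, 55, 50)
# Known cubic (made-up coefficients) so the fit can be verified
y = 0.5 * x**3 - 2.0 * x**2 + 3.0 * x + 10.0

f = np.polyfit(x, y, 3)  # coefficients, highest power first
p = np.poly1d(f)         # callable polynomial built from them

print(p)
print(p(30.0))           # evaluate the fitted polynomial at a single point
```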
4) Measures for In-Sample Evaluation
# highway_mpg_fit: fit price on highway-mpg and report R^2
lm.fit(X, Y)
print('The R-square is:', lm.score(X, Y))
OUTPUT:
The R-square is: 0.4965911884339175
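R-squared is the share of the variance in the target that the model explains: 1 minus the ratio of the residual sum of squares to the total sum of squares. The value above therefore says that the highway-mpg line explains about 49.7% of the variation in price. Below is a minimal sketch of the calculation on toy numbers; the function name r_squared and the sample values are illustrative only.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot


y_true = [3.0, 5.0, 7.0, 9.0]

# Perfect predictions leave no residual error
print(r_squared(y_true, [3.0, 5.0, 7.0, 9.0]))

# Always predicting the mean explains none of the variance
print(r_squared(y_true, [6.0, 6.0, 6.0, 6.0]))
```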
plt.plot(new_input, yhat)
plt.show()
Conclusions on Model Development:
Comparing the three models (simple linear regression, multiple linear regression, and the polynomial fit), we conclude that the MLR model is the best of the three for predicting price from our dataset. This result makes sense: the dataset has 27 variables in total, and more than one of them is a potential predictor of the final car price.