Day 5. Product Price Prediction Using DataRobot

Uploaded by

somali channeL

10 DAYS NO CODE AI/ML CHALLENGE

DAY 5
TASK 1
PROJECT CARD AND DEMO

DAY 5
EASY ADVANCED
PROJECT CARD
GOAL:
• Build, train, test and deploy a machine learning model to predict used car prices based on their features.

TOOL:
• DataRobot

PRACTICAL REAL-WORLD APPLICATION:
• Car dealerships can use this project to predict used car prices and to understand the key factors that drive them.

DATA:
• INPUTS:
o Make, Model, Type, Origin, Drivetrain, Invoice, EngineSize, Cylinders, Horsepower, MPG_City, MPG_Highway, Weight, Wheelbase, and Length

• OUTPUT:
o MSRP (Price)

Image Source: https://www.flickr.com/photos/pasa/6757993805
Dataset Source: https://www.kaggle.com/ljanjughazyan/cars1
PROJECT DEMO


TASK 2
SUCCESS STORIES

SUCCESS STORIES
• Price prediction for products and services is critical for any company to maximize revenues and reduce costs.
• Fareboom.com is an innovative tool that leverages machine learning to predict flight prices. The tool has been developed by AltexSoft.
• The fare forecast feature has been developed to help users make better purchasing decisions.
• The tool can guide customers to select the best time to purchase a flight.
• The tool is built on a self-learning machine learning algorithm that can predict future price movements while taking into account historical data, airline deals, demand, and seasonal effects.
• Great case studies: https://www.altexsoft.com/case-studies/
• Fare price prediction tool: https://www.altexsoft.com/case-studies/travel/altexsoft-creates-unique-data-science-and-analytics-based-fare-predictor-tool-to-forecast-price-movements/

Source: https://www.altexsoft.com/blog/datascience/data-science-and-ai-in-the-travel-industry-9-real-life-use-cases/
READING TIME & QUIZ: AI/ML APPLICATIONS IN PRICE FORECASTING
• Please read the article below and answer the following quiz.
o Link to Article: https://www.altexsoft.com/blog/business/price-forecasting-machine-learning-based-approaches-applied-to-electricity-flights-hotels-real-estate-and-stock-pricing/

10 MINS

5 MINS

TASK 3
DATA EXPLORATION

INPUTS AND OUTPUTS

• INPUTS: MAKE, MODEL, TYPE, ORIGIN, DRIVETRAIN, ENGINESIZE, CYLINDERS, HORSEPOWER, MPG CITY, MPG HIGHWAY, WEIGHT, WHEELBASE, LENGTH
• ML MODEL
• OUTPUT: VEHICLE PRICE (MSRP)

DATA OVERVIEW

MODEL OUTPUT: MSRP (MANUFACTURER'S SUGGESTED RETAIL PRICE)

TASK 4
DATAROBOT DEMO: DATA UPLOAD

DATAROBOT DEMO

• DataRobot is an end-to-end enterprise AI platform that automates the process of building, training and deploying AI models at scale.

GO TO https://www.datarobot.com/ AND CLICK ON FREE TRIAL.

ENTER YOUR INFORMATION.

YOU’LL RECEIVE A CONFIRMATION E-MAIL TO START YOUR 14-DAY TRIAL.

FILL OUT YOUR ROLE, INDUSTRY AND THEME PREFERENCE.

SELECT CREATE AI MODELS.

AFTER YOU’VE SIGNED UP, CLICK ON THE ML DEVELOPMENT TILE.

CLICK ON LOCAL FILE TO UPLOAD THE DATA.

UPLOAD THE “USED_VEHICLE_PRICES.CSV” FILE TO DATAROBOT.

ONCE YOU’VE UPLOADED YOUR LOCAL FILE, YOU WILL BE ABLE TO SEE THIS SCREEN.

AFTER SCROLLING THROUGH, YOU’LL FIND SOME DATA QUALITY ALERTS. SINCE THEY ARE NOT SIGNIFICANT AT THIS TIME, WE WILL IGNORE THEM.

THE WARNING MESSAGES GENERALLY FLAG OUTLIERS.

TASK 5
DATAROBOT DEMO: DATA ANALYSIS

DATAROBOT DEMO

TO SELECT THE TARGET, MOVE THE CURSOR NEAR THE REQUIRED COLUMN AND CLICK ON ‘MAKE AS TARGET’. YOU CAN FURTHER EXPLORE EACH FEATURE BY CLICKING ON ITS COLUMN NAME.

AFTER EXPLORING THE FEATURES, YOU CAN FIND THE “ADVANCED OPTION” BELOW THE START BUTTON. THERE YOU CAN SPECIFY THE NUMBER OF FOLDS FOR VALIDATION AND CHOOSE THE HOLDOUT PERCENTAGE.

Additional reading materials: https://www.datarobot.com/wiki/training-validation-holdout/
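
For readers who want to see what "number of folds" and "holdout percentage" mean in practice, here is a minimal Python sketch. It is not how DataRobot implements partitioning; the function name `holdout_and_folds` and the values (5 folds, 20% holdout, 100 rows) are assumptions chosen only for illustration.

```python
import random


def holdout_and_folds(n_rows, n_folds=5, holdout_pct=20, seed=0):
    """Split row indices into one holdout set and n_folds validation folds."""
    indices = list(range(n_rows))
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(indices)

    # Set the holdout rows aside first; they are never used for training.
    n_holdout = n_rows * holdout_pct // 100
    holdout = indices[:n_holdout]
    remaining = indices[n_holdout:]

    # Deal the remaining rows round-robin so fold sizes differ by at most one.
    folds = [remaining[i::n_folds] for i in range(n_folds)]
    return holdout, folds


# Hypothetical dataset size, for illustration only.
holdout, folds = holdout_and_folds(n_rows=100)
print(len(holdout), [len(fold) for fold in folds])
```

Each fold takes a turn as the validation set while the other folds are used for training; the holdout rows are only used at the very end.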


DATA SPLIT INTO TRAINING AND TESTING

• The data set is generally divided into 80% for training and 20% for testing.
• Sometimes a validation dataset is included as well, and the data is then divided into 60%, 20%, 20% segments for training, validation, and testing, respectively (numbers may vary).
o 1. Training set: used for gradient calculation and weight update.
o 2. Validation set: used for cross-validation to assess training quality as training proceeds. Cross-validation is implemented to overcome over-fitting, which occurs when the algorithm focuses on training set details at the cost of losing generalization ability.
o 3. Testing set (holdout dataset): used for testing the final trained model.

[Figure: TRAINING DATASET 60%, VALIDATION DATASET 20%, TESTING DATASET 20%]
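
The 60% / 20% / 20% split described above can be sketched in a few lines of Python. This is only an illustration under assumed values: the function name `train_val_test_split`, the seed, and the 50 placeholder rows are not part of the course material.

```python
import random


def train_val_test_split(rows, train_pct=60, val_pct=20, seed=42):
    """Shuffle rows and cut them into training, validation and testing sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)

    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever is left is the testing set
    return train, val, test


# 50 placeholder row ids standing in for the rows of a dataset.
train, val, test = train_val_test_split(range(50))
print(len(train), len(val), len(test))
```

Shuffling before cutting matters: if the file is sorted (for example by Make), an unshuffled split would put whole brands into only one of the three sets.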

TASK 6
DATAROBOT DEMO: MODEL TRAINING

DATAROBOT DEMO

NOW CLICK ON THE START BUTTON TO BEGIN YOUR TRAINING.

NOW, YOU CAN SEE THE PROGRESS OF YOUR TRAINING IN THE SIDEBAR.

WHILE THE TRAINING IS IN PROGRESS, YOU CAN SEE THE ASSOCIATION BETWEEN FEATURES BY CLICKING ON THE FEATURE ASSOCIATION BOX.

BY CLICKING ON FEATURE ASSOCIATION PAIRS, YOU CAN SEE A MUCH CLEARER ASSOCIATION BETWEEN TWO FEATURES.
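
To get a feel for what an association between two features means, the sketch below computes the Pearson correlation coefficient by hand. This is a simple stand-in and an assumption on our part: DataRobot's Feature Association view may use a different measure. The numbers are made up for illustration and are not rows from the course dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: +1 strong positive, -1 strong negative, 0 none."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up illustrative values, NOT taken from the used vehicle dataset.
engine_size = [1.6, 2.0, 2.4, 3.0, 3.5]
horsepower = [110, 140, 160, 210, 250]
mpg_city = [30, 27, 24, 20, 18]

r_hp = pearson(engine_size, horsepower)   # larger engines, more horsepower
r_mpg = pearson(engine_size, mpg_city)    # larger engines, fewer miles per gallon
print(round(r_hp, 3), round(r_mpg, 3))
```

A value close to +1 or -1 signals that two features carry much of the same information, which is what the pairs view lets you inspect visually.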

TASK 7
DATAROBOT DEMO: MODEL ASSESSMENT

DATAROBOT DEMO

ONCE THE TRAINING IS COMPLETE, YOU CAN VIEW THE PERFORMANCE OF DIFFERENT MODELS BY CLICKING ON MODELS AND THEN METRICS.
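
The sketch below shows how three common regression metrics (MAE, RMSE and R squared) are computed from actual and predicted prices. Which metrics appear on your leaderboard depends on the project settings, so treat this selection as an assumption; the prices are invented for illustration.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, RMSE and R squared for paired actual/predicted values."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]

    mae = sum(abs(e) for e in errors) / n             # average absolute error
    rmse = math.sqrt(sum(e * e for e in errors) / n)  # penalizes large errors more

    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                   # unexplained variation
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total variation
    r2 = 1 - ss_res / ss_tot
    return {"MAE": mae, "RMSE": rmse, "R2": r2}


# Invented prices in dollars, for illustration only.
actual = [20000, 30000, 40000, 50000]
predicted = [22000, 29000, 41000, 48000]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

Lower MAE and RMSE are better; R squared closer to 1 means the model explains more of the variation in price.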

YOU CAN VIEW THE ARCHITECTURE USED BY THE MODEL UNDER BLUEPRINT.

FOR EXPLAINABILITY PURPOSES, CLICK ON FEATURE IMPACT.

CLICK ON FEATURE EFFECTS.

ONCE YOU CLICK THE EVALUATION TAB UNDER THE ABOVE MODEL, YOU CAN SEE DIFFERENT WAYS TO EVALUATE THE MODEL.

TASK 8
DATAROBOT DEMO: MODEL DEPLOYMENT

DATAROBOT DEMO

TO DEPLOY THE MODEL, CLICK ON THE “DEPLOY AUTOMODEL” OPTION AT THE TOP. THEN CLICK ON “AUTOMODEL”.

AFTER DEPLOYING THE MODEL, CLICK ON THE CREATE APPLICATION OPTION.

CLICK THE PREDICTOR OPTION AND THEN DEPLOY IT.

SELECT “MODEL FROM DEPLOYMENT” AND CLICK “LAUNCH”.

CLICK “OPEN”.

YOU WILL ARRIVE AT THIS PAGE AFTER DEPLOYING THE PREDICTOR.

SELECT VALUES AND CLICK ON “GET PREDICTION”.

TASK 9
TECHNICALITIES

TECHNICALITIES

[Figure: ENGINE SIZE (e.g. 8 CYLINDERS) → TRAINED MACHINE LEARNING MODEL → VEHICLE MSRP PRICE ($) (e.g. $40,000); scatter plot of the TRAINING DATASET (COLLECTED BY CAR DEALERSHIP), MSRP PRICE vs. ENGINE SIZE]
TECHNICALITIES: SIMPLE LINEAR REGRESSION
• Goal is to obtain a relationship (model) between two variables only, such as vehicle price and engine size.

$$y = b + m \cdot x$$

• DEPENDENT VARIABLE $y$: VEHICLE MSRP ($)
• INDEPENDENT VARIABLE $x$: ENGINE SIZE

[Figure: MODEL! (GOAL), a line fitted through VEHICLE MSRP ($) plotted against ENGINE SIZE]
TECHNICALITIES: MULTIPLE LINEAR REGRESSION

• Multiple linear regression examines the relationship between more than two variables.
• Recall that simple linear regression is a statistical model that examines the linear relationship between two variables only.
• Each independent variable has its own corresponding coefficient.

$$y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$$

• DEPENDENT VARIABLE $y$: VEHICLE MSRP ($)
• INDEPENDENT VARIABLES $x_1, \dots, x_n$: ENGINE SIZE, MPG CITY, HORSEPOWER
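
As a small worked example, the multiple linear regression equation can be evaluated for one vehicle in Python. The coefficients below are hypothetical numbers picked so the arithmetic is easy to follow; they are not fitted to the dataset, and `predict_msrp` is just an illustrative name.

```python
def predict_msrp(features, coefficients, intercept):
    """Evaluate y = b0 + b1*x1 + b2*x2 + ... + bn*xn for one vehicle."""
    if len(features) != len(coefficients):
        raise ValueError("each feature needs exactly one coefficient")
    return intercept + sum(b * x for b, x in zip(coefficients, features))


# Hypothetical coefficients, chosen for illustration only (not fitted values).
b0 = 5000.0                   # intercept
b = [3000.0, -200.0, 100.0]   # engine size, MPG city, horsepower
x = [3.0, 20.0, 200.0]        # one vehicle: 3.0 L engine, 20 MPG city, 200 hp

# 5000 + 3000*3.0 - 200*20 + 100*200 = 30000
price = predict_msrp(x, b, b0)
print(price)
```

Note how a negative coefficient (here on MPG city) lowers the predicted price as that feature increases, while positive coefficients raise it.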

TECHNICALITIES: HOW TO OBTAIN MODEL PARAMETERS? LEAST SUM OF SQUARES
• Least squares fitting is a way to find the best-fit curve or line for a set of points.
• The sum of the squares of the offsets (residuals) is used to estimate the best-fit curve or line.
• The least squares method is used to obtain the coefficients m and b.

$$d_i = y_i - \hat{y}_i$$

where $y_i$ is the actual value and $\hat{y}_i$ is the estimated value.

MINIMUM (LEAST) SUM OF SQUARES:

$$\min \sum_i (\hat{y}_i - y_i)^2$$

[Figure: Price vs. EngineSize, showing the offset d between an actual point and the fitted line]
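
For a single input, the least squares coefficients m and b have a standard closed-form solution, sketched below in Python. The engine sizes and prices are invented for illustration, and the helper names `fit_line` and `sum_of_squares` are our own.

```python
def fit_line(xs, ys):
    """Return the slope m and intercept b that minimize the sum of squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    m = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    b = mean_y - m * mean_x
    return m, b


def sum_of_squares(xs, ys, m, b):
    """Sum of squared offsets between estimated (b + m*x) and actual y."""
    return sum(((b + m * x) - y) ** 2 for x, y in zip(xs, ys))


# Invented data points, for illustration only.
engine_size = [2.0, 3.0, 4.0, 5.0]
price = [20000.0, 29000.0, 41000.0, 50000.0]

m, b = fit_line(engine_size, price)
best = sum_of_squares(engine_size, price, m, b)
print(m, b, best)
```

Nudging m or b away from the fitted values always increases the sum of squares, which is exactly what "least" means here.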

TECHNICALITIES: SIMPLE LINEAR REGRESSION: ADDITIONAL READING MATERIAL

• Additional Resources, Page #123: http://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/understanding-machine-learning-theory-algorithms.pdf
• Additional Resources, Page #61: http://www-bcf.usc.edu/~gareth/ISL/ISLR%20Seventh%20Printing.pdf
TECHNICALITIES: MINI CHALLENGE

• Match the equations to the figures below and explain why:

$$y = 4x$$

$$y = 20 - 5x$$

[Figures: two plots of y against x]

TASK 10
FINAL PROJECT - PART A

FINAL PROJECT

• Using DataRobot, select the best model that has been recommended for deployment and tune its hyperparameters.
• Set the upper and lower bounds of a model hyperparameter such as “learning rate”.
• Train the model and assess its performance compared to other models on the model leaderboard.
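
To illustrate what tuning a hyperparameter between a lower and an upper bound involves, the sketch below tries several learning rates on a tiny gradient descent model and keeps the one with the lowest error. This is a stand-in for the idea only, not DataRobot's Advanced Tuning; the data, the bounds (0.001 to 0.08), and the function names are all assumptions made for illustration.

```python
def train_linear_model(xs, ys, learning_rate, steps=200):
    """Fit y = b + m*x by gradient descent; return (m, b, mean squared error)."""
    m, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        residuals = [(b + m * x) - y for x, y in zip(xs, ys)]
        grad_m = 2 * sum(r * x for r, x in zip(residuals, xs)) / n
        grad_b = 2 * sum(residuals) / n
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b
    mse = sum(((b + m * x) - y) ** 2 for x, y in zip(xs, ys)) / n
    return m, b, mse


def learning_rate_grid(lower, upper, count):
    """Return `count` learning rates spaced evenly on a log scale."""
    ratio = upper / lower
    return [lower * ratio ** (i / (count - 1)) for i in range(count)]


# Toy data following y = 1 + 2x, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

# Train once per candidate learning rate and record the error of each run.
results = {lr: train_linear_model(xs, ys, lr)[2]
           for lr in learning_rate_grid(lower=0.001, upper=0.08, count=5)}
best_lr = min(results, key=results.get)
print(best_lr, results[best_lr])
```

In a real project the candidates would be compared on validation data rather than on the training data, so that the chosen value generalizes.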

YOU CAN TUNE THE HYPERPARAMETERS BY CLICKING ON THE “ADVANCED TUNING” OPTION. YOU CAN SPECIFY THE VALUES OF THE PREDICTION PARAMETER.

ONCE YOU MAKE CHANGES TO THE PARAMETER, CLICK ON UPDATE PARAMETER.

CLICK ON “BEGIN TUNING” TO START HYPERPARAMETER TUNING.

EVEN AFTER PARAMETER TUNING, THE INITIAL MODEL SEEMS TO BE PERFORMING WELL, SO WE ARE STICKING WITH THE ORIGINAL MODEL.

TASK 11
FINAL PROJECT - PART B

FINAL PROJECT

• Upload the attached insurance premium prediction dataset, “final_project_insurance_data.csv”, to DataRobot.
• Perform exploratory data analysis.
• Train multiple models in DataRobot.
• Assess trained model performance.
• Deploy the best model and perform inference.
