Day 5. Product Price Prediction Using DataRobot
NO CODE AI/ML
CHALLENGE
DAY 5
TASK 1
PROJECT CARD
AND DEMO
DAY 5
EASY ADVANCED
PROJECT CARD
GOAL:
• Build, train, test and deploy a machine learning model
to predict used car prices based on their features
TOOL:
• DataRobot
DATA:
• INPUTS:
o Make, Model, Type, Origin, Drivetrain, Invoice,
EngineSize, Cylinders, Horsepower, MPG_City,
MPG_Highway, Weight, Wheelbase, and Length
• OUTPUT:
o MSRP (Price)
Image Source: https://www.flickr.com/photos/pasa/6757993805
Dataset Source: https://www.kaggle.com/ljanjughazyan/cars1
PROJECT DEMO
TASK 2
SUCCESS
STORIES
SUCCESS STORIES
• Price prediction of products and services is critical for any company to
maximize revenues and reduce costs.
• Fareboom.com is an innovative tool that leverages machine learning to
predict flight prices. The tool has been developed by AltexSoft.
• The fare forecast feature has been developed to help users make better
purchasing decisions.
• The tool can guide customers to select the best time to purchase a flight.
• The tool is built on a self-learning machine learning algorithm that can
predict future price movements while taking into account historical data,
airline deals, demand, and seasonal effects.
• Great case studies: https://www.altexsoft.com/case-studies/
• Fare price prediction tool:
https://www.altexsoft.com/case-studies/travel/altexsoft-creates-unique-data-science-and-analytics-based-fare-predictor-tool-to-forecast-price-movements/
Source: https://www.altexsoft.com/blog/datascience/data-science-and-ai-in-the-travel-industry-9-real-life-use-cases/
READING TIME & QUIZ: AI/ML APPLICATIONS
IN PRICE FORECASTING
• Please read the article below and answer the following quiz.
o Link to Article:
https://www.altexsoft.com/blog/business/price-forecasting-machine-learning-based-approaches-applied-to-electricity-flights-hotels-real-estate-and-stock-pricing/
10 MINS
5 MINS
TASK 3
DATA
EXPLORATION
INPUTS AND OUTPUTS
INPUTS: MAKE, MODEL, TYPE, ORIGIN, DRIVETRAIN, ENGINESIZE, CYLINDERS, HORSEPOWER, MPG CITY, MPG HIGHWAY, WEIGHT, WHEELBASE, LENGTH
ML MODEL
OUTPUT: VEHICLE PRICE (MSRP)
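The input/output layout above can be sketched in a few lines of Python. This is only an illustration of the idea, not something the no-code workflow requires. The column names come from the project card; the one-row sample, the helper names, and the assumption that MSRP may carry "$" and "," formatting are hypothetical.

```python
import csv
import io

# Column names as listed on the project card.
INPUT_COLUMNS = [
    "Make", "Model", "Type", "Origin", "Drivetrain", "Invoice",
    "EngineSize", "Cylinders", "Horsepower", "MPG_City",
    "MPG_Highway", "Weight", "Wheelbase", "Length",
]
TARGET_COLUMN = "MSRP"


def parse_price(text):
    """Convert a price such as '$36,945' or '36945' to a float (assumed formats)."""
    return float(text.replace("$", "").replace(",", ""))


def split_inputs_and_target(csv_file):
    """Read rows from an open CSV file and separate the inputs from the target."""
    inputs, targets = [], []
    for row in csv.DictReader(csv_file):
        inputs.append({name: row[name] for name in INPUT_COLUMNS})
        targets.append(parse_price(row[TARGET_COLUMN]))
    return inputs, targets


# Made-up one-row sample standing in for the real data file.
sample = io.StringIO(
    ",".join(INPUT_COLUMNS + [TARGET_COLUMN]) + "\n"
    "ExampleMake,ExampleModel,Sedan,Asia,Front,30000,3.5,6,225,18,24,3880,115,197,33000\n"
)
inputs, targets = split_inputs_and_target(sample)
print(len(inputs[0]), "input columns; target values:", targets)
```

To try it on a real file, pass an open file handle instead of the in-memory sample.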
DATA OVERVIEW
DATAROBOT DEMO
• DataRobot is an end-to-end enterprise AI platform that automates
the process of building, training and deploying AI models at scale.
ENTER YOUR INFORMATION.
YOU’LL RECEIVE A CONFIRMATION E-MAIL TO START YOUR 14-DAY TRIAL.
FILL OUT YOUR ROLE, INDUSTRY AND THEME PREFERENCE.
SELECT CREATE AI MODELS.
AFTER YOU’VE SIGNED UP, CLICK ON THE ML DEVELOPMENT TILE.
CLICK ON LOCAL FILE TO UPLOAD THE DATA.
UPLOAD THE “USED_VEHICLE_PRICES.CSV” FILE TO DATAROBOT.
ONCE YOU’VE UPLOADED YOUR LOCAL FILE, YOU WILL BE ABLE TO SEE THIS SCREEN.
AFTER SCROLLING THROUGH, YOU’LL FIND SOME DATA QUALITY ALERTS.
SINCE THEY ARE NOT SIGNIFICANT AT THIS TIME, WE WILL IGNORE THEM.
THE WARNING MESSAGES GENERALLY CONCERN OUTLIERS.
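To make the idea of an outlier concrete, here is a minimal sketch of one common rule of thumb, the interquartile-range (IQR) rule. It is a stand-in for illustration only; DataRobot runs its own data quality checks, which may work differently. The horsepower values are made up.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the middle 50% of the data."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Made-up horsepower column with one unusually large value.
horsepower = [150, 160, 165, 170, 175, 180, 185, 190, 200, 500]
outliers = iqr_outliers(horsepower)
print("flagged as outliers:", outliers)
```

A flagged value is not necessarily an error; it is simply far from the bulk of the data and worth a second look.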
TASK 5
DATAROBOT DEMO:
DATA ANALYSIS
TO SELECT THE TARGET, MOVE THE CURSOR NEAR THE REQUIRED COLUMN AND CLICK
ON ‘MAKE AS TARGET’. YOU CAN FURTHER EXPLORE EACH FEATURE BY CLICKING ON
EACH COLUMN NAME.
AFTER EXPLORING THE FEATURES, YOU CAN FIND THE “ADVANCED OPTION” BELOW
THE START BUTTON. THERE YOU CAN SPECIFY THE NUMBER OF FOLDS FOR
VALIDATION AND CHOOSE THE HOLDOUT PERCENTAGE.
• The data set is generally divided into 80% for training and 20% for
testing.
• Sometimes we include a validation dataset as well and divide the data
into 60%, 20%, and 20% segments for training, validation, and
testing, respectively (numbers may vary).
o 1. Training set (60%): used for gradient calculation and weight
updates.
o 2. Validation set (20%): used for cross-validation to assess training
quality as training proceeds. Cross-validation is implemented
to overcome overfitting, which occurs when the algorithm focuses
on training set details at the cost of losing generalization ability.
o 3. Testing set (20%, holdout dataset): used for testing the final trained
model.
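The 60/20/20 split and the validation folds described above can be sketched as follows. This is a minimal illustration under assumed defaults (a fixed random seed, 5 folds), not a description of how DataRobot partitions data internally. The function names are hypothetical.

```python
import random


def train_validation_test_split(rows, validation_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle the rows and cut them into training, validation, and testing parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    n_validation = int(len(shuffled) * validation_fraction)
    test = shuffled[:n_test]
    validation = shuffled[n_test:n_test + n_validation]
    training = shuffled[n_test + n_validation:]
    return training, validation, test


def k_fold_indices(n_rows, n_folds):
    """Yield (training_indices, validation_indices) once per fold."""
    indices = list(range(n_rows))
    for fold in range(n_folds):
        validation = indices[fold::n_folds]
        held_out = set(validation)
        training = [i for i in indices if i not in held_out]
        yield training, validation


rows = list(range(100))  # stand-in for 100 data rows
training, validation, test = train_validation_test_split(rows)
folds = list(k_fold_indices(len(training), n_folds=5))
print(len(training), len(validation), len(test), "rows;", len(folds), "folds")
```

With k folds, every training row is used for validation exactly once, which gives a steadier estimate of model quality than a single validation cut.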
TASK 6
DATAROBOT DEMO:
MODEL TRAINING
NOW, YOU CAN SEE THE PROGRESS OF YOUR TRAINING IN THE SIDE BAR.
WHILE THE TRAINING IS IN PROGRESS, YOU CAN SEE THE ASSOCIATION
BETWEEN FEATURES BY CLICKING ON THE FEATURE ASSOCIATION BOX.
BY CLICKING ON FEATURE ASSOCIATION PAIRS, YOU CAN SEE A
MUCH CLEARER ASSOCIATION BETWEEN TWO FEATURES.
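One simple way to quantify the association between two numeric features is the Pearson correlation coefficient, sketched below. It is used here as a plain stand-in; the association measure DataRobot displays may be a different metric. The feature values are made up.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation: +1 is a perfect rising line, -1 a perfect falling line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Made-up values for three features of six cars.
engine_size = [1.6, 2.0, 2.4, 3.0, 3.5, 4.2]
horsepower = [110, 140, 165, 210, 250, 300]
mpg_city = [30, 27, 24, 20, 18, 15]

size_vs_power = pearson_correlation(engine_size, horsepower)
size_vs_mpg = pearson_correlation(engine_size, mpg_city)
print(round(size_vs_power, 3), round(size_vs_mpg, 3))
```

In this toy data, larger engines go with more horsepower (positive association) and lower city mileage (negative association).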
TASK 7
DATAROBOT DEMO:
MODEL ASSESSMENT
ONCE THE TRAINING IS COMPLETE, YOU CAN VIEW THE PERFORMANCE OF
DIFFERENT MODELS BY CLICKING ON MODELS AND THEN METRICS.
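Regression models such as this price predictor are usually compared with error metrics. The sketch below computes three common ones (MAE, RMSE, and R²) by hand; which metrics appear on the leaderboard depends on the project settings, and the prices used here are made up.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average size of the prediction error, in the same units as the price."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Like MAE, but large errors are penalized more heavily."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """Share of the price variation explained by the model (1.0 is a perfect fit)."""
    mean_actual = sum(actual) / len(actual)
    residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    total = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - residual / total


# Made-up actual and predicted prices for four cars.
actual = [20000, 30000, 40000, 50000]
predicted = [22000, 29000, 41000, 46000]

mae = mean_absolute_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
r2 = r_squared(actual, predicted)
print(mae, round(rmse, 1), round(r2, 3))
```

Lower MAE and RMSE are better; a higher R² is better.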
YOU CAN VIEW THE ARCHITECTURE USED BY THE MODEL UNDER BLUEPRINT.
FOR EXPLAINABILITY PURPOSES, CLICK ON FEATURE IMPACT.
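A common way to estimate how much a model relies on each feature is permutation importance: shuffle one column, re-score the model, and see how much the error grows. The sketch below illustrates that general idea under stated assumptions; it is not a description of how DataRobot computes Feature Impact. The toy model, its coefficients, and the data are hypothetical.

```python
import random


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def permutation_importance(predict, rows, actual, feature_names, seed=0):
    """Increase in MAE when each feature's column is shuffled on its own."""
    rng = random.Random(seed)
    baseline = mean_absolute_error(actual, [predict(row) for row in rows])
    importance = {}
    for name in feature_names:
        column = [row[name] for row in rows]
        rng.shuffle(column)
        permuted_rows = [{**row, name: value} for row, value in zip(rows, column)]
        permuted_error = mean_absolute_error(actual, [predict(row) for row in permuted_rows])
        importance[name] = permuted_error - baseline
    return importance


def toy_model(row):
    # Hypothetical model: the price depends on engine size only and ignores length.
    return 5000.0 + 8000.0 * row["EngineSize"]


engine_sizes = [1.6, 2.0, 2.4, 3.0, 3.5, 4.2, 5.0, 6.0]
lengths = [170, 175, 180, 185, 190, 195, 200, 205]
rows = [{"EngineSize": e, "Length": l} for e, l in zip(engine_sizes, lengths)]
actual = [toy_model(row) for row in rows]  # the toy model is perfect on its own data

importance = permutation_importance(toy_model, rows, actual, ["EngineSize", "Length"])
print(importance)
```

Shuffling a feature the model ignores leaves the error unchanged, so its importance is zero; shuffling a feature the model depends on makes the predictions worse.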
CLICK ON FEATURE EFFECTS.
ONCE YOU CLICK THE EVALUATION TAB, UNDER THE ABOVE
MODEL, YOU CAN SEE DIFFERENT WAYS TO EVALUATE THE MODEL.
TASK 8
DATAROBOT DEMO:
MODEL DEPLOYMENT
TO DEPLOY THE MODEL, CLICK ON THE “DEPLOY AUTOMODEL” OPTION AT THE TOP.
THEN CLICK ON “AUTOMODEL”.
AFTER DEPLOYING THE MODEL, CLICK ON THE CREATE APPLICATION OPTION.
CLICK THE PREDICTOR OPTION AND THEN DEPLOY IT.
SELECT “MODEL FROM DEPLOYMENT” AND CLICK “LAUNCH”.
CLICK “OPEN”.
YOU WILL ARRIVE AT THIS PAGE AFTER DEPLOYING THE PREDICTOR.
SELECT VALUES AND CLICK ON “GET PREDICTION”.
TASK 9
TECHNICALITIES
TECHNICALITIES
[Figure: training dataset (collected by car dealership); example record with ENGINE SIZE, 8 CYLINDERS, and an MSRP price of $40,000]
TECHNICALITIES: SIMPLE LINEAR REGRESSION
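• The goal is to obtain a relationship (model) between two variables only,
for example vehicle price and engine size.
$$y = b + m x$$
where y is the vehicle MSRP ($), x is the engine size, b is the intercept, and m is the slope.
[Figure: VEHICLE MSRP ($) versus ENGINE SIZE, with the fitted line labeled as the model (goal)]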
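As a rough illustration of simple linear regression, the sketch below fits the line y = b + m * x with the standard least-squares formulas. The engine sizes and prices are made up and chosen so that the answer is easy to check by eye; real data will not fall exactly on a line.

```python
def fit_simple_linear_regression(xs, ys):
    """Least-squares estimates of the intercept b and slope m in y = b + m * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    m = covariance / spread_x
    b = mean_y - m * mean_x
    return b, m


# Made-up data lying exactly on the line MSRP = 5000 + 10000 * EngineSize.
engine_size = [1.0, 2.0, 3.0, 4.0]
msrp = [15000.0, 25000.0, 35000.0, 45000.0]

b, m = fit_simple_linear_regression(engine_size, msrp)
predicted_price = b + m * 2.5  # price estimate for a 2.5-litre engine
print(b, m, predicted_price)
```

The slope m says how much the predicted price changes per unit of engine size; the intercept b is the predicted price when x is zero.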
TECHNICALITIES: MULTIPLE LINEAR REGRESSION
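• Multiple linear regression examines the relationship between more than two
variables.
• Recall that simple linear regression is a statistical model that examines the linear
relationship between two variables only.
• Each independent variable has its own corresponding coefficient:
$$y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$$
• The distance between the actual and estimated values is
$$d_i = y_i - \hat{y}_i$$
and the coefficients are chosen to minimize the sum of squared distances:
$$\min \sum_i (\hat{y}_i - y_i)^2$$
[Figure: Price versus EngineSize, showing the actual values, the estimated line, and the distance d between them]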
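The least-squares objective above can be solved directly for small problems. The sketch below builds the normal equations and solves them with Gaussian elimination, using only the standard library; in practice a numerical library would normally do this step. The data are made up and generated from a known formula so the recovered coefficients can be checked.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x


def fit_multiple_linear_regression(features, targets):
    """Return [b0, b1, ..., bn] minimizing the sum of squared errors."""
    design = [[1.0] + list(row) for row in features]  # leading 1.0 carries b0
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up data generated from price = 5000 + 6000 * EngineSize + 1500 * Cylinders.
features = [(2.0, 4), (3.0, 6), (3.5, 6), (5.0, 8), (2.5, 4)]
targets = [23000.0, 32000.0, 35000.0, 47000.0, 26000.0]

coefficients = fit_multiple_linear_regression(features, targets)
print([round(c, 2) for c in coefficients])
```

Each recovered coefficient is the change in predicted price per unit change in that feature, holding the other features fixed.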
TECHNICALITIES: SIMPLE LINEAR REGRESSION:
ADDITIONAL READING MATERIAL
TASK 10
FINAL PROJECT - PART
A
FINAL PROJECT
• Using DataRobot, select the best model that has been recommended for
deployment and tune its hyperparameters.
• Set the upper and lower bounds of a model hyperparameter such as
“learning rate”.
• Train the model and assess its performance compared to other models
on the model leaderboard.
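The idea of tuning a hyperparameter between a lower and an upper bound can be illustrated with a small search over the learning rate. The sketch below is a toy under stated assumptions: a hand-written gradient-descent line fit, five candidate rates spaced evenly on a log scale, and made-up data. It is not the search procedure DataRobot uses.

```python
def fit_line_by_gradient_descent(xs, ys, learning_rate, n_steps=200):
    """Fit y = b + m * x by gradient descent on the mean squared error."""
    b, m = 0.0, 0.0
    n = len(xs)
    for _ in range(n_steps):
        errors = [(b + m * x) - y for x, y in zip(xs, ys)]
        b -= learning_rate * 2.0 * sum(errors) / n
        m -= learning_rate * 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
    return b, m


def mean_squared_error(xs, ys, b, m):
    errors = [(b + m * x) - y for x, y in zip(xs, ys)]
    return sum(e * e for e in errors) / len(errors)


def candidate_rates(lower, upper, n_candidates):
    """Learning rates spaced evenly on a log scale between the two bounds."""
    ratio = upper / lower
    return [lower * ratio ** (i / (n_candidates - 1)) for i in range(n_candidates)]


def search_learning_rate(train, validation, lower, upper, n_candidates=5):
    """Train once per candidate rate and keep the one with the lowest validation error."""
    results = []
    for rate in candidate_rates(lower, upper, n_candidates):
        b, m = fit_line_by_gradient_descent(train[0], train[1], rate)
        results.append((mean_squared_error(validation[0], validation[1], b, m), rate))
    return min(results)


# Made-up data on the line y = 1 + 2 * x, split into training and validation parts.
train = ([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
validation = ([5.0, 6.0], [11.0, 13.0])

best_error, best_rate = search_learning_rate(train, validation, lower=0.001, upper=0.1)
print("best learning rate:", best_rate, "validation MSE:", best_error)
```

The tuned setting is judged on validation data rather than training data, which mirrors comparing the tuned model against the others on the leaderboard.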
FINAL PROJECT
YOU CAN TUNE THE HYPERPARAMETERS BY CLICKING ON THE
“ADVANCED TUNING” OPTION. YOU CAN SPECIFY THE VALUES OF THE
PREDICTION PARAMETERS.
ONCE YOU MAKE CHANGES TO A PARAMETER, CLICK ON
UPDATE PARAMETER.
CLICK ON “BEGIN TUNING” TO START
HYPERPARAMETER TUNING.
EVEN AFTER PARAMETER TUNING, THE INITIAL MODEL SEEMS TO BE
PERFORMING WELL, SO WE ARE STICKING WITH THE ORIGINAL MODEL.
TASK 11
FINAL PROJECT – PART
B
FINAL PROJECT