Oracle AutoML

Uploaded by

techiesid02
Copyright
© All Rights Reserved
• Oracle Cloud Infrastructure

Oracle AutoML
[Figure: Machine Learning Life Cycle: AutoML. Stages shown: Business Problem; Data access; Data exploration and preparation; Modeling; Validation; Deployment; Monitoring, refresh, and retirement. The diagram also carries an "AutoML" label.]

AutoML: What and Why
• Building a successful machine learning model requires many iterations and a lot of experimentation.

• Developers rarely arrive at a model with an optimal set of hyperparameters in the first iteration, which creates an opportunity for ML automation.

• AutoML combines the processes of choosing and refining models and tuning their parameters. This optimizes the outcome of the learning.

[Figure: cycle of three steps: Model Selection, Hyperparameter Tuning, Model Assessment]

AutoML Approaches

• Bayesian Optimization: A probabilistic model captures different hyperparameter configurations and their performance.

• Recommender System: The system maintains a record of the best configuration found for each data set it has previously encountered.

• Genetic Programming: A technique of evolving programs, starting from a population of unfit (usually random) programs, into programs fit for a particular task by applying operations

Oracle Automated Machine Learning (Oracle AutoML)
• AutoML solution from Oracle
• Non-iterative and faster
• Leverages metalearning
• Avoids cold-start problems

Benefits
• Automates many routine but time-consuming steps, and increases data scientists’ productivity
• Automates the process of feature selection, model/algorithm selection, and hyperparameter tuning
• Reduces the overall compute time required to deliver machine learning models

Oracle AutoML Workflow
• Selects a model from a large number of viable candidate models
• Tunes the hyperparameters for each model
• Selects predictive features to speed up the pipeline and reduce overfitting
• Ensures the model trained is generalized and works for unseen data

[Figure: workflow from Data Scientist and Data set (Features > Labels) to Tuned Model]

AutoML Pipeline

Data set → Algorithm Selection → Adaptive Sampling → Feature Selection → Hyperparameter Tuning → Tuned Model

• Algorithm Selection: Identify the best algorithms for the data and problem; faster than exhaustive search.
• Adaptive Sampling: Identify the right sample size and adjust for unbalanced data.
• Feature Selection: De-noise the data and reduce the number of features.
• Hyperparameter Tuning: Auto tune hyperparameters for the best model accuracy.

Algorithm Selection
• The algorithm that yields the maximum score is identified.
• Algorithms are ranked based on predicted scores.
• Automated algorithm selection uses metalearning.
• Top K Algorithms: The algorithms with the highest scores are later used for model tuning.

How Algorithms Are Selected

New Data set → Extract Dataset Characteristic → Invoke Score Prediction Models → Rank Algorithms Based on Predicted Scores → Top K Algorithms

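The selection flow above can be sketched in a few lines of Python. This is a minimal illustration, not Oracle's implementation: the function names, the two data set characteristics, and the three score predictors are all hypothetical stand-ins for the metalearned score prediction models the slide mentions.

```python
from typing import Callable, Dict, List, Sequence, Tuple

MetaFeatures = Dict[str, float]
ScorePredictor = Callable[[MetaFeatures], float]


def extract_characteristics(rows: Sequence[Sequence[float]]) -> MetaFeatures:
    """Summarize the data set (only row and column counts in this sketch)."""
    n_cols = len(rows[0]) if rows else 0
    return {"n_rows": float(len(rows)), "n_cols": float(n_cols)}


def top_k_algorithms(
    meta: MetaFeatures, predictors: Dict[str, ScorePredictor], k: int
) -> List[Tuple[str, float]]:
    """Predict a score per algorithm, rank by that score, keep the top K."""
    scored = [(name, predict(meta)) for name, predict in predictors.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical score predictors (assumption): in a real system these would be
# models trained on many earlier data sets, not hand-written formulas.
PREDICTORS: Dict[str, ScorePredictor] = {
    "naive_bayes": lambda m: 0.60,
    "linear_model": lambda m: 0.70 - 0.001 * m["n_cols"],
    "tree_ensemble": lambda m: 0.65 + 0.0001 * m["n_rows"],
}

data_set = [[0.0] * 10 for _ in range(1000)]  # 1000 rows, 10 columns
meta = extract_characteristics(data_set)
ranking = top_k_algorithms(meta, PREDICTORS, k=2)
print(ranking)
```

No candidate algorithm is trained in this step. Only the cheap score predictors are invoked, which is why ranking by predicted score can be faster than an exhaustive search.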
Adaptive Sampling
• Identifies the right sampling percentage
• Speeds up model building
• Detects unbalanced data sets that can cause poor models

[Figure: Adaptive Sampling stage, labeled "Identify Optimized Sample"]

How Is Adaptive Sampling Done?

New Data set → Identify Optimized Sample → Reduced Data set

[Figure: the "Identify Optimized Sample" step sits in a loop with "ML Algorithms" and "Measure Model Score", labeled "Optimize Until Convergence Is Achieved"]

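One way to read the loop above is as a search for the smallest sample whose model score has stopped improving. The sketch below follows that reading under stated assumptions: the doubling schedule, the tolerance, the starting size, and the toy learning curve are all illustrative choices, and it does not cover the detection of unbalanced data.

```python
import math
from typing import Callable, List, Tuple


def adaptive_sample_size(
    n_total: int,
    score_fn: Callable[[int], float],
    start: int = 1000,
    growth: float = 2.0,
    tol: float = 0.005,
) -> Tuple[int, List[Tuple[int, float]]]:
    """Grow the sample until the score gain drops below tol (convergence)."""
    size = min(start, n_total)
    best_score = score_fn(size)
    history = [(size, best_score)]
    while size < n_total:
        next_size = min(int(size * growth), n_total)
        score = score_fn(next_size)
        history.append((next_size, score))
        if score - best_score < tol:
            break  # converged: the larger sample is not worth its cost
        size, best_score = next_size, score
    return size, history


def toy_score(n_rows: int) -> float:
    """Hypothetical learning curve; a real score_fn would train and validate."""
    return 0.9 - 1.0 / math.sqrt(n_rows)


sample_size, history = adaptive_sample_size(100_000, toy_score)
print(sample_size, history)
```

With this toy curve the loop stops well short of the full 100,000 rows, which is the point of the stage: later pipeline steps then run on the reduced data set.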
Feature Selection
• Selects a subset of features that are most predictive of the target
• Reduces the number of features used in later pipeline stages
• Speeds up training without losing predictive performance

[Figure: Feature Selection stage, labeled "Predict Best Feature Set"]

How Is Feature Selection Done?

New Data set → Extract Data set Characteristics → Rank Features → Predict Best Feature Set → Measure Model Score → Reduced Data set

[Figure: "ML Algorithms" feeds the "Rank Features" step; the loop is labeled "Repeat for Multiple Ranking Algorithms"]
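The rank-then-measure idea in this flow can be illustrated with a single ranking algorithm. The sketch below is an assumption-laden stand-in: it ranks features by absolute Pearson correlation with the target and then grows the feature set only while a scoring function improves by more than a tolerance. The helper names, the toy data, and the toy scoring function are hypothetical; the slides do not say which ranking algorithms Oracle AutoML uses.

```python
import math
from typing import Callable, Dict, List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def rank_features(columns: Dict[str, List[float]], target: List[float]) -> List[str]:
    """Rank feature names from most to least correlated with the target."""
    return sorted(columns, key=lambda c: abs(pearson(columns[c], target)), reverse=True)


def select_features(
    ranked: List[str], score_fn: Callable[[List[str]], float], tol: float = 0.01
) -> List[str]:
    """Keep the shortest ranked prefix whose score is not beaten by more than tol."""
    best_subset, best_score = ranked[:1], score_fn(ranked[:1])
    for k in range(2, len(ranked) + 1):
        score = score_fn(ranked[:k])
        if score > best_score + tol:
            best_subset, best_score = ranked[:k], score
    return best_subset


target = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
columns = {
    "noise": [3.0, 1.0, 3.0, 1.0, 3.0, 1.0],
    "weak": [1.0, 3.0, 2.0, 5.0, 4.0, 6.0],
    "signal": [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],
}


def toy_score(subset: List[str]) -> float:
    """Hypothetical model score; a real one would train and validate a model."""
    score = 0.8 if "signal" in subset else 0.5
    score += 0.05 if "weak" in subset else 0.0
    score -= 0.02 if "noise" in subset else 0.0
    return score


ranked = rank_features(columns, target)
selected = select_features(ranked, toy_score)
print(ranked, selected)
```

Repeating the same procedure with other ranking functions and keeping the best-scoring subset would correspond to the "Repeat for Multiple Ranking Algorithms" label in the figure.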


Hyperparameter Tuning
• Filters for optimal configuration of the shortlisted algorithms
• Tunes multiple machine learning models
• Tunes each selected algorithm to find hyperparameter settings

[Figure: Hyperparameter Tuning stage, labeled "Prediction Models"]

How Is Hyperparameter Tuning Done?

New Data set → Hyperparameter Choice → Tuned Model

[Figure: the "Hyperparameter Choice" step sits in a loop with "ML Algorithms", "Prediction Models", and "Measure Model Score", labeled "Optimize until Convergence"]
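The "choose, measure, repeat until convergence" loop can be illustrated with a simple coordinate search. This is a swapped-in technique chosen for brevity, not the search strategy Oracle AutoML uses (the slides do not describe it), and the hyperparameter names and the toy scoring function are hypothetical.

```python
from typing import Callable, Dict, Tuple

Config = Dict[str, float]


def tune(
    score_fn: Callable[[Config], float],
    start: Config,
    step: Config,
    tol: float = 1e-3,
    max_iter: int = 100,
) -> Tuple[Config, float]:
    """Move one hyperparameter at a time; halve the steps when nothing improves."""
    best, best_score = dict(start), score_fn(start)
    steps = dict(step)
    for _ in range(max_iter):
        improved = False
        for name in list(steps):
            for direction in (1.0, -1.0):
                candidate = dict(best)
                candidate[name] = best[name] + direction * steps[name]
                score = score_fn(candidate)
                if score > best_score:
                    best, best_score, improved = candidate, score, True
                    break
        if not improved:
            steps = {name: size / 2 for name, size in steps.items()}
            if max(steps.values()) < tol:
                break  # converged: every step size is below the tolerance
    return best, best_score


def toy_score(config: Config) -> float:
    """Hypothetical validation score that peaks at (-2.0, 6.0)."""
    return -((config["log10_learning_rate"] + 2.0) ** 2) - (config["max_depth"] - 6.0) ** 2


best_config, best_score = tune(
    toy_score,
    start={"log10_learning_rate": -1.0, "max_depth": 3.0},
    step={"log10_learning_rate": 0.5, "max_depth": 1.0},
)
print(best_config, best_score)
```

In practice `score_fn` would train the selected algorithm with the candidate configuration and return a cross-validated score, and the loop would run once per shortlisted algorithm.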
Building with Oracle AutoML

• OracleAutoMLProvider delegates model training to the ads.automl package from the Oracle Accelerated Data Science (ADS) Python SDK.

• The OracleAutoMLProvider class supports two arguments:

  • n_jobs: Specifies the degree of parallelism for Oracle AutoML. The default is -1, which means all cores will be used.

  • loglevel: Verbosity of output for Oracle AutoML

• Results can be visualized at each stage of the AutoML pipeline.


Building with Oracle AutoML
• The Oracle AutoML process summarizes the optimization process by providing:
  • Training data information
  • Pipeline information with selected features, best choices, and respective hyperparameters
  • Best model trial information

• Adaptive sampling will not run and visualizations will not be generated if the data set has fewer than 1,000 data points.

• model_list allows you to control which algorithms AutoML will consider during the optimization process.

• score_metric allows you to provide your own scoring metric, either as a string from a list of metrics or as a user-defined function. Default metrics are:
  • Binary Classification: roc_auc
  • Multiclass Classification: recall_macro
  • Regression: neg_mean_squared_error
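The defaults listed above amount to a small lookup table with an optional override. The sketch below shows that logic in plain Python; `resolve_score_metric`, the task-type keys, and the `(y_true, y_pred)` signature of the user-defined function are assumptions made for illustration and are not part of the ADS API.

```python
from typing import Callable, Optional, Sequence, Union

Metric = Union[str, Callable[[Sequence[float], Sequence[float]], float]]

# Default metric per task type, as listed in the slides.
DEFAULT_METRICS = {
    "binary_classification": "roc_auc",
    "multiclass_classification": "recall_macro",
    "regression": "neg_mean_squared_error",
}


def resolve_score_metric(task: str, score_metric: Optional[Metric] = None) -> Metric:
    """Return the user's metric if one was given, else the default for the task."""
    if score_metric is None:
        return DEFAULT_METRICS[task]
    if isinstance(score_metric, str) or callable(score_metric):
        return score_metric
    raise TypeError("score_metric must be a metric name or a callable")


def neg_mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Example user-defined metric; negated so that higher is better."""
    return -sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


print(resolve_score_metric("regression"))
print(resolve_score_metric("regression", neg_mean_absolute_error))
```

The regression default is already a negated error, so a custom error metric is negated here as well to keep the "higher is better" convention consistent.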
Oracle AutoML: Time Budget
› The Oracle AutoML tool also supports a user-given time budget in seconds.

› AutoML tries to terminate computation as soon as the time budget is exhausted by returning the current best model.

• Scenario 1: The time budget is exhausted before preprocessing completes. A Naive Bayes model is returned for classification and a Linear Regression model for regression.

• Scenario 2: The time budget is exhausted before algorithm selection completes. Partial results for algorithm selection are used to evaluate the best candidate, which is returned.

• Scenario 3: The time budget is exhausted before hyperparameter tuning completes. The current best known hyperparameter configuration is returned.
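The common pattern behind the three scenarios is an "anytime" search: check the clock before each unit of work and hand back the best result so far, or a cheap fallback if nothing has been evaluated. The sketch below illustrates that pattern only. `search_with_budget`, the candidate names, and the `FakeClock` used to make the example deterministic are hypothetical and not part of Oracle AutoML.

```python
import time
from typing import Callable, Iterable, Optional, Tuple


def search_with_budget(
    candidates: Iterable[str],
    evaluate: Callable[[str], float],
    time_budget: float,
    fallback: str,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[str, Optional[float]]:
    """Evaluate candidates until the budget (in seconds) runs out."""
    deadline = clock() + time_budget
    best, best_score = fallback, None
    for candidate in candidates:
        if clock() >= deadline:
            break  # budget exhausted: return the current best
        score = evaluate(candidate)
        if best_score is None or score > best_score:
            best, best_score = candidate, score
    return best, best_score


class FakeClock:
    """Deterministic clock for the example: advances 1 second per call."""

    def __init__(self) -> None:
        self.now = -1.0

    def __call__(self) -> float:
        self.now += 1.0
        return self.now


scores = {"model_a": 0.6, "model_b": 0.8, "model_c": 0.9}

# A 3-second budget leaves time for two evaluations, so model_c is never tried.
partial = search_with_budget(scores, scores.__getitem__, 3.0, "naive_bayes", FakeClock())

# A zero budget evaluates nothing and returns the fallback model.
nothing = search_with_budget(scores, scores.__getitem__, 0.0, "naive_bayes", FakeClock())
print(partial, nothing)
```

The check happens between evaluations, so a single long evaluation can still overrun the budget. That is consistent with the slide's wording that AutoML "tries to" terminate as soon as the budget is exhausted.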
Oracle AutoML: Minimum Feature List

AutoML ensures through min_features that the features in the list are part of the final model that it creates, and these are not dropped during the feature selection phase.

▪ If int, 0 < min_features <= n_features

▪ If float, 0 < min_features <= 1.0

▪ If list, names of features to keep. For example, ['a', 'b'] means keep features 'a' and 'b'.
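The three accepted forms of min_features can be checked with a small validator. The sketch below is hypothetical: `resolve_min_features` is not an ADS function, and reading a float as a fraction of the available features (rounded up) is an assumption, since the slide gives only the valid range.

```python
import math
from typing import Dict, List, Sequence, Union


def resolve_min_features(
    min_features: Union[int, float, Sequence[str]], feature_names: Sequence[str]
) -> Dict[str, object]:
    """Validate min_features and normalize it to a count plus required names."""
    n_features = len(feature_names)
    if isinstance(min_features, bool):
        raise TypeError("min_features must be an int, a float, or a list of names")
    if isinstance(min_features, int):
        if not 0 < min_features <= n_features:
            raise ValueError("int min_features must satisfy 0 < value <= n_features")
        return {"min_count": min_features, "required": []}
    if isinstance(min_features, float):
        if not 0 < min_features <= 1.0:
            raise ValueError("float min_features must satisfy 0 < value <= 1.0")
        # Assumption: a float is a fraction of all features, rounded up.
        return {"min_count": math.ceil(min_features * n_features), "required": []}
    if isinstance(min_features, (list, tuple)):
        unknown = [name for name in min_features if name not in feature_names]
        if unknown:
            raise ValueError(f"unknown feature names: {unknown}")
        return {"min_count": len(min_features), "required": list(min_features)}
    raise TypeError("min_features must be an int, a float, or a list of names")


features: List[str] = ["a", "b", "c", "d"]
print(resolve_min_features(2, features))
print(resolve_min_features(0.5, features))
print(resolve_min_features(["a", "b"], features))
```

Only the list form names specific features, so it is the only form that can guarantee particular columns survive the feature selection phase.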
