DRIVERLESS ML
Automation of ML using Driverless API
Sayantan Ghosh
Kalinga Institute of Industrial Technology
What is Driverless API?
Driverless API is automation software that preprocesses and cleans large datasets in a few seconds.
It can apply machine learning algorithms (Logistic Regression, KNN, Decision Tree, Naive Bayes, SVM) and an ANN to a dataset autonomously, without human intervention, and visualize each algorithm's accuracy, the feature importances, the F1 score, and the recall score.
Key Capabilities of Driverless API
• Visualization: It can produce interactive graphical visualizations using the pygal library.
• Data Preprocessing: It can preprocess the dataset efficiently (for example, it handles categorical data as well as NaN or missing values).
• Feature Scaling: It can apply feature scaling to increase accuracy and acceptability.
• Time Efficient: The API can analyze a dataset in very little time.
• It can be used for binary as well as multiclass classification, for example churn modeling, credit card fraud detection, and marketing analysis.
• It generates dataset info and reports the features that are irrelevant for the model prediction.
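The feature scaling capability listed above can be illustrated with a short sketch. The slides do not say which scaler the API uses, so standardization with scikit-learn's `StandardScaler` is an assumption here, and the toy values are made up.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy feature matrix (made-up values): two numeric columns on different scales.
X = np.array([[22.0, 7.25],
              [38.0, 71.28],
              [26.0, 7.93],
              [35.0, 53.10]])

# Assumption: standardization, i.e. zero mean and unit variance per column.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

print(X_scaled.mean(axis=0))  # per-column means, close to 0
print(X_scaled.std(axis=0))   # per-column standard deviations, close to 1
```

Scaling mainly matters for distance-based and margin-based classifiers such as KNN and SVM, where a column with a large numeric range would otherwise dominate.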
Development Process of the Driverless API
• The Flask API consists of a 5-stage pipeline from the user input phase to the output phase.
• It uses the Flask framework to run the Python scripts.
• The API web interface is developed using HTML5 & CSS3.
• The 5 stages are:
  1. Data Collection Phase (input dataset)
  2. Feature Selection
  3. Data Preprocessing
  4. Algorithm Implementation
  5. Output Visualization Phase (output)
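As a rough illustration of the input stage, a Flask endpoint that accepts the uploaded .csv and the user parameters might look like the sketch below. The route name `/analyze` and the form field names `dataset`, `split`, and `epochs` are hypothetical; the slides do not show the actual routes or field names.

```python
import io

import pandas as pd
from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route("/analyze", methods=["POST"])  # hypothetical endpoint name
def analyze():
    # Stage 1 (data collection): read the uploaded .csv and the form parameters.
    uploaded = request.files["dataset"]              # hypothetical field name
    split = float(request.form.get("split", 0.3))    # hypothetical field name
    epochs = int(request.form.get("epochs", 100))    # hypothetical field name
    df = pd.read_csv(io.BytesIO(uploaded.read()))

    # Stages 2-5 (feature selection, preprocessing, algorithms, visualization)
    # would run here; this sketch only echoes what it received.
    return jsonify(rows=len(df), columns=list(df.columns),
                   split=split, epochs=epochs)
```

Saved as `app.py`, the sketch can be started with `flask --app app run` and exercised by posting a multipart form that contains the three fields.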
Methodology & Implementations
Workflow Diagram of the API
1. Data Collection Stage (.csv, Split Amount, Epoch): At the input phase the user provides the .csv file, the split amount of the dataset, the epoch count, and the optimizer algorithm.
2. Feature Selection & Dimensionality Reduction: Compute the features' importances and select the relevant features based on them.
3. Data Preprocessing (Categorical, Missing Value Handling): All categorical data are one-hot encoded and missing values are handled using mean values.
4. Classification Algorithms: All the ML classifiers are applied to the dataset through K-fold cross-validation and the results are stored.
5. Analysis Report & Visualization of the predicted results using pygal.
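The preprocessing stage (one-hot encoding of categorical data, mean values for missing entries) can be sketched with pandas as follows. The helper name `preprocess` and the toy data are illustrative and not taken from the project's code.

```python
import numpy as np
import pandas as pd

# Toy data (made up) with one numeric and one categorical feature column.
df = pd.DataFrame({
    "age": [22.0, np.nan, 26.0, 35.0],
    "embarked": ["S", "C", "S", "Q"],
    "survived": [0, 1, 1, 1],
})


def preprocess(frame: pd.DataFrame, target: str) -> pd.DataFrame:
    """One-hot encode categorical columns and fill numeric gaps with the mean."""
    features = frame.drop(columns=[target])
    numeric_cols = features.select_dtypes(include="number").columns
    # Missing numeric values are replaced by the mean of their column.
    features[numeric_cols] = features[numeric_cols].fillna(
        features[numeric_cols].mean()
    )
    # Each categorical column is expanded into one indicator column per category.
    features = pd.get_dummies(features)
    return features.join(frame[target])


clean = preprocess(df, target="survived")
print(clean)
```

The target column is set aside before encoding so that it is never imputed or expanded by accident.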
TECHNOLOGIES USED
1. Keras: used for implementing the Artificial Neural Network.
2. Flask: used for implementing the Web API.
3. Scikit-Learn: used for implementing the classification algorithms and for the preprocessing phase.
4. Pygal: used for implementing the visualizations as Scalable Vector Graphics (SVG).
5. Numpy: used for computing the numerical operations.
6. Pandas: used for implementing all the DataFrame processing.
Automation Of Classification Algorithms
For the automation process I have used 6 classification algorithms, and each algorithm is fed into K-fold cross-validation with 10 splits.
The distribution of accuracy, recall, and F1 score across the classification algorithms helps to choose suitable classifiers in less time.
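A minimal sketch of this automation step with scikit-learn is shown below. The synthetic dataset, the hyperparameters, and the use of `cross_validate` are assumptions for illustration; the slides only state that six classifiers are evaluated with 10-split K-fold cross-validation on accuracy, recall, and F1 score.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_validate
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary dataset standing in for an uploaded .csv (assumption).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# The six classifiers named on the results slide, with illustrative settings.
classifiers = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "SVM": SVC(),
}

metrics = ("accuracy", "recall", "f1")
cv = KFold(n_splits=10, shuffle=True, random_state=0)

results = {}
for name, model in classifiers.items():
    scores = cross_validate(model, X, y, cv=cv, scoring=list(metrics))
    # Average each metric over the 10 folds.
    results[name] = {m: scores[f"test_{m}"].mean() for m in metrics}

for name, row in results.items():
    print(name, {m: round(v, 3) for m, v in row.items()})
```

Storing the per-classifier averages in one dictionary keeps the later visualization step independent of how the scores were computed.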
Result Analysis On Various Datasets

Dataset: titanic_train.csv
Target Column: Survived
Split Amount: 0.3
Epoch Count: 100

Accuracies of classifiers on the Titanic dataset (bar chart, axis from 0 to 90):
Classifier             Accuracy
Logistic Regression    79.01
KNN                    80.36
Decision Tree          73.63
Random Forest          80.26
Naive Bayes            62.86
SVM                    82.27

Dataset: brain_tumour_classification.csv
Target Column: diagnosis
Split Amount: 0.3
Epoch Count: 100
Auto-Visualization of Feature Importance and Data Details
The API can analyze and visualize the feature importances efficiently.
This is the feature importance report of the Titanic dataset.
[Figures: Dataset Info Table; Feature Columns and Importances]
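One common way to obtain such a feature importance report is the impurity-based importance of a random forest, sketched below. The slides do not state how the API computes importances, so the estimator, the synthetic data, and the 0.10 cut-off are all assumptions.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data (assumption): 3 informative features and 3 noise features.
X, y = make_classification(n_samples=300, n_features=6, n_informative=3,
                           n_redundant=0, random_state=0)
columns = [f"feature_{i}" for i in range(X.shape[1])]

# Assumption: impurity-based importances from a random forest.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
importances = pd.Series(forest.feature_importances_, index=columns)
importances = importances.sort_values(ascending=False)

threshold = 0.10  # hypothetical cut-off separating relevant from irrelevant
relevant = importances[importances >= threshold].index.tolist()
irrelevant = importances[importances < threshold].index.tolist()

print(importances)
print("relevant:", relevant)
print("irrelevant:", irrelevant)
```

The sorted series maps directly onto a bar chart of feature columns against importances, and the two lists correspond to the features kept and dropped during feature selection.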
Future Applications of the API
• Financial Analysis and Bank Churn Model
• Business Modeling
• Health Care Applications
• Weather Prediction
Further Improvements
01 Integrating CNN and OpenCV so that it can perform image classification.
02 Image processing and disease recognition.
03 Natural language processing: text data classification, spam filter detection.
04 Audio generation & processing.
CONCLUSION
Machine learning has become one of the main engines of the current era. The production pipeline of a machine learning model passes through different phases and stages that require wide knowledge of several available tools and algorithms. However, as the scale of data produced daily is increasing continuously at an exponential rate, it has become essential to automate this process.
In this project, I have comprehensively covered the state-of-the-art research effort in the domain of Driverless ML frameworks.
Video Representation
THANK YOU!
Sayantan Ghosh
Kalinga Institute of Industrial Technology
Email: gsayantan1999@gmail.com

Driverless Machine Learning Web App
