Driverless Machine Learning Web App

DRIVERLESS ML
Automation of ML using Driverless API
Sayantan Ghosh
Kalinga Institute of Industrial Technology

What is Driverless API?
Driverless API is automation software that preprocesses and cleans
large datasets in a few seconds.
It can apply Machine Learning algorithms
(Logistic Regression, KNN, Decision Tree, Naive Bayes, SVM)
and an ANN to the dataset autonomously, without human intervention,
and visualize each algorithm's accuracy, F1 score and recall score
along with the features' importance.

Key Capabilities of Driverless API
• Visualization: It can produce interactive graphical visualizations
  using the pygal library.
• Data Preprocessing: It can preprocess the dataset efficiently
  (e.g. it handles categorical data as well as NaN or missing values).
• Feature Scaling: It can apply feature scaling to improve
  accuracy and acceptability.
• Time Efficient: The API can analyze a dataset in very little time.
• It can be used for binary as well as multiclass classification,
  churn modeling, credit card fraud detection and marketing analysis.
• It generates dataset info as well as the features that are
  irrelevant for the model prediction.
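The feature-scaling capability can be illustrated with a minimal sketch. It assumes standardization with scikit-learn's StandardScaler (the slides do not say which scaler the API uses), and the two-column array is made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical numeric features on very different scales (e.g. age, fare).
X = np.array([[22.0, 7.25], [38.0, 71.28], [26.0, 7.92], [35.0, 53.10]])

# Standardize each column to zero mean and unit variance.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
```

Scaling matters most for distance-based and margin-based classifiers such as KNN and SVM, where an unscaled large-valued column would otherwise dominate.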

Development Process of the Driverless API
• The Flask API consists of a 5-stage pipeline from user input
  to output.
• It uses the Flask framework to run the Python scripts.
• The API web interface is developed using HTML5 & CSS3.
• The 5 stages, from input dataset to output, are:
  1 Data Collection Phase
  2 Feature Selection
  3 Data Preprocessing
  4 Algorithm Implementation
  5 Output Visualization Phase
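A minimal sketch of how the input stage of such a Flask pipeline might look. The route name `/analyze` and the form field names (`dataset`, `target`, `split`, `epochs`) are hypothetical; the slides do not show the actual endpoint. Only the data collection stage is implemented, and the later stages are marked by a comment.

```python
import csv
import io

from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route("/analyze", methods=["POST"])  # hypothetical route name
def analyze():
    # Stage 1 (data collection): read the uploaded .csv and the run settings.
    upload = request.files.get("dataset")
    if upload is None:
        return jsonify(error="no dataset uploaded"), 400
    rows = list(csv.reader(io.StringIO(upload.read().decode("utf-8"))))
    if not rows:
        return jsonify(error="empty dataset"), 400
    header, records = rows[0], rows[1:]
    settings = {
        "target": request.form.get("target", header[-1]),
        "split": float(request.form.get("split", 0.3)),
        "epochs": int(request.form.get("epochs", 100)),
    }
    # Stages 2-5 (feature selection, preprocessing, classifiers,
    # visualization) would run here; this sketch only echoes the input.
    return jsonify(columns=header, n_rows=len(records), settings=settings)
```

The endpoint can be exercised without a browser through `app.test_client()`, posting the file as multipart form data.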

Methodology & Implementations
Workflow Diagram of the API
1 Data Collection Stage (.csv, split amount, epoch)
  At the input phase the user provides the .csv file, the split
  amount of the dataset, the epoch count and the optimizer algorithm.
2 Feature Selection & Dimensionality Reduction
  Compute the features' importance and select the relevant features
  based on that importance.
3 Data Preprocessing (Categorical, Missing Value Handling)
  All categorical data are one-hot encoded and missing values are
  handled using mean values.
4 Classification Algorithms
  All the ML classifiers are applied to the dataset through
  K-Fold Cross Validation and the results are stored.
5 Analysis Report & Visualization of predicted results using pygal
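The preprocessing and split steps of the workflow can be sketched as follows. This is an illustration with pandas and scikit-learn on a made-up toy frame, not the API's actual code; the column names are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data with one categorical column and missing values.
df = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, 35.0, 54.0, np.nan],
    "Sex": ["male", "female", "female", "male", "male", "female"],
    "Survived": [0, 1, 1, 0, 0, 1],
})

X = df.drop(columns="Survived")
y = df["Survived"]

# Fill missing numeric values with the column mean.
numeric_cols = X.select_dtypes(include="number").columns
X[numeric_cols] = X[numeric_cols].fillna(X[numeric_cols].mean())

# One-hot encode the categorical columns.
X = pd.get_dummies(X)

# Hold out 30% of the rows, matching a split amount of 0.3.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)
```

For simplicity the mean is computed on the full dataset here; computing it on the training split only would avoid leaking information from the held-out rows.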

TECHNOLOGIES USED
1 Keras: used for implementing the Artificial Neural Network.
2 Flask: used for implementing the Web API.
3 Scikit-Learn: used for implementing the classification algorithms
  and the preprocessing phase.
4 Pygal: used for implementing the visualizations as Scalable Vector
  Graphics (SVG).
5 Numpy: used for the numerical operations.
6 Pandas: used for all the DataFrame processing.

Automation Of Classification Algorithms
For the automation process I have used 6 classification algorithms, and each algorithm
is fed into K-Fold Cross Validation with 10 splits.
The resulting distribution of accuracy, recall and F1 score across the
classification algorithms helps to choose a suitable classifier in
less time.
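A minimal sketch of this automation step with scikit-learn. It assumes default hyperparameters and a synthetic dataset in place of an uploaded .csv; the slides do not give the settings the API actually uses.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_validate
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary dataset standing in for an uploaded .csv.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

classifiers = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "SVM": SVC(),
}

metrics = ("accuracy", "recall", "f1")
cv = KFold(n_splits=10, shuffle=True, random_state=0)

# Run every classifier through the same 10 folds and store the mean scores.
results = {}
for name, clf in classifiers.items():
    scores = cross_validate(clf, X, y, cv=cv, scoring=metrics)
    results[name] = {m: scores[f"test_{m}"].mean() for m in metrics}
```

Reusing one `KFold` object for all classifiers keeps the comparison fair, since each model sees the same train and validation folds.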

Result Analysis On Various Datasets
Dataset : titanic_train.csv
Target Column : Survived
Split Amount: 0.3
Epoch Count: 100
Accuracies of classifiers on the titanic dataset (%):
Logistic Regression   79.01
KNN                   80.36
Decision Tree         73.63
Random Forest         80.26
Naive Bayes           62.86
SVM                   82.27
Dataset : brain_tumour_classification.csv
Target Column : diagnosis
Split Amount: 0.3
Epoch Count: 100

Auto-Visualization of
Feature Importance and Data details
The API analyzes and visualizes the feature importances
efficiently.
Shown here is the feature importance report of the titanic
dataset.
[Figures: Dataset Info Table; feature importance table with columns "Feature Columns" and "Importances"]
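One common way to obtain such a report is from a tree ensemble's impurity-based importances. The sketch below assumes a random forest, synthetic data, and a hypothetical 0.05 cut-off for flagging irrelevant features; the slides do not state which method or threshold the API uses.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data: 6 features, of which only 3 are informative.
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=3, random_state=0
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Importance per feature column, highest first.
importances = pd.Series(
    forest.feature_importances_, index=feature_names
).sort_values(ascending=False)

# Hypothetical cut-off: features below it are reported as irrelevant.
threshold = 0.05
irrelevant = importances[importances < threshold].index.tolist()
```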

Future Applications of the API
• Financial analysis and bank churn model
• Business modeling
• Health care applications
• Weather prediction

Further Improvements
01 Integrating CNN and OpenCV so that it can perform image
   classification.
02 Image processing and disease recognition.
03 Natural language processing: text data classification and spam
   filter detection.
04 Audio generation & processing.

CONCLUSION
Machine learning has become one of the main engines of the current era. The
production pipeline of a machine learning model passes through different phases
and stages that require wide knowledge of several available tools and algorithms.
However, as the volume of data produced daily keeps growing exponentially,
it has become essential to automate this process.
In this project, I have covered comprehensively the state-of-the-art research
effort in the domain of Driverless ML frameworks.

THANK YOU!
Sayantan Ghosh
Kalinga Institute of Industrial Technology
Email: gsayantan1999@gmail.com
