
UNIT-4

ENSEMBLE TECHNIQUE & UNSUPERVISED LEARNING
Ensemble Learning

• Ensemble learning combines the predictions of multiple models (called "weak learners" or "base models") to make a stronger, more reliable prediction. The goal is to reduce errors and improve performance.

Types of Ensemble Learning in Machine Learning:

There are two main types of ensemble methods:
• Bagging (Bootstrap Aggregating): Models are trained independently on different subsets of the data, and their results are averaged or voted on.
• Boosting: Models are trained sequentially, with each one learning from the mistakes of the previous model.
Example: Think of it like asking multiple doctors for independent diagnoses and taking the majority opinion (bagging), or consulting a series of doctors, each one focusing on the cases the previous one got wrong (boosting). A small voting sketch follows below.
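
To make the voting idea concrete, here is a minimal sketch of majority voting over hypothetical predictions (the arrays are made-up for illustration, not from this document):

import numpy as np

# Hypothetical class predictions from three independently trained models on five samples.
preds = np.array([[0, 1, 2, 1, 0],
                  [0, 1, 1, 1, 0],
                  [0, 2, 2, 1, 1]])

# Bagging-style majority vote: the most common label in each column wins.
final = np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, preds)
print(final)  # [0 1 2 1 0]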
1. Bagging Algorithm

• A Bagging classifier can be used for both regression and classification tasks. Here is an overview of the Bagging algorithm:
• Bootstrap Sampling: Builds ‘N’ subsets of the original training data by randomly sampling rows with replacement, so a subset may repeat some rows and leave others out. This step ensures that the base models are trained on diverse subsets of the data.
• Base Model Training: For each bootstrapped sample we train a base model independently on that subset of data. These weak models are trained in parallel to increase computational efficiency and reduce training time. We can use different base learners, i.e. different ML models, to bring variety and robustness.
• Prediction Aggregation: To make a prediction on testing data, combine the predictions of all base models. For classification tasks this can be majority voting or weighted voting, while for regression it involves averaging the predictions.
• Out-of-Bag (OOB) Evaluation: Some samples are excluded from the training subset of particular base models during bootstrapping. These “out-of-bag” samples can be used to estimate the model’s performance without the need for cross-validation (see the short sketch after this list).
• Final Prediction: After aggregating the predictions from all the base models, Bagging produces a final prediction for each instance.
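
As referenced above, a minimal sketch of OOB evaluation, assuming scikit-learn's BaggingClassifier with its oob_score option (the parameter values are illustrative):

from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

# oob_score=True scores the ensemble on the rows each tree never saw
# during bootstrapping, giving a built-in validation estimate.
bagging = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50,
                            oob_score=True, random_state=42)
bagging.fit(X, y)
print("OOB accuracy estimate:", bagging.oob_score_)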
Python code for a Bagging estimator using scikit-learn:

1. Importing Libraries and Modules
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
2. Loading and Splitting the Iris Dataset
data = load_iris()
X = data.data
y = data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
3. Creating a Base Classifier
base_classifier = DecisionTreeClassifier()
4. Creating and Training the Bagging Classifier
bagging_classifier = BaggingClassifier(base_classifier, n_estimators=10, random_state=42)
bagging_classifier.fit(X_train, y_train)
5. Making Predictions and Evaluating Accuracy
y_pred = bagging_classifier.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
Output:
Accuracy: 1.0
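For comparison (not part of the original example), a single decision tree can be trained on the same split to see how the bagged ensemble stacks up against one model; this snippet reuses the variables defined above:

single_tree = DecisionTreeClassifier(random_state=42)
single_tree.fit(X_train, y_train)
print("Single tree accuracy:", accuracy_score(y_test, single_tree.predict(X_test)))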
Benefits of Ensemble Learning in Machine Learning
• Reduction in Overfitting
• Improved Generalization
• Increased Accuracy
• Robustness to Noise
• Flexibility
2. Boosting Algorithm

Boosting is an ensemble technique that combines multiple weak learners to create a strong learner. Weak models are trained in series, with each new model trying to correct the errors of the previous ones, continuing for a fixed number of rounds or until the training error is sufficiently small. One of the most well-known boosting algorithms is AdaBoost (Adaptive Boosting). Here is an overview of the Boosting algorithm:

• Initialize Model Weights: Begin with a single weak learner and assign equal weights to all training examples.
• Train Weak Learner: Train a weak learner on the weighted dataset.
• Sequential Learning: Boosting works by training models sequentially, where each model focuses on correcting the errors of its predecessor. Boosting typically uses a single type of weak learner, such as shallow decision trees.
• Weight Adjustment: Boosting assigns weights to training data points. Misclassified examples receive higher weights in the next iteration so that later models pay more attention to them (a rough numeric sketch follows this list).
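
As mentioned above, a rough numeric sketch of the AdaBoost-style weight update (the sample count, error pattern, and variable names are illustrative assumptions):

import numpy as np

# Hypothetical state after one round: five training samples with equal initial weights.
weights = np.full(5, 1 / 5)
misclassified = np.array([False, True, False, False, True])  # assumed mistakes

# Weighted error of the current weak learner and its "say" in the final vote.
error = weights[misclassified].sum()
alpha = 0.5 * np.log((1 - error) / error)

# Misclassified samples are up-weighted, correct ones down-weighted, then renormalized.
weights *= np.exp(np.where(misclassified, alpha, -alpha))
weights /= weights.sum()
print(weights)  # the two misclassified samples now carry more weight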
Python code for a Boosting estimator (AdaBoost) using scikit-learn:

1. Importing Libraries and Modules
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
2. Loading and Splitting the Dataset
data = load_iris()
X = data.data
y = data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
3. Defining the Weak Learner
base_classifier = DecisionTreeClassifier(max_depth=1)
4. Creating and Training the AdaBoost Classifier
adaboost_classifier = AdaBoostClassifier(base_classifier, n_estimators=50, learning_rate=1.0, random_state=42)
adaboost_classifier.fit(X_train, y_train)
5. Making Predictions and Calculating Accuracy
y_pred = adaboost_classifier.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

Output:
Accuracy: 1.0
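
As a follow-up (not in the original slides), AdaBoostClassifier's staged_predict can show how test accuracy evolves as weak learners are added; this reuses the variables defined above:

# Accuracy of the ensemble after each boosting round.
for i, y_stage in enumerate(adaboost_classifier.staged_predict(X_test), start=1):
    print(f"After {i} learners: {accuracy_score(y_test, y_stage):.3f}")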
