Exp 3 121a1047 Lavanya Kurup ML

Uploaded by

Punya Nair
Name: LAVANYA KURUP PRN: 121A1047 Div: C Batch: C3

MACHINE LEARNING
EXP 3: Implement Decision Tree Classifier in Python
Aim: To implement a decision tree classifier in Python.

Theory:
Decision trees are a type of machine learning model used for both classification and regression
tasks. They work by splitting data into subsets based on the value of input features, forming a
tree-like structure.

Here’s a breakdown of how they work:

1. Structure:
• Nodes: Each internal node represents a test on an attribute (feature), typically whether the
feature’s value is at or below a threshold. Classification and regression trees split the data
in the same way; they differ in the splitting criterion and in what the leaves predict.
• Branches: The branches represent the outcomes of the test, leading to further nodes
or leaves.
• Leaves: The terminal nodes (leaves) provide the output or prediction. In classification,
they give the class label; in regression, they provide a numerical value.
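The node, branch, and leaf structure described above can be sketched in a few lines of plain Python. This is only an illustration: the `Node` and `Leaf` classes, the `predict` helper, and the feature names and thresholds in `toy_tree` are hypothetical and are not taken from the fitted model in this experiment.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    """Terminal node: stores the prediction (a class label here)."""
    prediction: str


@dataclass
class Node:
    """Internal node: tests whether one feature is at or below a threshold."""
    feature: str
    threshold: float
    left: Union["Node", Leaf]    # branch taken when the test is true
    right: Union["Node", Leaf]   # branch taken when the test is false


def predict(tree, sample):
    """Follow branches from the root until a leaf is reached."""
    node = tree
    while isinstance(node, Node):
        node = node.left if sample[node.feature] <= node.threshold else node.right
    return node.prediction


# Hypothetical hand-built tree; the feature names and thresholds are made up.
toy_tree = Node(
    feature="Glucose", threshold=127.5,
    left=Leaf("No Diabetes"),
    right=Node(feature="BMI", threshold=29.9,
               left=Leaf("No Diabetes"), right=Leaf("Diabetes")),
)

print(predict(toy_tree, {"Glucose": 110, "BMI": 35.0}))  # No Diabetes
print(predict(toy_tree, {"Glucose": 150, "BMI": 33.0}))  # Diabetes
```

Each prediction is a single walk from the root to one leaf, so its cost grows with the depth of the tree and not with the number of training samples.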

2. Construction:
• Splitting: The tree is constructed by recursively splitting the dataset on the feature and
threshold that give the best separation of the data. Common splitting criteria are Gini
impurity or entropy (for classification) and mean squared error (for regression).
• Pruning: To avoid overfitting, decision trees are often pruned. This involves removing
branches that have little importance or do not contribute significantly to the model’s
predictive power.
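The splitting criteria named above can be worked through by hand on a tiny example. The sketch below computes Gini impurity and entropy for a list of labels and searches one feature for the threshold with the lowest weighted impurity. The helper names (`gini`, `entropy`, `best_split`) and the data are made up for illustration; scikit-learn's own implementation is more elaborate.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: -sum(p * log2(p)) over the classes."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_split(values, labels, impurity=gini):
    """Try a threshold between each pair of adjacent distinct values and
    return (threshold, weighted_impurity) for the purest split."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, impurity(labels))  # no split: impurity of the parent node
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical values cannot be separated
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        weighted = (len(left) * impurity(left) + len(right) * impurity(right)) / n
        if weighted < best[1]:
            best = (threshold, weighted)
    return best


# Made-up one-feature data: low values are class 0, high values are class 1.
values = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
labels = [0, 0, 0, 1, 1, 1]

print(gini(labels))                # 0.5 for a balanced two-class node
print(entropy(labels))             # 1.0 bit for a balanced two-class node
print(best_split(values, labels))  # (5.0, 0.0): both children are pure
```

A full tree builder would repeat this search over every feature, split on the best one, and recurse into the two resulting subsets until a stopping rule is met.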

3. Advantages:
• Interpretable: Decision trees are easy to understand and visualize, as they mimic
human decision-making.
• No Feature Scaling Required: They do not require normalization or scaling of
features.
• Versatile: Can be used for both classification and regression tasks.
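The point about feature scaling can be checked empirically. The sketch below trains one tree on raw features and one on standardized features and measures how often their test predictions agree. It uses synthetic data from `make_classification` as a stand-in (an assumption, since the diabetes file is not bundled here). Standardization preserves the ordering of each feature's values, so the two trees are expected to agree on essentially all samples.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption; not the diabetes dataset)
X, y = make_classification(n_samples=300, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Tree trained on the raw features
raw_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
raw_pred = raw_tree.predict(X_test)

# Tree trained on standardized features (scaler fitted on the training split only)
scaler = StandardScaler().fit(X_train)
scaled_tree = DecisionTreeClassifier(random_state=0).fit(scaler.transform(X_train), y_train)
scaled_pred = scaled_tree.predict(scaler.transform(X_test))

# Fraction of test samples on which the two trees agree
agreement = (raw_pred == scaled_pred).mean()
print(f'Agreement between raw and scaled trees: {agreement:.2f}')
```

Scaling is therefore harmless but optional for a decision tree. It mainly changes the units in which the split thresholds are reported.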

4. Disadvantages:
• Overfitting: Decision trees can easily overfit the training data, especially if they are
too deep.
• Instability: Small changes in the data can result in a completely different tree
structure.
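The overfitting risk, and the effect of limiting or pruning the tree, can be seen by comparing training and test accuracy. The sketch below is a minimal illustration on synthetic noisy data (an assumption); the values `max_depth=3` and `ccp_alpha=0.01` are arbitrary choices for demonstration, not tuned settings.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic noisy data (an assumption); flip_y assigns random labels to about 10% of samples
X, y = make_classification(n_samples=500, n_features=10, n_informative=4,
                           flip_y=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

candidates = {
    'unrestricted': DecisionTreeClassifier(random_state=0),
    'max_depth=3': DecisionTreeClassifier(max_depth=3, random_state=0),       # depth limit
    'ccp_alpha=0.01': DecisionTreeClassifier(ccp_alpha=0.01, random_state=0),  # cost-complexity pruning
}

for name, tree in candidates.items():
    tree.fit(X_train, y_train)
    print(f'{name:15s} depth={tree.get_depth():2d} leaves={tree.get_n_leaves():3d} '
          f'train={tree.score(X_train, y_train):.3f} test={tree.score(X_test, y_test):.3f}')
```

The unrestricted tree fits the training split perfectly, noise included. A large gap between its training and test accuracy is the usual sign of overfitting, and the smaller trees typically narrow that gap.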

Decision trees can be combined into ensemble methods like Random Forests or Gradient Boosting
Machines to improve performance and robustness.
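As a rough illustration of the ensemble idea, the sketch below compares a single tree with a Random Forest using 5-fold cross-validation. The data is synthetic (an assumption) and the forest uses 100 trees, so the printed scores say nothing about the diabetes dataset used in this experiment.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic noisy data (an assumption; not the diabetes dataset)
X, y = make_classification(n_samples=500, n_features=10, n_informative=4,
                           flip_y=0.1, random_state=0)

models = {
    'single tree': DecisionTreeClassifier(random_state=0),
    'random forest': RandomForestClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, model in models.items():
    # Mean accuracy over 5 cross-validation folds
    scores[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f'{name:13s} mean CV accuracy: {scores[name]:.3f}')
```

Averaging many trees trained on resampled data reduces the instability of any single tree, which is why the forest often, though not always, scores higher.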

Program Code:
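import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load the dataset and show the first five rows
df = pd.read_csv('diabetes.csv')
print(df.head())

# Separate the features from the target column
X = df.drop('Outcome', axis=1)
y = df['Outcome']

# Hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize the features (optional for decision trees, which do not need scaling).
# The scaler is fitted on the training split only, then applied to both splits.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)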

from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report, accuracy_score

# Create the classifier; random_state makes the result reproducible
model = DecisionTreeClassifier(random_state=42)

# Fit the model to the training data
model.fit(X_train, y_train)

# Predict the labels of the held-out test data
y_pred = model.predict(X_test)

# Evaluate the predictions
accuracy = accuracy_score(y_test, y_pred)
report = classification_report(y_test, y_pred)

print(f'Accuracy: {accuracy}')
print(f'Classification Report:\n{report}')

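from sklearn.tree import plot_tree
import matplotlib.pyplot as plt

# Plot the fitted tree. feature_names is passed as a plain list because some
# scikit-learn versions reject a pandas Index. Split thresholds appear in
# standardized units because the features were scaled before training.
plt.figure(figsize=(20,10))
plot_tree(model, feature_names=list(X.columns), class_names=['No Diabetes', 'Diabetes'], filled=True)
plt.show()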
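When a figure is impractical, for example in a terminal session, the fitted rules can also be printed as text with scikit-learn's `export_text`. The sketch below uses the bundled Iris data and a depth limit of 2 purely as a stand-in (an assumption), so the rules it prints are not those of the diabetes model above.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Iris is used only as a readily available stand-in dataset (an assumption)
iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=42).fit(iris.data, iris.target)

# export_text returns the fitted rules as an indented plain-text string
rules = export_text(tree, feature_names=list(iris.feature_names))
print(rules)
```

The same call should work on the diabetes model by passing `model` and `list(X.columns)`.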

OUTPUT:
1] Upload the CSV file and display it.

2] Display the performance metrics of the algorithm.

3] Plot the decision tree as the final output.

Conclusion
Thus, in this experiment, I implemented a decision tree classifier for the diabetes dataset in Python.
