Modelling and Simulation Assignment - Ipynb - Colab

Student Dropout Prediction Using Decision Tree Classifier

Uploaded by

Muhammad Ali


Step 1: Exploratory Data Analysis (EDA)

Let's begin by examining the dataset to understand its structure and the relationships between the features and the target variable
(Dropout/Enrolled/Graduate).

import pandas as pd

# Load the dataset from the Colab Drive path
path = '/content/drive/MyDrive/dataset.csv'

data = pd.read_csv(path)

data

      Marital status  Application mode  Application order  Course  Daytime/evening attendance  Previous qualification  Nacion
0                  1                 8                  5       2                           1                       1
1                  1                 6                  1      11                           1                       1
2                  1                 1                  5       5                           1                       1
3                  1                 8                  2      15                           1                       1
4                  2                12                  1       3                           0                       1
...              ...               ...                ...     ...                         ...                     ...
4419               1                 1                  6      15                           1                       1
4420               1                 1                  2      15                           1                       1
4421               1                 1                  1      12                           1                       1
4422               1                 1                  1       9                           1                       1
4423               1                 5                  1      15                           1                       1

4424 rows × 35 columns

# Display summary statistics

data.describe()

       Marital status  Application mode  Application order       Course  Daytime/evening attendance  Previous qualification
count     4424.000000       4424.000000        4424.000000  4424.000000                 4424.000000              4424.00000
mean         1.178571          6.886980           1.727848     9.899186                    0.890823                 2.53142
std          0.605747          5.298964           1.313793     4.331792                    0.311897                 3.96370
min          1.000000          1.000000           0.000000     1.000000                    0.000000                 1.00000
25%          1.000000          1.000000           1.000000     6.000000                    1.000000                 1.00000
50%          1.000000          8.000000           1.000000    10.000000                    1.000000                 1.00000
75%          1.000000         12.000000           2.000000    13.000000                    1.000000                 1.00000
max          6.000000         18.000000           9.000000    17.000000                    1.000000                17.00000

8 rows × 34 columns

# Display data types of each column

data.dtypes

Marital status int64

Application mode int64

Application order int64

Course int64

Daytime/evening attendance int64

Previous qualification int64

Nacionality int64

Mother's qualification int64

Father's qualification int64

Mother's occupation int64

Father's occupation int64

Displaced int64

Educational special needs int64

Debtor int64

Tuition fees up to date int64

Gender int64

Scholarship holder int64

Age at enrollment int64

International int64

Curricular units 1st sem (credited) int64

Curricular units 1st sem (enrolled) int64

Curricular units 1st sem (evaluations) int64

Curricular units 1st sem (approved) int64

Curricular units 1st sem (grade) float64

Curricular units 1st sem (without evaluations) int64

Curricular units 2nd sem (credited) int64

Curricular units 2nd sem (enrolled) int64

Curricular units 2nd sem (evaluations) int64

Curricular units 2nd sem (approved) int64

Curricular units 2nd sem (grade) float64

Curricular units 2nd sem (without evaluations) int64

# Check for missing values

data.isnull().sum()

Application mode 0

Application order 0

Course 0

Daytime/evening attendance 0

Previous qualification 0

Nacionality 0

Mother's qualification 0

Father's qualification 0

Mother's occupation 0

Father's occupation 0

Displaced 0

Educational special needs 0

Debtor 0

Tuition fees up to date 0

Gender 0

Scholarship holder 0

Age at enrollment 0

International 0

Curricular units 1st sem (credited) 0

Curricular units 1st sem (enrolled) 0

Curricular units 1st sem (evaluations) 0

Curricular units 1st sem (approved) 0

Curricular units 1st sem (grade) 0

Curricular units 1st sem (without evaluations) 0

Curricular units 2nd sem (credited) 0

Curricular units 2nd sem (enrolled) 0

Curricular units 2nd sem (evaluations) 0

Curricular units 2nd sem (approved) 0

Curricular units 2nd sem (grade) 0

Curricular units 2nd sem (without evaluations) 0

Unemployment rate 0

Inflation rate 0
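
Since every column listed above reports zero missing values, the check can be collapsed into a single number, and the class balance of the target is worth inspecting at the same stage. The following is a minimal sketch on a small made-up frame; the `toy` DataFrame and its values are assumptions for illustration, not rows from the dataset.

```python
import pandas as pd

# Hypothetical stand-in for `data`; values are invented for illustration only.
toy = pd.DataFrame({
    "Curricular units 2nd sem (grade)": [0.0, 13.5, 12.0, 11.2, 0.0, 14.1],
    "Target": ["Dropout", "Graduate", "Graduate", "Enrolled", "Dropout", "Graduate"],
})

# Total number of missing cells across the whole frame.
total_missing = int(toy.isnull().sum().sum())

# Share of each class in the target column.
class_share = toy["Target"].value_counts(normalize=True)

print(f"Missing cells: {total_missing}")
print(class_share)
```

On the notebook's own frame the same two expressions would be applied to `data` instead of `toy`.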

Step 2: Data Visualization

We will create various charts to visualize the data.

Scatter Plot

Let's create a scatter plot to see the relationship between "Curricular units 2nd sem (grade)" and "Target".

import matplotlib.pyplot as plt

plt.scatter(data['Curricular units 2nd sem (grade)'], data['Target'])
plt.xlabel('Curricular units 2nd sem (grade)')
plt.ylabel('Target')
plt.title('Scatter Plot of Curricular units 2nd sem (grade) vs. Target')
plt.show()
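
Because "Target" is categorical, the scatter plot places every point on one of three horizontal lines, which makes the grade distributions hard to compare. A per-class summary is one alternative view. The sketch below uses a small invented frame (`toy` is an assumption, not the real data).

```python
import pandas as pd

grade_col = "Curricular units 2nd sem (grade)"

# Hypothetical stand-in for `data`; values are invented for illustration only.
toy = pd.DataFrame({
    grade_col: [0.0, 10.0, 12.0, 14.0, 11.0, 13.0],
    "Target": ["Dropout", "Dropout", "Enrolled", "Graduate", "Enrolled", "Graduate"],
})

# Mean, median and count of the grade within each target class.
summary = toy.groupby("Target")[grade_col].agg(["mean", "median", "count"])
print(summary)
```

On the real data the same call would use `data` in place of `toy`, and `data.boxplot(column=grade_col, by='Target')` would give a graphical equivalent.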

Bar Chart

Let's create a bar chart for the "Marital status" feature.

data['Marital status'].value_counts().plot(kind='bar')
plt.xlabel('Marital Status')
plt.ylabel('Count')
plt.title('Bar Chart of Marital Status')
plt.show()

Box Plot

Let's create a box plot for the "Curricular units 2nd sem (grade)" feature.

data.boxplot(column='Curricular units 2nd sem (grade)')
plt.title('Box Plot of Curricular units 2nd sem (grade)')
plt.show()

Histogram

Let's create a histogram for the "Curricular units 2nd sem (grade)" feature.

data['Curricular units 2nd sem (grade)'].hist()

plt.xlabel('Curricular units 2nd sem (grade)')
plt.ylabel('Frequency')
plt.title('Histogram of Curricular units 2nd sem (grade)')
plt.show()

Step 3: Data Preprocessing

We will preprocess the data by encoding the target variable and splitting the data into training and testing sets. The checks in Step 1 show numeric feature columns with no missing values, so no imputation or feature encoding is applied.

from sklearn.model_selection import train_test_split

from sklearn.preprocessing import LabelEncoder

# Encode the target variable

label_encoder = LabelEncoder()
data['Target'] = label_encoder.fit_transform(data['Target'])

# Define the features (X) and the target (y)

X = data.drop('Target', axis=1)
y = data['Target']

# Split the data into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
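
The split above is purely random, so the class proportions in the test set can drift from those of the full data, and the class supports in the report further down (316, 151, 418) are uneven. One common option, not used in the original notebook, is to pass `stratify=y`. The following is a minimal sketch on synthetic labels; the `X_demo` and `y_demo` arrays are assumptions for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced three-class labels (invented for illustration).
y_demo = np.array([0] * 30 + [1] * 20 + [2] * 50)
X_demo = np.arange(len(y_demo)).reshape(-1, 1)  # one dummy feature

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

# With stratification the 20-sample test set keeps the 30/20/50 proportions.
print(np.bincount(y_te))
```

Applied to the notebook's variables, this would amount to adding `stratify=y` to the existing `train_test_split` call.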

Step 4: Model Building

We will build and train a decision tree classifier to predict each student's outcome (Dropout, Enrolled, or Graduate).

from sklearn.tree import DecisionTreeClassifier

from sklearn.metrics import accuracy_score, classification_report

# Build and train the model

model = DecisionTreeClassifier()
model.fit(X_train, y_train)

# Make predictions
y_pred = model.predict(X_test)

# Evaluate the model

accuracy = accuracy_score(y_test, y_pred)
report = classification_report(y_test, y_pred)

print(f'Accuracy: {accuracy}')
print(f'Classification Report:\n{report}')

Accuracy: 0.6813559322033899
Classification Report:
              precision    recall  f1-score   support

           0       0.77      0.66      0.71       316
           1       0.34      0.39      0.36       151
           2       0.76      0.81      0.78       418

    accuracy                           0.68       885
   macro avg       0.62      0.62      0.62       885
weighted avg       0.69      0.68      0.68       885
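
The summary rows of the report follow directly from the per-class rows. As a worked check using only the rounded figures printed above: the macro average is the unweighted mean over the three classes, the weighted average weights each class by its support, and accuracy equals the support-weighted mean of recall. Because the inputs are already rounded to two decimals, the results can be expected to agree with the report only to about that precision.

```python
# Per-class figures copied from the report above: (precision, recall, f1, support).
rows = {
    0: (0.77, 0.66, 0.71, 316),
    1: (0.34, 0.39, 0.36, 151),
    2: (0.76, 0.81, 0.78, 418),
}

total = sum(r[3] for r in rows.values())  # number of test samples


def macro(i):
    """Unweighted mean of metric i over the classes."""
    return sum(r[i] for r in rows.values()) / len(rows)


def weighted(i):
    """Support-weighted mean of metric i over the classes."""
    return sum(r[i] * r[3] for r in rows.values()) / total


print(f"macro avg:    {macro(0):.2f} {macro(1):.2f} {macro(2):.2f}")
print(f"weighted avg: {weighted(0):.2f} {weighted(1):.2f} {weighted(2):.2f}")
print(f"accuracy from recall: {weighted(1):.2f}")
```

The low scores for class 1 pull the macro average (0.62) below the weighted average, since class 1 has the smallest support.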

def get_user_input_and_predict(model, feature_columns):
    user_input = {}
    for column in feature_columns:
        user_input[column] = [input(f"Enter value for {column}: ")]

    # Create a DataFrame for user inputs
    input_df = pd.DataFrame(user_input)

    # Handle any necessary preprocessing (e.g., converting to numeric)
    for column in feature_columns:
        if X[column].dtype in ['int64', 'float64']:
            input_df[column] = pd.to_numeric(input_df[column])

    # Predict using the trained model
    prediction = model.predict(input_df)

    # Decode the prediction
    decoded_prediction = label_encoder.inverse_transform(prediction)

    return decoded_prediction[0]
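
Typing more than thirty values at a prompt is error-prone and hard to test. A non-interactive variant that accepts a dictionary can be sketched as follows. The helper name `predict_from_record`, the two toy feature names, and the training rows are all assumptions made for illustration; only `DecisionTreeClassifier`, `LabelEncoder`, and the pandas calls are real APIs.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier


def predict_from_record(model, encoder, feature_columns, record):
    """Predict the decoded label for one record given as a dict (hypothetical helper)."""
    missing = [c for c in feature_columns if c not in record]
    if missing:
        raise ValueError(f"Missing features: {missing}")
    # Build a one-row frame with columns in the training order, cast to numeric.
    row = pd.DataFrame([record])[list(feature_columns)].apply(pd.to_numeric)
    return encoder.inverse_transform(model.predict(row))[0]


# Toy training data (invented): the label depends only on the grade.
X_toy = pd.DataFrame({"grade": [0.0, 1.0, 12.0, 14.0], "debtor": [1, 0, 0, 0]})
enc = LabelEncoder()
y_toy = enc.fit_transform(["Dropout", "Dropout", "Graduate", "Graduate"])
clf = DecisionTreeClassifier(random_state=0).fit(X_toy, y_toy)

print(predict_from_record(clf, enc, X_toy.columns, {"grade": "13.5", "debtor": "0"}))
```

Passing the encoder explicitly, rather than reading a global as the function above does, keeps the helper usable with any fitted model and encoder pair.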

pred = model.predict(X_test)

# Dictionary for mapping encoded target values to original labels

target_mapping = {0: 'Dropout', 1: 'Enrolled', 2: 'Graduate'}
output = target_mapping[pred[0]]

# True label of the first test sample (taken from y_test, not from the predictions)
original = target_mapping[y_test.iloc[0]]

Comparing Values

print(f"Original Value: '{original}' and Predicted Value: '{output}'")

Original Value: 'Dropout' and Predicted Value: 'Dropout'
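
A single matching pair says little about overall behaviour. A confusion matrix compares every true label with its prediction at once. The sketch below uses short invented label lists (assumptions for illustration) with the same 0/1/2 encoding as `target_mapping`.

```python
from sklearn.metrics import confusion_matrix

# Invented true and predicted labels using the 0/1/2 encoding of target_mapping.
y_true_demo = [0, 0, 1, 1, 2, 2, 2]
y_pred_demo = [0, 1, 1, 2, 2, 2, 0]

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true_demo, y_pred_demo, labels=[0, 1, 2])
print(cm)
```

With the notebook's variables the call would be `confusion_matrix(y_test, y_pred)`; the diagonal then counts the correctly classified test samples per class.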

feature_columns = X.columns

# Predict on user inputs

# get_user_input_and_predict already returns the label decoded by label_encoder,
# so no further lookup in target_mapping is needed
predicted_class = get_user_input_and_predict(model, feature_columns)
print(f"The predicted class is: {predicted_class}")

Enter value for Marital status: 1

Enter value for Application mode: 8
Enter value for Application order: 5
Enter value for Course: 2
Enter value for Daytime/evening attendance: 1
Enter value for Previous qualification: 1
Enter value for Nacionality: 1
Enter value for Mother's qualification: 1
Enter value for Father's qualification: 10
Enter value for Mother's occupation: 6
Enter value for Father's occupation: 10
Enter value for Displaced: 1
Enter value for Educational special needs: 0
Enter value for Debtor: 0
Enter value for Tuition fees up to date: 1
Enter value for Gender: 1
Enter value for Scholarship holder: 0
Enter value for Age at enrollment: 20
Enter value for International: 0
Enter value for Curricular units 1st sem (credited): 0
Enter value for Curricular units 1st sem (enrolled): 0
Enter value for Curricular units 1st sem (evaluations): 0
Enter value for Curricular units 1st sem (approved): 0
Enter value for Curricular units 1st sem (grade): 0
Enter value for Curricular units 1st sem (without evaluations): 0
Enter value for Curricular units 2nd sem (credited): 0
Enter value for Curricular units 2nd sem (enrolled): 0
Enter value for Curricular units 2nd sem (evaluations): 0
Enter value for Curricular units 2nd sem (approved): 0
Enter value for Curricular units 2nd sem (grade): 0
Enter value for Curricular units 2nd sem (without evaluations): 0
Enter value for Unemployment rate: 10.8
Enter value for Inflation rate: 1.4
Enter value for GDP: 1.74
The predicted class is: Dropout
