Phishing Detection Classifier Lab Guide

This document describes a lab assignment to build classifiers for detecting phishing websites. Students are asked to implement classifiers using the WEKA machine learning software, scikit-learn in Python, and a neural network using TensorFlow in Python. The dataset contains features for over 100,000 web hits along with labels indicating phishing or not. Students will preprocess the data, train classifiers, and report accuracy and other metrics for logistic regression, decision trees, Naive Bayes, random forests, and a neural network. Code snippets and results are to be submitted in a lab report.

Uploaded by

ouafa

Intelligent Systems for Cybersecurity

MACHINE LEARNING CYBERSECURITY

PHISHING DETECTION

LAB 2: WRITING A CLASSIFIER FOR PHISHING DATASET

Lab Description: In this lab you will write Python scripts and use
WEKA to implement a binary classifier that predicts whether a website is a
phishing website. The dataset contains 102816 web hits, with 30 features
recorded for each hit. A class value is also given for each
record.

Example of phishing dataset:

Features Description:

Pr. Meryeme Ayache Page | 2


You are required to implement the classifier in three ways:

• Using the machine learning software WEKA.

• Writing a Python script that uses the sklearn package.

• Writing a Python script that uses the tensorflow package and
deep learning techniques.

Lab Environment: The student needs access to a machine running either
Linux or Windows. A Python environment is required, along with
packages such as numpy, tensorflow and sklearn.

Lab Files that are Needed: For this lab you will need one file
(phishing_1.csv). The last column is the class value; the other columns are the features.

LAB EXERCISE 1
• Import the data into WEKA (Explorer). The file type (CSV) must be specified when opening the file.

• Choose a suitable classifier, such as RandomForest.


• Specify the test option and the class column.

LAB EXERCISE 2
• In this exercise, you will implement several classifiers using
sklearn.
• Import sklearn and the required libraries.

• Read the features and class values from the phishing dataset with a
suitable method.

• phishing_1.csv is the name of the file.

• delimiter specifies the character that separates the values in a row.
• usecols specifies which columns are read. For the features,
columns 1 to 30 are read. For the class values, the first column
of each row is read.
• dtype specifies the data type of the values read.


• Since the first line of the file holds the column names,
set skip_header to 1 to skip that row.
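The parameters listed above (delimiter, usecols, dtype, skip_header) match those of numpy's genfromtxt, so the read step can be sketched as follows. This is a minimal sketch, not the handout's own code: the helper name load_dataset, the n_features argument and the tiny in-memory sample are hypothetical. It assumes the layout given under Lab Files (features first, class value in the last column); if your copy of the file stores the class in the first column, swap the usecols indices.

```python
import numpy as np


def load_dataset(source, n_features=30):
    """Read features and class values from a CSV with one header row.

    `source` may be a file path such as "phishing_1.csv" or a list of lines.
    Assumed layout: n_features feature columns followed by one class column.
    """
    # Features: columns 0 .. n_features-1, read as floats.
    features = np.genfromtxt(source, delimiter=",", usecols=range(n_features),
                             dtype=np.float64, skip_header=1)
    # Class values: the single column right after the features, read as ints.
    labels = np.genfromtxt(source, delimiter=",", usecols=(n_features,),
                           dtype=np.int64, skip_header=1)
    return features, labels


# Toy stand-in for the real file: 3 features plus a class column.
demo_lines = [
    "f1,f2,f3,class",
    "1,-1,0,1",
    "-1,1,1,-1",
    "1,1,-1,1",
    "0,-1,1,-1",
]
X_demo, y_demo = load_dataset(demo_lines, n_features=3)
print(X_demo.shape, y_demo.shape)
```

For the real data the call would be load_dataset("phishing_1.csv"), keeping the default of 30 features.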
• Split the dataset. Once the preprocessing step is finished, write
a Python script that uses the sklearn package to build your
classifier.

• random_state is the seed used by the random number generator.
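A possible way to perform the split is sklearn's train_test_split, which takes the random_state seed described above. The sketch below uses made-up data, and the 25% test fraction and the seed value are assumptions; the handout does not state which values to use.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up data: 20 samples with 2 features each, alternating binary labels.
X = np.arange(40, dtype=float).reshape(20, 2)
y = np.array([0, 1] * 10)

# Hold out 25% of the rows for testing; fixing random_state makes the
# shuffle reproducible from run to run.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Repeating the call with the same seed yields the same partition.
X_train_again, _, _, _ = train_test_split(X, y, test_size=0.25, random_state=0)

print(X_train.shape, X_test.shape)
```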

• This is for the decision tree:

• Print the evaluation metrics: accuracy, recall, precision and
F1 score.

• Implement classifiers based on Logistic Regression, Decision Tree,
Naïve Bayes and Random Forest.
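The four classifiers and the requested metrics could be put together roughly as below. This is a sketch under stated assumptions rather than the expected solution: the data is a synthetic stand-in generated with make_classification (not phishing_1.csv), GaussianNB is only one possible Naïve Bayes variant, and the hyperparameters (max_iter, n_estimators, seeds, test fraction) are illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the lab data: 30 features, binary class (0/1).
X, y = make_classification(n_samples=500, n_features=30, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# One entry per classifier named in the exercise.
classifiers = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),  # assumed variant
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

results = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)          # train on the training split
    y_pred = clf.predict(X_test)       # predict the held-out split
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "f1": f1_score(y_test, y_pred, zero_division=0),
    }
    print(name, {metric: round(value, 3) for metric, value in results[name].items()})
```

Precision, recall and F1 are computed here for the class labelled 1, which is the default positive label in sklearn.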

LAB EXERCISE 3
• Use the same data as in Exercises 1 and 2.
• In this exercise, you will implement an artificial neural network classifier
based on TensorFlow.


• Import the required libraries.

• Repeat the preprocessing steps from Exercise 2: read the
data, standardize the features and encode the labels.
• Define the learning rate and the number of epochs for the artificial neural
network.

• An extra preprocessing step is to one-hot encode the
labels.
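The scaling, label encoding and one-hot encoding steps might look like the sketch below. The toy arrays and the -1/1 class values are assumptions made for illustration; StandardScaler and LabelEncoder are one common way to do this with sklearn, and the one-hot step simply indexes an identity matrix with the encoded labels.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Toy features and class values (assumed to be -1 / 1 for this sketch).
X = np.array([[1.0, -1.0], [0.0, 1.0], [-1.0, 0.0], [1.0, 1.0]])
y = np.array([-1, 1, 1, -1])

# Standard scaling: each feature column gets zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X)

# Label encoding: map the class values to 0 .. n_classes-1 (here -1 -> 0, 1 -> 1).
encoder = LabelEncoder()
y_encoded = encoder.fit_transform(y)

# One-hot encoding: row i of the identity matrix represents class i.
n_classes = len(encoder.classes_)
y_one_hot = np.eye(n_classes)[y_encoded]

print(y_encoded)
print(y_one_hot)
```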

• Split the dataset after preprocessing and define the parameters that store
the shapes of the placeholders.

• Define the function that plots the performance.


• Define your own neural network architecture.

• Print the evaluation metrics: accuracy, recall, precision and
F1 score.


• Initialize the variables and placeholders. Then perform
the training and testing on the dataset.
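As a rough illustration of the network step, the sketch below builds a small fully connected classifier. Note the substitution: the handout refers to placeholders and explicit variable initialization (graph-style TensorFlow code), whereas this sketch uses the tf.keras API, which handles both internally. The layer sizes, learning rate, number of epochs, batch size and the synthetic data are all assumptions chosen for the example.

```python
import numpy as np
import tensorflow as tf

LEARNING_RATE = 0.01  # assumed value
EPOCHS = 5            # assumed value


def build_model(n_features, n_classes=2, hidden_units=16):
    """Two hidden ReLU layers followed by a softmax output layer."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_features,)),
        tf.keras.layers.Dense(hidden_units, activation="relu"),
        tf.keras.layers.Dense(hidden_units, activation="relu"),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE),
        loss="categorical_crossentropy",  # expects one-hot encoded labels
        metrics=["accuracy"],
    )
    return model


# Synthetic stand-in for the scaled features and one-hot labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30)).astype("float32")
labels = (X[:, 0] + X[:, 1] > 0).astype(int)
y_one_hot = np.eye(2, dtype="float32")[labels]

model = build_model(n_features=30)
history = model.fit(X, y_one_hot, epochs=EPOCHS, batch_size=32, verbose=0)

# Class probabilities per sample; argmax recovers the predicted class index.
probabilities = model.predict(X, verbose=0)
predicted = probabilities.argmax(axis=1)
print(history.history["loss"])
```

The per-epoch values in history.history can be passed to a plotting function, and the predicted class indices can be fed to the same metric functions used in Exercise 2.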

WHAT TO SUBMIT

You should submit a lab report that includes the steps you used to
preprocess the data and the relevant code snippets for your classifiers and
architecture. Screenshots of both your code snippets and the results
are also required. You can name your file "Lab2_phishing_yourname.doc".
