Project Report 4th Year

ABSTRACT

This project focuses on building a neural network entirely from scratch using NumPy, with the
primary aim of gaining a deep understanding of the mathematical principles and algorithms that
power modern machine learning models—most notably, the backpropagation algorithm. Unlike
popular machine learning libraries such as TensorFlow, PyTorch, or Keras that abstract away
much of the underlying complexity, this project exposes and implements the inner workings of
neural networks manually. The network includes key components such as forward and backward
passes, gradient-based optimization, and parameter updates, all of which were constructed
without the use of high-level machine learning APIs. Through this hands-on approach, we
demystify how neural networks actually "learn" from data.

To evaluate and visualize the learning process, the implemented model was tested on both simple
(logic gate) and more complex datasets obtained from Kaggle, covering both classification and
regression tasks. A wide variety of plots were generated—including weight trajectories, gradient
updates, accuracy/R² scores, and contour plots of the loss landscape—to illustrate the behavior
and dynamics of training. Additionally, Principal Component Analysis (PCA) was implemented
from scratch to visualize changes in the model’s output across epochs. This comprehensive
exploration not only validates the effectiveness of our custom-built neural network but also
strengthens our conceptual and practical grasp of foundational machine learning techniques.

Table of Contents

1 Introduction
   1.1 Problem Statement
   1.2 Motivation
   1.3 Objectives
2 Literature Survey
   2.1 Existing Work
   2.2 Limitations of Existing Work
3 Software Requirements Specifications
   3.1 Overall Description
   3.2 Operating Environment
   3.3 Functional Requirements
   3.4 Non-Functional Requirements
4 Design
   4.1 Use Case Diagram
   4.2 Class Diagram
   4.3 Sequence Diagram
   4.4 Data Flow Diagram
   4.5 System Architecture
5 Implementation
   5.1 Sample Code
6 Testing
   6.1 Test Cases
7 Screenshots
   7.1 AND gate dataset – Classification Task
   7.2 Kaggle dataset – Classification Task
   7.3 Kaggle dataset – Regression Task
   7.4 Project Structure
8 Conclusion & Future Scope
References

1. INTRODUCTION

1.1 Problem Statement


In the domain of machine learning, neural networks are widely used for solving complex tasks
such as classification, regression, and pattern recognition. While powerful, the process by which
these networks learn from data involves intricate mathematical operations, particularly in the
context of training through optimization algorithms like backpropagation. Most widely-used
frameworks abstract these details, offering simplified interfaces that conceal the underlying
computations.

This abstraction creates a gap in understanding the fundamental mechanics of neural networks,
especially for learners and practitioners aiming to grasp how learning actually occurs at the level
of individual operations—such as computing gradients, updating weights, and propagating
errors. Without direct exposure to these internal processes, it becomes challenging to build a
strong foundational intuition about model behavior, performance, and limitations.

1.2 Motivation
The primary motivation behind this project is to develop a deeper, hands-on understanding of
how neural networks operate at a mathematical and algorithmic level—beyond the abstraction
offered by modern libraries. By implementing each component manually, especially the
backpropagation algorithm, we aimed to bridge the conceptual gap left by high-level tools. This
exercise not only reinforces our theoretical knowledge but also enhances our ability to diagnose,
interpret, and optimize models in practical machine learning tasks by understanding what
happens beneath the surface during training.

1.3 Objectives

The goal of this project is to break away from high-level abstractions and understand, at a
granular level, how neural networks learn from data. To achieve this, we defined the following
specific objectives:

●​ To implement a feedforward neural network from scratch using only NumPy, covering
core components such as forward pass, backward pass, and parameter updates.

●​ To study and manually apply the mathematical foundations behind the backpropagation
algorithm, including calculus-based gradient derivations.

●​ To visualize the internal behavior of the model during training using detailed plots of
gradients, weights, scores, and loss landscapes.

●​ To test the neural network on both simple datasets (e.g., logic gates) and more complex
datasets, across classification and regression tasks.

2. LITERATURE SURVEY

2.1 Existing Work


This section presents a review of recent research efforts that explore various applications and
advancements in neural networks. The emphasis is on how neural networks have been applied
across different problem domains and the practical implications of these approaches.

Mellah et al. [1] proposed a neural network-based estimator for brushed DC machines capable
of simultaneously predicting speed, armature temperature, and resistance using only voltage and
current measurements. To enhance learning efficiency, the authors implemented a
Cascade-Forward Neural Network (CFNN) trained using the Resilient Backpropagation (RBP)
algorithm, known for its fast convergence and robustness. Unlike traditional estimators—which
typically target a single parameter and are often prone to instability and noise sensitivity—the
proposed method eliminates the need for physical speed and thermal sensors. Comparative
results demonstrated that the neural estimator closely matches model predictions and offers
improved estimation accuracy, making it suitable for thermal monitoring and high-performance
motor drive applications.

Bülte et al. [2] introduced a graph neural network (GNN)-based framework to post-process
ensemble precipitation forecasts with a focus on extreme weather events. Unlike traditional
post-processing methods that often overlook complex spatial dependencies and tail behaviors in
precipitation data, this approach directly targets extremes in the distribution. By leveraging the
structure of GNNs, the model captures spatial correlations and enhances the accuracy of
probabilistic forecasts, particularly for extreme precipitation. Experimental comparisons showed
improved performance over standard baselines, highlighting the framework’s potential in
reducing flood risks and informing climate resilience strategies. Future directions include
integrating this approach with end-to-end forecasting systems and refining extreme-value
modeling techniques.

Karabayir et al. [3] introduced a novel optimization approach called the Evolved Gradient
Direction Optimizer (EVGO), designed to address the vanishing gradient issue commonly
encountered in training deep neural networks (DNNs). The method leverages both first-order
gradients and a specially constructed hyperplane to guide weight updates. The authors
benchmarked EVGO against several established gradient-based optimizers, including Adam,
RMSProp, and gradient descent, across datasets such as MNIST, CIFAR-10, and CIFAR-100
using well-known architectures like AlexNet and ResNet. Their experiments showed that EVGO
consistently outperformed the other optimizers in terms of accuracy and convergence, even in
deeper or narrower networks.

Yang et al. [4] proposed a hybrid training strategy called GEMONN, which integrates gradient
information into evolutionary algorithms to improve the training of deep neural networks
(DNNs). The key innovation lies in a specialized genetic operator that guides the search using
gradient directions while also optimizing for network sparsity, helping reduce complexity and
prevent overfitting. Through experiments on various architectures — including autoencoders,
LSTMs, and CNNs — GEMONN demonstrated superior performance compared to both
traditional evolutionary methods and standard gradient-based optimizers like SGD and Adam.
Although it showed slightly lower performance on CNNs, the approach still offers strong
potential, especially for resource-constrained environments where sparse networks are beneficial.

Na et al. [5] addressed the vanishing gradient problem in deep neural networks (DNNs) by
integrating batch normalization (BN) layers before each sigmoid activation layer. Their approach
was specifically applied to the modeling of microwave components, a domain known for its
complex non-linear behavior. By normalizing layer inputs with additional scaling and shifting,
the BN layers improved gradient flow and training stability. Additionally, they employed an
Automated Model Generation (AMG) algorithm to dynamically configure the network
architecture, including the number of hidden and BN layers. This combination of BN and AMG
contributed to a more robust and efficient training process for deep networks in engineering
applications.

Li et al. [6] investigated pruning techniques for Binary Neural Networks (BNNs), which are
already compact due to their binary weights and activations. Recognizing that existing pruning
methods for full-precision networks are unsuitable for BNNs, the authors introduced a novel
pruning strategy based on weight flipping frequency — a measure of how often weights change
during training. This metric serves as an indicator of a weight's sensitivity to model accuracy.
Experiments on binary versions of AlexNet and a 9-layer Network-in-Network (NIN), using the
CIFAR-10 dataset, demonstrated that their method could reduce binary operations by 20–40%
with only minimal accuracy loss. The approach also achieved significant runtime improvements,
highlighting its effectiveness for optimizing BNNs without sacrificing performance.

2.2 Limitations of Existing Work


While these studies offer valuable applications and insights, they also reveal certain limitations
and open challenges that remain unaddressed:

1.​ Lack of Foundational Understanding in Existing Implementations​


Most surveyed works rely heavily on pre-built neural network frameworks (e.g., TensorFlow,
Keras), which abstract away the underlying mathematical operations. This leads to a gap in
foundational comprehension of neural network internals, especially the backpropagation
algorithm and gradient flow mechanics.

2.​ Limited Exploration of Neural Network Construction from Scratch​
Few studies focus on implementing neural networks from the ground up using only low-level
tools like NumPy. This leaves a gap in hands-on understanding of core components such as
forward pass, backpropagation, loss functions, and optimization routines.

3.​ Overemphasis on Application, Underemphasis on Intuition​


The reviewed literature largely applies neural networks to specific tasks (e.g., forecasting,
classification, parameter tuning), prioritizing performance over interpretability. This creates a
gap in pedagogical or learning-oriented projects that emphasize why neural networks behave
the way they do.

4.​ High-Level Libraries Mask Model Behavior​


Optimization algorithms like Adam or SGD are often used out-of-the-box. The internal
workings — like how gradients update weights or how learning rates impact convergence —
are not typically analyzed or re-implemented. This obscures a full understanding of model
training dynamics.

3. SOFTWARE REQUIREMENTS SPECIFICATION

3.1 Overall Description

This project implements a neural network system from scratch using NumPy. The system is
composed of modular components, including classes for building models, layers, optimizers,
training routines, and plotting tools. It supports classification and regression tasks, allows custom
dataset integration, and includes visualization capabilities to monitor various training metrics.
The intended users of this system are students, educators, or developers who want to study or
demonstrate the internal workings of neural networks in a transparent and configurable
environment.

This specification outlines the functional and non-functional requirements of the system, its
operating conditions, and a summary of its design structure.

3.2 Operating Environment

Software Requirements

● Operating System: Windows 10 / Windows 11 / Ubuntu 20.04+ / macOS (any modern version)
● Programming Language: Python 3.8 or higher
●​ Development Environment: Jupyter Notebook, Visual Studio Code
●​ Libraries and Dependencies:
○​ NumPy (for numerical computations)
○​ Matplotlib (for plotting and visualization)
○​ Pandas (for dataset handling)
●​ Notebook Runtime: Jupyter Lab / Jupyter Notebook
●​ Dataset Sources: Local .csv files and synthetic datasets generated via scripts
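
The Python dependencies listed above can be installed in one step (a typical setup; the report does not pin exact versions, so recent releases are assumed):

pip install numpy matplotlib pandas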

Hardware Requirements

●​ Processor: Dual-core CPU (Intel i3 or equivalent, minimum)


●​ RAM: 4 GB (minimum), 8 GB or more recommended
●​ Storage: Minimum 500 MB of free disk space for code, logs, and plots
●​ Graphics: No dedicated GPU required (CPU-based training)

3.3 Functional Requirements

The system should provide the following core functionalities for users working within Python
notebooks or scripts:

1.​ Data Handling


●​ Generate simple datasets programmatically (e.g., AND gate)
●​ Load and preprocess external datasets (e.g., standardization, class balancing)
●​ Perform representative train-test splits for both classification and regression tasks

2.​ Model Construction and Configuration


●​ Define and configure custom neural network architectures by stacking layers
●​ Set activation functions (e.g., ReLU, Sigmoid) and loss functions
●​ Initialize model weights and biases

3.​ Training and Optimization


●​ Perform forward propagation to compute model predictions
●​ Compute loss and apply backward propagation to calculate gradients
●​ Update model parameters using optimization algorithms (e.g., SGD)
●​ Log training metrics (loss, accuracy, etc.) to a file for later reference

4.​ Visualization and Debugging


●​ Plot gradients, weight updates, and performance metrics across epochs
●​ Visualize output predictions and loss landscapes

5.​ Model Evaluation and Testing


●​ Assess model performance using classification accuracy or R² score
●​ Save and load trained model weights for reuse or analysis
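
For reference, the two evaluation metrics named above are standard: classification accuracy is the fraction of correctly predicted labels, and the R² score used for regression is the coefficient of determination,

$$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2},$$

where $\hat{y}_i$ are the model's predictions and $\bar{y}$ is the mean of the true targets.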

3.4 Non-Functional Requirements

1.​ Usability
The system should be intuitive and user-friendly for individuals familiar with Python
programming and Jupyter notebooks. Users should be able to interact with the system by
importing its components as Python modules and using them with minimal setup. The codebase
should follow clear naming conventions and be well-documented to support learning and
experimentation. The interface should encourage educational use and provide clarity over
automation.

2.​ Modularity and Maintainability


The system should be designed in a modular fashion, with separate files or packages responsible
for distinct functionalities such as data processing, model definition, training, and plotting. This
modular structure should make it easy to isolate, modify, or extend specific components without
affecting others. The architecture should support long-term maintainability and allow new
features or improvements to be integrated with minimal disruption.

3.​ Performance
The system should perform efficiently for small to medium-sized datasets on CPU-based
machines. It should support mini-batch training to manage computational load and allow
flexibility in tuning performance during training. While not optimized for large-scale use, the
system should maintain acceptable responsiveness and throughput during experimentation.

4.​ Scalability
Although the system is not intended for industrial-scale deployment, it should be scalable within
the context of academic or prototype-level tasks. It should allow users to build deeper models,
modify layer configurations, and process larger batches without requiring structural code
changes. The design should make it possible to scale complexity upward for controlled
experimentation.

5.​ Reproducibility and Transparency


The system should provide full visibility into the training process through logging and
visualizations. Key training metrics such as loss, gradients, accuracy scores, and model
configurations should be recorded and reproducible across runs. Users should be able to
reproduce experimental results, e.g. through consistent random seeds and saved weights. The
system should offer visual tools to trace how model parameters and outputs evolve over time.

4. DESIGN

4.1 Use Case Diagram

4.2 Class Diagram

4.3 Sequence Diagram

4.4 Data Flow Diagram

4.5 System Architecture
5. IMPLEMENTATION

5.1 Sample Code

1. dataset_utils.py

import numpy as np
from numpy.random import default_rng
import pandas as pd

def get_vector(seed = 1, upper_bound = 10, n_samples = 10):
    generator = default_rng(seed)
    vector = upper_bound * generator.random((n_samples, 1))
    return vector

def and_gate_dataset(positive_samples = 100, seed = 1000):
    generator = default_rng(seed)
    seeds = generator.choice(1000, 4, replace = False)
    # features: both inputs positive, one input zero (in either position), both inputs zero
    both_positive = np.hstack((get_vector(seed = seeds[0], n_samples = positive_samples),
                               get_vector(seed = seeds[1], n_samples = positive_samples)))
    one_zero_1 = np.hstack((np.zeros((positive_samples//2, 1)),
                            get_vector(seed = seeds[2], n_samples = positive_samples//2)))
    one_zero_2 = np.hstack((get_vector(seed = seeds[3], n_samples = positive_samples//2),
                            np.zeros((positive_samples//2, 1))))
    both_zero = np.array([0, 0])
    features = np.vstack((both_positive, one_zero_1, one_zero_2, both_zero))
    # labels: 1 for the both-positive rows, 0 for everything else
    labels_positive = np.ones((positive_samples, 1), dtype = int)
    labels_negative = np.zeros((positive_samples//2 + positive_samples//2 + 1, 1), dtype = int)
    labels = np.vstack((labels_positive, labels_negative))

    dataset = np.hstack((features, labels))
    generator.shuffle(dataset)
    return dataset[:, [0, 1]], dataset[:, [2]]

def get_minibatch(features, targets, batch_size = 1, start_at = 0):
    # deterministic slicing: the same arguments always yield the same batch
    if start_at >= features.shape[0]:
        print("invalid start_at for get_minibatch")
        return None, None
    return (features[start_at:min(features.shape[0], start_at + batch_size)],
            targets[start_at:min(targets.shape[0], start_at + batch_size)])

def standardize_data(features, include_indices = [], exclude_indices = [], from_means = None,
                     from_stds = None): # in-place operation
    means = list()
    stds = list()
    from_counter = 0
    for i in range(features.shape[1]):
        if (i in include_indices and i not in exclude_indices) or (len(include_indices) == 0 and
                                                                   len(exclude_indices) == 0):
            mean = np.mean(features[:, i]) if from_means is None else from_means[from_counter]
            std = np.std(features[:, i]) if from_stds is None else from_stds[from_counter]
            features[:, i] = (features[:, i] - mean) / std
            means.append(mean)
            stds.append(std)
            from_counter += 1
    return means, stds
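
A quick sanity check of these utilities (a sketch; it assumes the modules are arranged as the nn package shown in Section 7.4):

import numpy as np
from nn.dataset_utils import and_gate_dataset, get_minibatch, standardize_data

X, y = and_gate_dataset(positive_samples = 100, seed = 1)
print(X.shape, y.shape)              # (201, 2) and (201, 1): 100 positive rows, 101 negative

X_batch, y_batch = get_minibatch(X, y, batch_size = 16, start_at = 0)
print(X_batch.shape)                 # (16, 2)

means, stds = standardize_data(X)    # standardizes X in place, column by column
print(np.round(X.mean(axis = 0), 6), np.round(X.std(axis = 0), 6))  # ~0 and ~1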

2. functions.py

import numpy as np

@np.vectorize
def relu(x):
    return x if x > 0 else 0

def leaky_relu(leak = 0.1, alpha = 1):
    @np.vectorize
    def func(x):
        return alpha * x if x > 0 else x * leak # leak is accessed through a 'closure'
    return func

def der_leaky_relu(leak = 0.1, alpha = 1):
    @np.vectorize
    def func(x):
        return alpha if x > 0 else leak
    return func

@np.vectorize
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

@np.vectorize
def round_off(x):
    return 1 if x >= 0.5 else 0

@np.vectorize
def mirror(x):
    return x

class MSE:
    def calculate_loss(self, labels, predictions):
        labels = labels.reshape(predictions.shape)
        return (1 / labels.size) * np.sum((labels - predictions)**2)

    def der_loss(self, label, output):
        return 2 * (output - label)

    def error_output_layer(self, layer, label):
        first_term = self.der_loss(label, layer.a_[0][0])
        second_term = layer.der_activate()
        return first_term * second_term

class BinaryLoss(MSE):
    def calculate_loss(self, labels, predictions, epsilon = 1e-7):
        total = predictions.size
        labels = labels.reshape(predictions.shape)
        summation = 0

        # clip to avoid log(0): predictions of exactly 0 or 1 would make the loss infinite
        predictions = np.clip(predictions, epsilon, 1 - epsilon)
        iterator = zip(labels, predictions)
        for label, prediction in iterator:
            summation += label * np.log(prediction) + (1 - label) * np.log(1 - prediction)
        return (-1 / total) * summation

    def der_loss(self, label, output, epsilon = 1e-7): # output belongs to range [0, 1]
        output = np.clip(output, epsilon, 1 - epsilon)
        return ((1 - label)/(1 - output)) - (label / output)
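
The notebooks later in this section also import der_sigmoid and der_mirror from nn.functions; they are not part of the excerpt above, but minimal definitions consistent with its conventions would be:

@np.vectorize
def der_sigmoid(x):
    s = 1 / (1 + np.exp(-x))
    return s * (1 - s)   # sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x))

@np.vectorize
def der_mirror(x):
    return 1             # the identity activation has derivative 1 everywhere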

3. model_classes.py

import numpy as np
from numpy.random import default_rng
from math import sqrt
import os

class Layer:

    def __init__(self, n_neurons, activation = None, der_activation = None):
        self.n_neurons = n_neurons
        self.activation = activation
        self.der_activation = der_activation
        self.z_ = np.zeros((n_neurons, 1))           # pre-activation values
        self.del_ = np.zeros((n_neurons, 1))         # error term (delta) used by backprop
        self.b_gradients = np.zeros((n_neurons, 1))
        self.a_ = np.zeros((n_neurons, 1))           # activations
        self.b_ = np.zeros((n_neurons, 1))           # biases

    def init_biases(self):
        self.b_ = np.zeros((self.n_neurons, 1))

    def activate(self):
        self.a_ = self.activation(self.z_)

    def der_activate(self):
        return self.der_activation(self.z_)

class Weights:

    def __init__(self, layer_1, layer_2, seed = 1000):
        self.rows = layer_2.n_neurons
        self.cols = layer_1.n_neurons
        self.seed = seed
        self.layer_1 = layer_1
        self.layer_2 = layer_2
        self.matrix = self.init_weights(self.rows, self.cols, self.seed)
        self.gradients = np.zeros((self.rows, self.cols))

    def init_weights(self, destination_neurons, source_neurons, seed): # source_neurons = fan_in
        std = sqrt(2 / source_neurons) # standard deviation for 'He' initialization
        generator = default_rng(seed)
        weights = generator.standard_normal((destination_neurons, source_neurons)) * std
        return weights

class Model:

    def __init__(self, loss, seed = 1000):
        self.layers = list()
        self.weights = list()
        self.loss = loss # 'class' for the loss function
        self.seed = seed

    def add_layer(self, layer):
        self.layers.append(layer)

    def compile(self):
        generator = default_rng(seed = self.seed)
        seeds = generator.integers(0, len(self.layers)*100, (len(self.layers)-1,))

        self.weights = list() # allows re-compilation

        for i in range(len(self.layers)-1):
            self.weights.append(Weights(self.layers[i], self.layers[i+1], seed = seeds[i]))

        for layer in self.layers:
            layer.init_biases()
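
As a small illustration of how Layer, Weights, and Model fit together (a sketch, again assuming the nn package layout used in the notebooks): compile() creates one Weights object per adjacent pair of layers, shaped (destination neurons, source neurons).

from nn.model_classes import Model, Layer
from nn.functions import BinaryLoss, leaky_relu, der_leaky_relu, sigmoid

model = Model(BinaryLoss(), seed = 1)
model.add_layer(Layer(2))                                   # input layer
model.add_layer(Layer(2, leaky_relu(), der_leaky_relu()))   # hidden layer
model.add_layer(Layer(1, sigmoid, None))                    # output layer (derivative omitted for this check)
model.compile()

for w in model.weights:
    print(w.matrix.shape)   # (2, 2), then (1, 2)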

4. optimizers.py

import numpy as np

# the optimizer gives the trainer the gradients to apply to the weights and biases.
class SGD():
    def set_model(self, model):
        self.model = model

    def gradient_weights(self, weight_index):
        return self.model.weights[weight_index].gradients

    def gradient_biases(self, layer_index):
        return self.model.layers[layer_index].b_gradients

    def current_gradient(self, weight_index):
        # dC/dW = delta of the destination layer times activations of the source layer (transposed)
        return np.matmul(self.model.weights[weight_index].layer_2.del_,
                         np.transpose(self.model.weights[weight_index].layer_1.a_))

    def error_output_layer(self, label):
        return self.model.loss.error_output_layer(self.model.layers[-1], label)

    def error_layer(self, this_index, weight_index): # weights connecting this layer to next layer
        return np.matmul(
            np.transpose(self.model.weights[weight_index].matrix),
            self.model.layers[this_index+1].del_) * self.model.layers[this_index].der_activate()

    def on_pass(self):
        # reset gradients
        for weight in self.model.weights:
            weight.gradients = np.zeros((weight.rows, weight.cols))

        # reset errors
        for layer in self.model.layers:
            layer.del_ = np.zeros((layer.n_neurons, 1))
            layer.b_gradients = np.zeros((layer.n_neurons, 1))

    def update_biases(self, layer_index, l_rate):
        value = l_rate * self.gradient_biases(layer_index)
        self.model.layers[layer_index].b_ -= value
        return value

    def update_weights(self, weight_index, l_rate):
        value = l_rate * self.gradient_weights(weight_index)
        self.model.weights[weight_index].matrix -= value
        return value
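
For reference, these methods implement the standard backpropagation equations (see the backpropagation chapter cited in the Bibliography); each layer's del_ attribute stores the error term $\delta^l$:

$$\delta^L = \nabla_a C \odot f'(z^L), \qquad \delta^l = \left( (W^{l+1})^\top \delta^{l+1} \right) \odot f'(z^l),$$

$$\frac{\partial C}{\partial b^l} = \delta^l, \qquad \frac{\partial C}{\partial W^l} = \delta^l \, (a^{l-1})^\top.$$

error_output_layer computes the first relation (through the loss class), error_layer the second, and current_gradient the weight-gradient outer product.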

5. plotter.py

import numpy as np
from numpy.random import default_rng
import matplotlib.pyplot as plt
from nn.trainer import Logger
import os, time

class Plotter:
    def read_file(self, path):
        self.data = Logger.load_data(path)
        if self.data is None:
            print("unsuccessful.")
        else:
            print("file read.")

    def plot_gradients(self, dir, name = None, n_points = None):
        if not os.access(dir, os.F_OK):
            print(f'cannot access {dir}')
            return
        time_str = time.strftime("%I-%M-%S_%p", time.localtime(time.time()))
        name = "/gradients_" + (time_str if name is None else name) + ".png"

        weights_gradients = dict()
        bias_gradients = dict()

        # collect the logged gradients, update by update
        for i in range(self.data['n-epochs']):
            epoch = f'epoch-{i}'
            for j in range(self.data[epoch]['n-updates']):
                update = f'update-{j}'
                for w in self.data['n_weights']:
                    weight = f'weights-gradient-{w}'
                    if weight not in weights_gradients:
                        weights_gradients[weight] = list()
                    array = np.asarray(self.data[epoch][update][weight]).ravel()[:4]
                    if array.shape[0] > 1:
                        array = array.reshape((2, -1))
                    else:
                        array = array.reshape((1, 1))
                    weights_gradients[weight].append(array)
                for b in self.data['n_weights']:
                    b += 1
                    bias = f'bias-gradient-{b}'
                    if bias not in bias_gradients:
                        bias_gradients[bias] = list()
                    array = np.asarray(self.data[epoch][update][bias]).ravel()[:4]
                    array = array.reshape((-1, 1))
                    bias_gradients[bias].append(array)

        fig, axs = plt.subplots(2, len(self.data['n_weights']),
                                figsize = (7 * len(self.data['n_weights']), 8),
                                gridspec_kw = {'wspace': 0.2, 'hspace': 0.3}, squeeze = False)
        fig.suptitle('Gradients', size = 'xx-large')

        # weights
        for ax in range(len(self.data['n_weights'])):
            weight = f"weights-gradient-{self.data['n_weights'][ax]}"
            # convert the entire history of gradients into one numpy array
            weights_gradients[weight] = np.asarray(weights_gradients[weight])
            if n_points is not None:
                weights_gradients[weight] = weights_gradients[weight][:n_points]
            for r in range(weights_gradients[weight][0].shape[0]):
                for c in range(weights_gradients[weight][0].shape[1]):
                    axs[0, ax].plot(weights_gradients[weight][:, r, c],
                                    label = f'{r*2 + c}', linewidth = 0.7)
            axs[0, ax].legend(title = 'elements')
            axs[0, ax].set_title(f"weights-{self.data['n_weights'][ax]}")
            axs[0, ax].set_xlabel("Updates")

        # bias
        for ax in range(len(self.data['n_weights'])):
            bias = f"bias-gradient-{self.data['n_weights'][ax] + 1}"
            bias_gradients[bias] = np.asarray(bias_gradients[bias])
            if n_points is not None:
                bias_gradients[bias] = bias_gradients[bias][:n_points]
            for r in range(bias_gradients[bias][0].shape[0]):
                axs[1, ax].plot(bias_gradients[bias][:, r, 0], label = f'{r}', linewidth = 0.7)
            axs[1, ax].legend(title = 'elements')
            axs[1, ax].set_title(f"bias-{self.data['n_weights'][ax] + 1}")
            axs[1, ax].set_xlabel("Updates")

        fig.savefig(dir + name, bbox_inches = "tight")
        print(f'plot saved at {dir + name}')
        plt.show()

6. trainer.py

import numpy as np
from numpy.random import default_rng
from nn.dataset_utils import get_minibatch
from nn.functions import round_off
import json, os, time

class Trainer:

    def __init__(self, model, optimizer):
        self.model = model
        self.optimizer = optimizer
        self.optimizer.set_model(self.model)
        self.logger = Logger()  # Logger is defined elsewhere in trainer.py (omitted from this excerpt)

    def backward_pass(self, label): # single instance
        # delta for the output layer, then propagate the error backwards;
        # the input layer (index 0) carries no error term
        self.model.layers[-1].del_ = self.optimizer.error_output_layer(label)
        for i in range(len(self.model.layers)-2, 0, -1):
            self.model.layers[i].del_ = self.optimizer.error_layer(i, i)

        # accumulate gradients for this instance; SGD.on_pass() resets them
        # at the start of the next pass
        for i in range(len(self.model.weights)):
            self.model.weights[i].gradients += self.optimizer.current_gradient(i)
        for i in range(1, len(self.model.layers)):
            self.model.layers[i].b_gradients += self.model.layers[i].del_

    def forward_pass(self, input_data):
        # input_data shape = (1, n_cols)
        self.model.layers[0].a_ = input_data.reshape((input_data.shape[-1], 1))
        for i in range(1, len(self.model.layers)):
            self.model.layers[i].z_ = np.matmul(self.model.weights[i-1].matrix,
                                                self.model.layers[i-1].a_) + self.model.layers[i].b_
            self.model.layers[i].activate()

    def confusion_matrix(self, labels, predictions):
        predictions = predictions.reshape((predictions.size, 1))
        labels = labels.reshape((labels.size, 1))
        stack = np.hstack((labels, predictions))
        matrix = {'tp': 0, 'tn': 0, 'fp': 0, 'fn': 0}

        for i in range(stack.shape[0]):
            if stack[i][0] == 1:
                if stack[i][1] == 1:
                    matrix['tp'] += 1
                else:
                    matrix['fn'] += 1
            else:
                if stack[i][1] == 1:
                    matrix['fp'] += 1
                else:
                    matrix['tn'] += 1
        return matrix

    def update_biases(self, layer_index):
        # returns the final, applied value (includes scaling by the learning rate)
        return self.optimizer.update_biases(layer_index, self.learning_rate)

    def update_weights(self, weight_index):
        return self.optimizer.update_weights(weight_index, self.learning_rate)
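
For reference, forward_pass implements the usual layer recurrence

$$z^l = W^l a^{l-1} + b^l, \qquad a^l = f(z^l), \qquad l = 1, \dots, L,$$

with $a^0$ set to the reshaped input vector, matching the column-vector conventions used by Layer and Weights.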

7. classification.ipynb (AND gate)

from nn.model_classes import Model, Layer
from nn.functions import BinaryLoss, leaky_relu, der_leaky_relu, sigmoid, der_sigmoid
from nn.trainer import Trainer
from nn.plotter import Plotter
from nn.dataset_utils import and_gate_dataset
from nn.optimizers import SGD
import numpy as np

model = Model(BinaryLoss(), 1)
model.add_layer(Layer(2))
model.add_layer(Layer(2, leaky_relu(), der_leaky_relu()))
model.add_layer(Layer(1, sigmoid, der_sigmoid))
model.compile()

trainer = Trainer(model, SGD())

X_train, y_train = and_gate_dataset(100, 1)
dataset = np.hstack((X_train, y_train))
print(dataset[:10])

X_test, y_test = and_gate_dataset(50, 2)
test_set = np.hstack((X_test, y_test))
print(test_set[:10])

plotter = Plotter()

model.compile()

print('weights:')
model.show_weights()
print('biases:')
model.show_biases()

trainer.train(X_train, y_train, 1, 0.02, epochs = 120)

trainer.save_history('./logs', 'batch_size_1')

_ = trainer.predict(X_test, y_test, True)

model.save_weights('./models', 'batch_size_1')

plotter.read_file('./logs/batch_size_1.txt')
plotter.plot_gradients('./plots', 'gradients_batch_size_1', 700)
plotter.plot_weights('./plots', 'weights_batch_size_1', 700)
plotter.plot_score('./plots', 'accuracy_batch_size_1', 700)

8. dataset_classification.ipynb (Kaggle – classification)

import pandas as pd
import numpy as np
from nn.model_classes import Model, Layer
from nn.functions import BinaryLoss, leaky_relu, der_leaky_relu, sigmoid, der_sigmoid
from nn.trainer import Trainer
from nn.plotter import Plotter
from nn.dataset_utils import pca, standardize_data, split_classes
from nn.optimizers import SGD

dataset = pd.read_csv(r'datasets\raisin_Sruthi.csv')

dataset.info()

# one-hot encode the target column ('Kecimen' becomes the positive class)
dataset = pd.get_dummies(dataset, columns = ["Class"], prefix = "", prefix_sep = "",
                         drop_first = True, dtype = int)
dataset.head()

D_train, D_test = split_classes(dataset, 'Kecimen')

X_train = D_train.drop(columns = ['Kecimen'])
y_train = D_train['Kecimen']

X_train = X_train.to_numpy()
X_means, X_stds = standardize_data(X_train)
X_train

model = Model(BinaryLoss(), 5)
model.add_layer(Layer(7))
model.add_layer(Layer(7, leaky_relu(), der_leaky_relu()))
model.add_layer(Layer(7, leaky_relu(), der_leaky_relu()))
model.add_layer(Layer(7, leaky_relu(), der_leaky_relu()))
model.add_layer(Layer(1, sigmoid, der_sigmoid))
model.compile()

trainer = Trainer(model, SGD())

y_train = y_train.to_numpy()
y_train[:5]

trainer.train(X_train, y_train, 16, 0.02, 20)

trainer.save_history('./logs', 'dataset_classification')
model.save_weights("./models", "dataset_classification")

plotter = Plotter()
plotter.read_file(r'logs\dataset_classification.txt')

plotter.plot_gradients("./plots", "dataset_classification", 700)
plotter.plot_weights("./plots", "dataset_classification", 700)
plotter.plot_score("./plots", "dataset_classification", 700, False)

9. dataset_regression.ipynb (Kaggle – regression)

import pandas as pd
from nn.model_classes import Model, Layer
from nn.functions import MSE, leaky_relu, der_leaky_relu, sigmoid, der_sigmoid, mirror, der_mirror
from nn.trainer import RegressionTrainer
from nn.plotter import Plotter
from nn.dataset_utils import pca, standardize_data, split_data
from nn.optimizers import SGD

dataset = pd.read_csv(r'datasets\expenses_Sruthi.csv')

dataset.info()

dataset = pd.get_dummies(dataset, drop_first = True, dtype = int)
dataset.head()

D_train, D_test = split_data(dataset, 0.3, 10)

X_train = D_train.drop(columns = ['charges'])
y_train = D_train['charges']

model = Model(MSE(), 5)
model.add_layer(Layer(8))
model.add_layer(Layer(8, leaky_relu(), der_leaky_relu()))
model.add_layer(Layer(8, leaky_relu(), der_leaky_relu()))
model.add_layer(Layer(1, mirror, der_mirror)) # identity activation for the regression output
model.compile()

trainer = RegressionTrainer(model, SGD())

X_train = X_train.to_numpy()
y_train = y_train.to_numpy()
y_train = y_train.reshape((y_train.shape[0], 1))
X_means, X_stds = standardize_data(X_train, [0, 1])
y_means, y_stds = standardize_data(y_train)

trainer.train(X_train, y_train, 4, 0.02, 25)

trainer.save_history('./logs', 'dataset_regression')

plotter = Plotter()
plotter.read_file(r'logs\dataset_regression.txt')
plotter.plot_gradients("./plots", "dataset_regression", 700)
plotter.plot_weights("./plots", "dataset_regression", 700)
plotter.plot_score("./plots", "dataset_regression", 700, False)
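
Both Kaggle notebooks import a pca helper from nn.dataset_utils that is not included in the excerpts above. A minimal from-scratch version matching the abstract's description (a sketch, not necessarily the report's exact implementation) could look like:

import numpy as np

def pca(features, n_components = 2):
    # center the data, then project onto the top principal directions
    centered = features - np.mean(features, axis = 0)
    covariance = np.cov(centered, rowvar = False)
    eigvals, eigvecs = np.linalg.eigh(covariance)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]   # indices of the largest eigenvalues
    return np.matmul(centered, eigvecs[:, order])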

6. TESTING

To ensure the correctness and reliability of the neural network system, several forms of testing
were conducted. Given that this is not a traditional software application with UI elements or
discrete functional modules, the focus of testing shifted toward verifying the correctness of
mathematical computations (forward and backward passes), observing learning behavior over
time, and confirming that the system could make accurate predictions on given datasets.

The testing approach involved:

●​ Verifying outputs of individual components (such as activation and loss functions).


●​ Using simple datasets like the AND gate to confirm basic learning behavior.
●​ Applying the network to more complex datasets to validate generalization.
●​ Visualizing training metrics such as loss, gradients, and model outputs to monitor internal
behavior during training.
● Generating visual plots with matplotlib (covered in Section 7), which played a central role
in debugging and confirming system behavior.

6.1 Test Cases

Test Case-1 Forward Pass Output (AND Gate)

Description Check that a trained model can output correct values for AND gate
inputs after training.

Input Four 2D binary combinations: [0,0], [0,1], [1,0], [1,1]

Expected Output Approximately 0 for first three inputs, and near 1 for [1,1]

Observed Result Final predictions after training: [0.03, 0.06, 0.07, 0.95]

Test Case-2 Gradient Flow Consistency

Description Ensure that gradients computed during backpropagation decrease in
magnitude over time and don’t vanish or explode.

Input A small classification dataset (AND gate, batch size 1) trained over 100
epochs.

Expected Output Smooth, converging gradient trajectories; no abrupt spikes.

Observed Result Plots of gradients confirmed a steady downward trend and numerical
stability during updates.
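
One way to strengthen this test is a finite-difference gradient check: perturb a single weight, recompute the loss, and compare the numerical slope with the analytic gradient produced by backward_pass. A sketch (compute_loss is a hypothetical helper that runs a forward pass and returns the loss on a fixed batch):

import numpy as np

def numerical_gradient(compute_loss, matrix, row, col, eps = 1e-5):
    # central difference on one weight entry; the matrix is restored afterwards
    matrix[row, col] += eps
    loss_plus = compute_loss()
    matrix[row, col] -= 2 * eps
    loss_minus = compute_loss()
    matrix[row, col] += eps
    return (loss_plus - loss_minus) / (2 * eps)

# compare against model.weights[0].gradients[row, col] after one backward pass;
# agreement to small relative error indicates a correct implementation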

Test Case-3 Classification Accuracy on Kaggle Dataset

Description Evaluate classification performance on a moderately complex dataset
after full training.

Input Preprocessed dataset (features and labels), 80/20 train-test split.

Expected Output Training and test accuracy should be >85% with no overfitting or
underfitting symptoms.

Observed Result Final training accuracy: 85%, test accuracy: 81%

Test Case-4 Loss Curve Behavior

Description Check that loss value decreases consistently across epochs during
training.

Input Any dataset with a known ground truth; typically trained over 100+
epochs.

Expected Output Smooth downward loss curve; convergence after sufficient iterations.

Observed Result All training sessions produced typical exponential decay loss patterns;
no signs of divergence or instability.

7. SCREENSHOTS

This section presents a collection of visual outputs and screenshots that highlight key aspects of
the project. The majority of the images consist of graphs generated during model training, which
served as critical tools for observing and validating the behavior of the neural network.
Additionally, this section includes selected screenshots of the development environment (VS
Code and Jupyter notebooks), showcasing how different modules interact during training and
evaluation. Together, these visuals offer a comprehensive look into both the internal dynamics of
the neural network and the engineering effort behind the system.

7.1 AND gate dataset – Classification Task

[Plots: gradients, weight trajectories, and accuracy across training updates for the AND gate model.]

7.2 Kaggle dataset – Classification Task

[Plots: gradients, weight trajectories, and accuracy for the raisin classification dataset.]

7.3 Kaggle dataset – Regression Task

[Plots: gradients, weight trajectories, and R² score for the medical insurance regression dataset.]

7.4 Project Structure (VS Code)

[Screenshot: the project’s module layout in the VS Code explorer.]
8. CONCLUSION

The primary goal of this project was to gain a foundational understanding of neural networks by
implementing one from scratch without relying on high-level machine learning libraries.
Through this process, we successfully constructed a fully functional feedforward neural network
using only NumPy and Python, carefully designing every component including the forward pass,
backpropagation, loss computation, gradient updates, and training loop. The network was
evaluated on both simple and moderately complex datasets, where it demonstrated strong
learning capabilities and competitive performance. Visual tools played a central role in our
workflow, providing crucial insights into the training dynamics and helping us verify the
correctness of the implementation.

This hands-on approach allowed us to go beyond the surface-level use of popular frameworks
and engage directly with the mathematical and algorithmic core of deep learning. We not only
implemented the foundational equations but also handled data preprocessing, training utilities,
modular design, and real-time monitoring. This comprehensive experience helped solidify our
understanding of key concepts such as weight initialization, gradient descent, overfitting, and
loss landscapes. Overall, the project was successful in achieving its pedagogical objectives while
also delivering a practical and testable neural network system.

Future Scope
While the current project successfully delivered a functional neural network system built from
first principles, there remains significant room for both horizontal and vertical expansion. Future
work can further improve the system’s learning capabilities, flexibility, and applicability across
different problem domains — all while remaining grounded in the context of fundamental
feedforward neural networks.

Potential directions for future work include:

1.​ Enhancing the core system:

● Implementing advanced optimization algorithms from scratch, such as Adam, RMSProp,
or momentum-based SGD.
●​ Incorporating regularization techniques like L1/L2 weight penalties or dropout to
improve generalization.
●​ Exploring alternative, gradient-free optimization techniques such as Particle Swarm
Optimization (PSO), Genetic Algorithms (GA), or Differential Evolution (DE) for
training the network. These methods could offer insights into global search behavior,
robustness to local minima, and performance in cases where gradient descent struggles.

●​ Adding support for early stopping, dynamic learning rate schedules, and other training
control mechanisms.

2.​ Application-focused extensions:

●​ Designing an online learning module that allows the model to train incrementally on
streaming data, enabling use in dynamic environments such as anomaly detection.
●​ Developing a use-case-specific application — for example, a cyberattack detection
system — where data is generated or labeled in real time, simulating a high-stakes
decision-making scenario.
●​ Applying the network to sensor-based domains such as predictive maintenance or activity
recognition, where the temporal structure of incoming data can still be explored with
feedforward models using engineered features.
●​ Building a simple graphical interface or API endpoint so that non-technical users can
input data and view model predictions interactively.

These extensions would not only make the system more robust and versatile but also highlight
the potential of even basic neural architectures when carefully engineered and applied to
domain-specific challenges.

REFERENCES
[1]​ Mellah, Hacene, Kamel Eddine Hemsas, and Rachid Taleb. "Cascade-Forward Neural
Network Based on Resilient Backpropagation for Simultaneous Parameters and State Space
Estimations of Brushed DC Machines." arXiv preprint arXiv:2104.04348 (2021).
[2]​ Bülte, Christopher, et al. "Graph Neural Networks for Enhancing Ensemble Forecasts of
Extreme Rainfall." arXiv preprint arXiv:2504.05471 (2025).
[3]	Karabayir, Ibrahim, Oguz Akbilgic, and Nihat Tas. "A novel learning algorithm to optimize
deep neural networks: Evolved gradient direction optimizer (EVGO)." IEEE Transactions
on Neural Networks and Learning Systems 32.2 (2020): 685-694.
[4]​ Yang, Shangshang, et al. "A gradient-guided evolutionary approach to training deep neural
networks." IEEE Transactions on Neural Networks and Learning Systems 33.9 (2021):
4861-4875.
[5]​ Na, Weicong, et al. "Deep neural network with batch normalization for automated
modeling of microwave components." 2020 IEEE MTT-S International Conference on
Numerical Electromagnetic and Multiphysics Modeling and Optimization (NEMO). IEEE,
2020.
[6]	Li, Yixing, and Fengbo Ren. "BNN pruning: Pruning binary neural network guided by
weight flipping frequency." 2020 21st International Symposium on Quality Electronic
Design (ISQED). IEEE, 2020.

Bibliography
●​ Documentation: NumPy – https://numpy.org/doc/2.2/
●​ Documentation: Matplotlib – https://matplotlib.org/stable/index.html
●​ Documentation: Pandas – https://pandas.pydata.org/docs/
●​ Documentation: Python 3.9 – https://docs.python.org/3.9/
●​ Article: “Neural Networks Representation” –
https://www.jeremyjordan.me/intro-to-neural-networks/
●​ Article: “Mechanics of a Simple Neural Network” –
https://shotlefttodatascience.com/2020/08/17/the-mechanics-of-a-simple-neural-network/
●	Article: “Neural Networks Structure” –
https://python-course.eu/machine-learning/neural-networks-structure-weights-and-matrices.php
●​ Article: “Weight Initialization Techniques in Neural Networks” –
https://www.pinecone.io/learn/weight-initialization/
●​ Article: “How the Backpropagation Algorithm Works” –
http://neuralnetworksanddeeplearning.com/chap2.html
●​ Article: “Gradient Accumulation” –
https://medium.com/@harshit158/gradient-accumulation-307de7599e87

●​ Article: “Visualizing the Loss Landscape of a Neural Network” –
https://mathformachines.com/posts/visualizing-the-loss-landscape/#the-linear-case
●​ Article: “Principal Component Analysis” –
https://www.turing.com/kb/guide-to-principal-component-analysis#step-4:-feature-vector
●​ Kaggle Regression dataset: “Medical Insurance” –
https://www.kaggle.com/datasets/harshsingh2209/medical-insurance-payout
●​ Kaggle Classification dataset – “Raisin Binary Classification” –
https://www.kaggle.com/datasets/nimapourmoradi/raisin-binary-classification
