Project Report 4th Year
ABSTRACT
This project focuses on building a neural network entirely from scratch using NumPy, with the
primary aim of gaining a deep understanding of the mathematical principles and algorithms that
power modern machine learning models—most notably, the backpropagation algorithm. Unlike
popular machine learning libraries such as TensorFlow, PyTorch, or Keras that abstract away
much of the underlying complexity, this project exposes and implements the inner workings of
neural networks manually. The network includes key components such as forward and backward
passes, gradient-based optimization, and parameter updates, all of which were constructed
without the use of high-level machine learning APIs. Through this hands-on approach, we
demystify how neural networks actually "learn" from data.
To evaluate and visualize the learning process, the implemented model was tested on both simple
(logic gate) and more complex datasets obtained from Kaggle, covering both classification and
regression tasks. A wide variety of plots were generated—including weight trajectories, gradient
updates, accuracy/R² scores, and contour plots of the loss landscape—to illustrate the behavior
and dynamics of training. Additionally, Principal Component Analysis (PCA) was implemented
from scratch to visualize changes in the model’s output across epochs. This comprehensive
exploration not only validates the effectiveness of our custom-built neural network but also
strengthens our conceptual and practical grasp of foundational machine learning techniques.
Table of Contents
1 Introduction
  1.1 Problem Statement
  1.2 Motivation
  1.3 Objectives
2 Literature Survey
  2.1 Existing Work
  2.2 Limitations of Existing Work
3 Software Requirements Specifications
  3.1 Overall Description
  3.2 Operating Environment
  3.3 Functional Requirements
  3.4 Non-Functional Requirements
4 Design
  4.1 Use Case Diagram
  4.2 Class Diagram
  4.3 Sequence Diagram
  4.4 Data Flow Diagram
  4.5 System Architecture
5 Implementation
  5.1 Sample Code
6 Testing
  6.1 Test Cases
7 Screenshots
  7.1 AND gate dataset – Classification Task
  7.2 Kaggle dataset – Classification Task
  7.3 Kaggle dataset – Regression Task
  7.4 Project Structure
8 Conclusion & Future Scope
References
1. INTRODUCTION
1.1 Problem Statement
Popular machine learning libraries such as TensorFlow, PyTorch, and Keras abstract away much of
the underlying complexity of neural networks. This abstraction creates a gap in understanding the
fundamental mechanics of neural networks,
especially for learners and practitioners aiming to grasp how learning actually occurs at the level
of individual operations—such as computing gradients, updating weights, and propagating
errors. Without direct exposure to these internal processes, it becomes challenging to build a
strong foundational intuition about model behavior, performance, and limitations.
1.2 Motivation
The primary motivation behind this project is to develop a deeper, hands-on understanding of
how neural networks operate at a mathematical and algorithmic level—beyond the abstraction
offered by modern libraries. By implementing each component manually, especially the
backpropagation algorithm, we aimed to bridge the conceptual gap left by high-level tools. This
exercise not only reinforces our theoretical knowledge but also enhances our ability to diagnose,
interpret, and optimize models in practical machine learning tasks by understanding what
happens beneath the surface during training.
1.3 Objectives
The goal of this project is to break away from high-level abstractions and understand, at a
granular level, how neural networks learn from data. To achieve this, we defined the following
specific objectives:
● To implement a feedforward neural network from scratch using only NumPy, covering
core components such as forward pass, backward pass, and parameter updates.
● To study and manually apply the mathematical foundations behind the backpropagation
algorithm, including calculus-based gradient derivations (the governing equations are
summarized after this list).
● To visualize the internal behavior of the model during training using detailed plots of
gradients, weights, scores, and loss landscapes.
● To test the neural network on both simple datasets (e.g., logic gates) and more complex
datasets, across classification and regression tasks.
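For reference, the gradient derivations implemented in this project follow the standard
backpropagation equations for a fully connected feedforward network. Writing $W^{(l)}$ and
$b^{(l)}$ for the weights and biases of layer $l$, $z^{(l)} = W^{(l)} a^{(l-1)} + b^{(l)}$ for its
pre-activations, $a^{(l)} = \sigma(z^{(l)})$ for its activations, and $C$ for the loss:

$$\delta^{(L)} = \nabla_{a^{(L)}} C \odot \sigma'(z^{(L)})$$
$$\delta^{(l)} = \big( (W^{(l+1)})^{T} \delta^{(l+1)} \big) \odot \sigma'(z^{(l)})$$
$$\frac{\partial C}{\partial W^{(l)}} = \delta^{(l)} \, (a^{(l-1)})^{T}, \qquad \frac{\partial C}{\partial b^{(l)}} = \delta^{(l)}$$

These are the quantities computed by the error_layer and current_gradient methods shown in
optimizers.py (Section 5).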
2. LITERATURE SURVEY
2.1 Existing Work
Mellah et al. [1] proposed a neural network-based estimator for brushed DC machines capable
of simultaneously predicting speed, armature temperature, and resistance using only voltage and
current measurements. To enhance learning efficiency, the authors implemented a
Cascade-Forward Neural Network (CFNN) trained using the Resilient Backpropagation (RBP)
algorithm, known for its fast convergence and robustness. Unlike traditional estimators—which
typically target a single parameter and are often prone to instability and noise sensitivity—the
proposed method eliminates the need for physical speed and thermal sensors. Comparative
results demonstrated that the neural estimator closely matches model predictions and offers
improved estimation accuracy, making it suitable for thermal monitoring and high-performance
motor drive applications.
Bülte et al. [2] introduced a graph neural network (GNN)-based framework to post-process
ensemble precipitation forecasts with a focus on extreme weather events. Unlike traditional
post-processing methods that often overlook complex spatial dependencies and tail behaviors in
precipitation data, this approach directly targets extremes in the distribution. By leveraging the
structure of GNNs, the model captures spatial correlations and enhances the accuracy of
probabilistic forecasts, particularly for extreme precipitation. Experimental comparisons showed
improved performance over standard baselines, highlighting the framework’s potential in
reducing flood risks and informing climate resilience strategies. Future directions include
integrating this approach with end-to-end forecasting systems and refining extreme-value
modeling techniques.
Karabayir et al. [3] introduced a novel optimization approach called the Evolved Gradient
Direction Optimizer (EVGO), designed to address the vanishing gradient issue commonly
encountered in training deep neural networks (DNNs). The method leverages both first-order
gradients and a specially constructed hyperplane to guide weight updates. The authors
benchmarked EVGO against several established gradient-based optimizers, including Adam,
RMSProp, and gradient descent, across datasets such as MNIST, CIFAR-10, and CIFAR-100
using well-known architectures like AlexNet and ResNet. Their experiments showed that EVGO
consistently outperformed the other optimizers in terms of accuracy and convergence, even in
deeper or narrower networks.
Yang et al. [4] proposed a hybrid training strategy called GEMONN, which integrates gradient
information into evolutionary algorithms to improve the training of deep neural networks
(DNNs). The key innovation lies in a specialized genetic operator that guides the search using
gradient directions while also optimizing for network sparsity, helping reduce complexity and
prevent overfitting. Through experiments on various architectures — including autoencoders,
LSTMs, and CNNs — GEMONN demonstrated superior performance compared to both
traditional evolutionary methods and standard gradient-based optimizers like SGD and Adam.
Although it showed slightly lower performance on CNNs, the approach still offers strong
potential, especially for resource-constrained environments where sparse networks are beneficial.
Na et al. [5] addressed the vanishing gradient problem in deep neural networks (DNNs) by
integrating batch normalization (BN) layers before each sigmoid activation layer. Their approach
was specifically applied to the modeling of microwave components, a domain known for its
complex non-linear behavior. By normalizing layer inputs with additional scaling and shifting,
the BN layers improved gradient flow and training stability. Additionally, they employed an
Automated Model Generation (AMG) algorithm to dynamically configure the network
architecture, including the number of hidden and BN layers. This combination of BN and AMG
contributed to a more robust and efficient training process for deep networks in engineering
applications.
Li et al. [6] investigated pruning techniques for Binary Neural Networks (BNNs), which are
already compact due to their binary weights and activations. Recognizing that existing pruning
methods for full-precision networks are unsuitable for BNNs, the authors introduced a novel
pruning strategy based on weight flipping frequency — a measure of how often weights change
during training. This metric serves as an indicator of a weight's sensitivity to model accuracy.
Experiments on binary versions of AlexNet and a 9-layer Network-in-Network (NIN), using the
CIFAR-10 dataset, demonstrated that their method could reduce binary operations by 20–40%
with only minimal accuracy loss. The approach also achieved significant runtime improvements,
highlighting its effectiveness for optimizing BNNs without sacrificing performance.
2.2 Limitations of Existing Work
2. Limited Exploration of Neural Network Construction from Scratch
Few studies focus on implementing neural networks from the ground up using only low-level
tools like NumPy. This leaves a gap in hands-on understanding of core components such as
forward pass, backpropagation, loss functions, and optimization routines.
3. SOFTWARE REQUIREMENTS SPECIFICATION
3.1 Overall Description
This project implements a neural network system from scratch using NumPy. The system is
composed of modular components, including classes for building models, layers, optimizers,
training routines, and plotting tools. It supports classification and regression tasks, allows custom
dataset integration, and includes visualization capabilities to monitor various training metrics.
The intended users of this system are students, educators, or developers who want to study or
demonstrate the internal workings of neural networks in a transparent and configurable
environment.
This specification outlines the functional and non-functional requirements of the system, its
operating conditions, and a summary of its design structure.
3.2 Operating Environment
Software Requirements: Python 3.9 with the NumPy, Pandas, and Matplotlib libraries, used from
Jupyter notebooks or plain Python scripts (the project was developed in VS Code).
Hardware Requirements: a standard CPU-based machine with enough memory for small to
medium-sized datasets; no specialized hardware such as a GPU is required.
3.3 Functional Requirements
The system should provide the following core functionalities for users working within Python
notebooks or scripts:
● Define feedforward network architectures by composing Layer objects with configurable
sizes and activation functions.
● Train models with gradient-based optimization (SGD) using configurable learning rates,
batch sizes, and epoch counts, for both classification and regression tasks.
● Load, standardize, and split custom datasets supplied as CSV files or NumPy arrays.
● Log training history (losses, gradients, weights, scores) and save or restore model weights
to disk.
● Generate plots of gradients, weights, scores, loss curves, and loss landscapes from the
logged history.
3.4 Non-Functional Requirements
1. Usability
The system should be intuitive and user-friendly for individuals familiar with Python
programming and Jupyter notebooks. Users should be able to interact with the system by
importing its components as Python modules and using them with minimal setup. The codebase
should follow clear naming conventions and be well-documented to support learning and
experimentation. The interface should encourage educational use and provide clarity over
automation.
2. Modularity and Maintainability
The system should be organized into independent modules (dataset utilities, functions, model
classes, optimizers, trainer, plotter) so that individual components can be modified or extended
without affecting others. The architecture should support long-term maintainability and allow new
features or improvements to be integrated with minimal disruption.
3. Performance
The system should perform efficiently for small to medium-sized datasets on CPU-based
machines. It should support mini-batch training to manage computational load and allow
flexibility in tuning performance during training. While not optimized for large-scale use, the
system should maintain acceptable responsiveness and throughput during experimentation.
4. Scalability
Although the system is not intended for industrial-scale deployment, it should be scalable within
the context of academic or prototype-level tasks. It should allow users to build deeper models,
modify layer configurations, and process larger batches without requiring structural code
changes. The design should make it possible to scale complexity upward for controlled
experimentation.
4. DESIGN
4.1 Use Case Diagram
4.2 Class Diagram
4.3 Sequence Diagram
4.4 Data Flow Diagram
4.5 System Architecture
5. IMPLEMENTATION
1. dataset_utils.py (excerpt)
import numpy as np
from numpy.random import default_rng
import pandas as pd

# Only the return statement of the mini-batch helper appears in the original excerpt;
# the enclosing function signature is reconstructed here so the snippet is runnable.
def get_minibatch(features, targets, start_at, batch_size):
    # Return one mini-batch of features and targets, clipped at the end of the arrays.
    return (features[start_at:min(features.shape[0], start_at + batch_size)],
            targets[start_at:min(targets.shape[0], start_at + batch_size)])
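The pca helper imported from nn.dataset_utils in the driver scripts later in this section is not
reproduced in the excerpt above. A minimal from-scratch PCA consistent with how the report
describes it (projecting model outputs for visualization across epochs) might look like the
following sketch; the function name and signature here are assumptions, not the project's exact
code:

import numpy as np

def pca(data, n_components=2):
    # Center the data so every feature has zero mean.
    centered = data - data.mean(axis=0)
    # Covariance matrix of the features (columns).
    cov = np.cov(centered, rowvar=False)
    # Eigen-decomposition of the symmetric covariance matrix.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    # Sort components by decreasing explained variance and keep the leading ones.
    order = np.argsort(eigenvalues)[::-1]
    components = eigenvectors[:, order[:n_components]]
    # Project the centered data onto the principal components.
    return centered @ components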
2. functions.py
import numpy as np

@np.vectorize
def relu(x):
    return x if x > 0 else 0

@np.vectorize
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

@np.vectorize
def round_off(x):
    return 1 if x >= 0.5 else 0

@np.vectorize
def mirror(x):
    # Identity activation, used for the linear output layer in regression.
    return x

class MSE:
    def calculate_loss(self, labels, predictions):
        labels = labels.reshape(predictions.shape)
        return (1 / labels.size) * np.sum((labels - predictions) ** 2)

class BinaryLoss(MSE):
    def calculate_loss(self, labels, predictions, epsilon=1e-7):
        # Binary cross-entropy; the body after the counter initialization is
        # reconstructed here (the original excerpt breaks off at that point).
        total = predictions.size
        labels = labels.reshape(predictions.shape)
        predictions = np.clip(predictions, epsilon, 1 - epsilon)
        summation = np.sum(labels * np.log(predictions) + (1 - labels) * np.log(1 - predictions))
        return -summation / total

    def der_loss(self, label, output, epsilon=1e-7):  # output belongs to range [0, 1]
        output = np.clip(output, epsilon, 1 - epsilon)
        return ((1 - label) / (1 - output)) - (label / output)
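As a quick illustration of how these loss classes behave (a usage sketch, not part of the project
files), the binary cross-entropy and its derivative can be evaluated on a toy prediction:

import numpy as np
# Assumes the BinaryLoss class defined above is importable from nn.functions.
from nn.functions import BinaryLoss

loss_fn = BinaryLoss()
labels = np.array([0, 0, 0, 1])                     # AND gate targets
predictions = np.array([0.03, 0.06, 0.07, 0.95])    # outputs after training (see Section 6)
print(loss_fn.calculate_loss(labels, predictions))  # small average cross-entropy
print(loss_fn.der_loss(labels, predictions))        # element-wise gradient w.r.t. the outputs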
3. model_classes.py (excerpt)
import numpy as np
from numpy.random import default_rng
from math import sqrt
import os

class Layer:
    def __init__(self, n_neurons, activation=None, der_activation=None):
        # Constructor reconstructed from how Layer is instantiated in the driver
        # scripts below (size plus an activation function and its derivative).
        self.n_neurons = n_neurons
        self.activation = activation
        self.der_activation = der_activation

    def init_biases(self):
        self.b_ = np.zeros((self.n_neurons, 1))

    def activate(self):
        self.a_ = self.activation(self.z_)

    def der_activate(self):
        return self.der_activation(self.z_)

class Weights:
    def init_matrix(self, seed):  # method name and signature assumed; only the
        # generator line survives in this excerpt.
        generator = default_rng(seed)

class Model:
    def compile(self):
        # Excerpt: one seed per weight matrix is derived from the model's seed so that
        # initialization is reproducible; the rest of compile is omitted here.
        generator = default_rng(seed=self.seed)
        seeds = generator.integers(0, len(self.layers) * 100, (len(self.layers) - 1,))
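The initialization formula used inside the Weights class is not visible in the excerpt above; the
"from math import sqrt" import together with the seeded generator suggests a scaled random
initialization. The following is only a plausible He-style sketch under that assumption, not the
project's confirmed formula:

import numpy as np
from numpy.random import default_rng
from math import sqrt

def init_weight_matrix(n_out, n_in, seed):
    # He initialization: zero-mean normal with variance 2 / fan_in, a common choice
    # for ReLU-family activations (the project's exact formula may differ).
    generator = default_rng(seed)
    return generator.normal(0.0, sqrt(2.0 / n_in), size=(n_out, n_in))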
4. optimizers.py
import numpy as np

# The optimizer gives the trainer the gradients[] matrices to apply to the weights.
class SGD():
    def set_model(self, model):
        self.model = model

    def current_gradient(self, weight_index):
        # dL/dW for one weight matrix: the next layer's error term (del_) times the
        # transpose of the previous layer's activations.
        return np.matmul(self.model.weights[weight_index].layer_2.del_,
                         np.transpose(self.model.weights[weight_index].layer_1.a_))

    def error_layer(self, this_index, weight_index):
        # Backpropagate the error through the weights connecting this layer to the
        # next layer: (W^T · del_next) ⊙ σ'(z) for this layer.
        return np.matmul(
            np.transpose(self.model.weights[weight_index].matrix),
            self.model.layers[this_index + 1].del_) * self.model.layers[this_index].der_activate()

    def on_pass(self):
        # Reset accumulated gradients before the next forward/backward pass.
        for weight in self.model.weights:
            weight.gradients = np.zeros((weight.rows, weight.cols))
        # Reset error terms and bias gradients for every layer.
        for layer in self.model.layers:
            layer.del_ = np.zeros((layer.n_neurons, 1))
            layer.b_gradients = np.zeros((layer.n_neurons, 1))
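The excerpt above computes and stores gradients; the actual parameter update is applied by the
trainer. A minimal sketch of the gradient-descent update step, using the attribute names shown
above but with an assumed helper function (not the project's actual code), would be:

def apply_sgd_update(model, learning_rate, batch_size):
    # Vanilla SGD: move each parameter against its gradient averaged over the mini-batch.
    for weight in model.weights:
        weight.matrix -= learning_rate * weight.gradients / batch_size
    for layer in model.layers[1:]:  # skip the input layer, assuming it holds no trainable biases
        layer.b_ -= learning_rate * layer.b_gradients / batch_size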
5. plotter.py
import numpy as np
from numpy.random import default_rng
import matplotlib.pyplot as plt
from nn.trainer import Logger
import os, time

class Plotter:
    def read_file(self, path):
        self.data = Logger.load_data(path)
        if self.data is None:
            print("unsuccessful.")
        else:
            print("file read.")

    def plot_gradients(self, dir, name=None, n_points=None):
        if not os.access(dir, os.F_OK):
            print(f'cannot access {dir}')
            return
        time_str = time.strftime("%I-%M-%S_%p", time.localtime(time.time()))
        name = "/gradients_" + (time_str if name is None else name) + ".png"
        weights_gradients = dict()
        bias_gradients = dict()
        # Collect the logged gradients of every weight matrix and bias vector,
        # update by update, keeping at most the first four elements of each.
        for i in range(self.data['n-epochs']):
            epoch = f'epoch-{i}'
            for j in range(self.data[epoch]['n-updates']):
                update = f'update-{j}'
                for w in self.data['n_weights']:
                    weight = f'weights-gradient-{w}'
                    if weight not in weights_gradients:
                        weights_gradients[weight] = list()
                    array = np.asarray(self.data[epoch][update][weight]).ravel()[:4]
                    if array.shape[0] > 1:
                        array = array.reshape((2, -1))
                    else:
                        array = array.reshape((1, 1))
                    weights_gradients[weight].append(array)
                for b in self.data['n_weights']:
                    b += 1
                    bias = f'bias-gradient-{b}'
                    if bias not in bias_gradients:
                        bias_gradients[bias] = list()
                    array = np.asarray(self.data[epoch][update][bias]).ravel()[:4]
                    array = array.reshape((-1, 1))
                    bias_gradients[bias].append(array)
        fig, axs = plt.subplots(2, len(self.data['n_weights']),
                                figsize=(7 * len(self.data['n_weights']), 8),
                                gridspec_kw={'wspace': 0.2, 'hspace': 0.3}, squeeze=False)
        fig.suptitle('Gradients', size='xx-large')
        # weights
        for ax in range(len(self.data['n_weights'])):
            weight = f"weights-gradient-{self.data['n_weights'][ax]}"
            # Convert the entire history of gradients into one NumPy array.
            weights_gradients[weight] = np.asarray(weights_gradients[weight])
            if n_points is not None:
                weights_gradients[weight] = weights_gradients[weight][:n_points]
            for r in range(weights_gradients[weight][0].shape[0]):
                for c in range(weights_gradients[weight][0].shape[1]):
                    axs[0, ax].plot(weights_gradients[weight][:, r, c], label=f'{r*2 + c}', linewidth=0.7)
            axs[0, ax].legend(title='elements')
            axs[0, ax].set_title(f"weights-{self.data['n_weights'][ax]}")
            axs[0, ax].set_xlabel("Updates")
        # bias
        for ax in range(len(self.data['n_weights'])):
            bias = f"bias-gradient-{self.data['n_weights'][ax] + 1}"
            bias_gradients[bias] = np.asarray(bias_gradients[bias])
            if n_points is not None:
                bias_gradients[bias] = bias_gradients[bias][:n_points]
            for r in range(bias_gradients[bias][0].shape[0]):
                axs[1, ax].plot(bias_gradients[bias][:, r, 0], label=f'{r}', linewidth=0.7)
            axs[1, ax].legend(title='elements')
            axs[1, ax].set_title(f"bias-{self.data['n_weights'][ax] + 1}")
            axs[1, ax].set_xlabel("Updates")
        fig.savefig(dir + name)  # the original excerpt ends here; saving to dir + name is implied
                                 # by the path construction above
6. trainer.py (excerpt)
import numpy as np
from numpy.random import default_rng
from nn.dataset_utils import get_minibatch
from nn.functions import round_off
import json, os, time

class Trainer:
    def accumulate_gradients(self):  # method name assumed; only this loop appears
        # in the original excerpt.
        for i in range(len(self.model.weights)):
            self.current_gradient(i)

    def confusion_matrix(self, stack):  # method name and signature assumed;
        # 'stack' pairs each true label (column 0) with the rounded prediction (column 1).
        matrix = {'tp': 0, 'fn': 0, 'fp': 0, 'tn': 0}  # counter initialization reconstructed
        for i in range(stack.shape[0]):
            if stack[i][0] == 1:
                if stack[i][1] == 1:
                    matrix['tp'] += 1
                else:
                    matrix['fn'] += 1
            else:
                if stack[i][1] == 1:
                    matrix['fp'] += 1
                else:
                    matrix['tn'] += 1
        return matrix
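From the confusion matrix returned above, the usual classification metrics follow directly; a
small helper like the following (not part of the project files, shown for clarity) computes
accuracy:

def accuracy(matrix):
    # Fraction of correctly classified samples: (TP + TN) / all samples.
    total = matrix['tp'] + matrix['tn'] + matrix['fp'] + matrix['fn']
    return (matrix['tp'] + matrix['tn']) / total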
Driver script – AND gate classification task (excerpt):
# The imports and the AND gate dataset are not shown in the original excerpt; the
# following lines are reconstructed so the snippet runs end to end.
import numpy as np
from nn.model_classes import Model, Layer
from nn.functions import BinaryLoss, leaky_relu, der_leaky_relu, sigmoid, der_sigmoid
from nn.trainer import Trainer
from nn.plotter import Plotter
from nn.optimizers import SGD

X_train = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_train = np.array([0, 0, 0, 1])

model = Model(BinaryLoss(), 1)
model.add_layer(Layer(2))
model.add_layer(Layer(2, leaky_relu(), der_leaky_relu()))
model.add_layer(Layer(1, sigmoid, der_sigmoid))
model.compile()

plotter = Plotter()
print('weights:')
model.show_weights()
print('biases:')
model.show_biases()

trainer = Trainer(model, SGD())  # reconstructed; mirrors RegressionTrainer(model, SGD()) used later
trainer.train(X_train, y_train, 1, 0.02, epochs=120)
trainer.save_history('./logs', 'batch_size_1')
model.save_weights('./models', 'batch_size_1')

plotter.read_file('./logs/batch_size_1.txt')
plotter.plot_gradients('./plots', 'gradients_batch_size_1', 700)
plotter.plot_weights('./plots', 'weights_batch_size_1', 700)
plotter.plot_score('./plots', 'accuracy_batch_size_1', 700)
Driver script – Kaggle dataset (raisin), classification task (excerpt):
import pandas as pd
import numpy as np
from nn.model_classes import Model, Layer
from nn.functions import BinaryLoss, leaky_relu, der_leaky_relu, sigmoid, der_sigmoid
from nn.trainer import Trainer
from nn.plotter import Plotter
from nn.dataset_utils import pca, standardize_data, split_classes
from nn.optimizers import SGD

dataset = pd.read_csv(r'datasets\raisin_Sruthi.csv')
dataset.info()

# ... (cells that split the dataset into X_train / y_train are omitted in this excerpt)
X_train = X_train.to_numpy()
X_means, X_stds = standardize_data(X_train)
X_train

model = Model(BinaryLoss(), 5)
model.add_layer(Layer(7))
model.add_layer(Layer(7, leaky_relu(), der_leaky_relu()))
model.add_layer(Layer(7, leaky_relu(), der_leaky_relu()))
model.add_layer(Layer(7, leaky_relu(), der_leaky_relu()))
model.add_layer(Layer(1, sigmoid, der_sigmoid))
model.compile()

y_train = y_train.to_numpy()
y_train[:5]

# ... (the Trainer construction and the trainer.train(...) call are omitted in this excerpt)
trainer.save_history('./logs', 'dataset_classification')
model.save_weights("./models", "dataset_classification")

plotter = Plotter()
plotter.read_file(r'logs\dataset_classification.txt')
Driver script – Kaggle dataset (medical insurance expenses), regression task (excerpt):
import pandas as pd
from nn.model_classes import Model, Layer
from nn.functions import MSE, leaky_relu, der_leaky_relu, sigmoid, der_sigmoid, mirror, der_mirror
from nn.trainer import RegressionTrainer
from nn.plotter import Plotter
from nn.dataset_utils import pca, standardize_data, split_data
from nn.optimizers import SGD

dataset = pd.read_csv(r'datasets\expenses_Sruthi.csv')
dataset.info()

model = Model(MSE(), 5)
model.add_layer(Layer(8))
model.add_layer(Layer(8, leaky_relu(), der_leaky_relu()))
model.add_layer(Layer(8, leaky_relu(), der_leaky_relu()))
model.add_layer(Layer(1, mirror, der_mirror))
model.compile()

trainer = RegressionTrainer(model, SGD())

# ... (cells that split the dataset into X_train / y_train are omitted in this excerpt)
X_train = X_train.to_numpy()
y_train = y_train.to_numpy()
y_train = y_train.reshape((y_train.shape[0], 1))
X_means, X_stds = standardize_data(X_train, [0, 1])
y_means, y_stds = standardize_data(y_train)

# ... (the trainer.train(...) call is omitted in this excerpt)
trainer.save_history('./logs', 'dataset_regression')

plotter = Plotter()
plotter.read_file(r'logs\dataset_regression.txt')
plotter.plot_gradients("./plots", "dataset_regression", 700)
plotter.plot_weights("./plots", "dataset_regression", 700)
plotter.plot_score("./plots", "dataset_regression", 700, False)
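Because the regression targets are standardized before training (y_means, y_stds above), the
model's outputs live in the standardized space. The following is only a usage sketch of the
post-processing step that maps predictions back to the original target scale; the Model's exact
prediction call is not shown in these excerpts, and the numbers are illustrative:

import numpy as np

# Example values standing in for the (means, stds) returned by standardize_data(y_train).
y_means, y_stds = np.array([13000.0]), np.array([12000.0])

# 'predictions_std' stands for the trained model's outputs in standardized space.
predictions_std = np.array([[-0.42], [1.37]])
predictions = predictions_std * y_stds + y_means  # invert the target standardization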
6. TESTING
To ensure the correctness and reliability of the neural network system, several forms of testing
were conducted. Given that this is not a traditional software application with UI elements or
discrete functional modules, the focus of testing shifted toward verifying the correctness of
mathematical computations (forward and backward passes), observing learning behavior over
time, and confirming that the system could make accurate predictions on given datasets.
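One way the correctness of the backward pass can be checked, independent of any framework, is a
finite-difference gradient check. The sketch below (not taken from the project files, and assuming
the cross-entropy form shown in Section 5) compares the analytic derivative from der_loss against
a numerical estimate of the same quantity:

import numpy as np
from nn.functions import BinaryLoss  # class shown in Section 5

loss_fn = BinaryLoss()
label, output, eps = 1.0, 0.7, 1e-6

# Analytic derivative dL/d(output) from der_loss.
analytic = loss_fn.der_loss(label, output)

# Central finite difference of the scalar loss around the same output value.
loss = lambda o: loss_fn.calculate_loss(np.array([label]), np.array([o]))
numeric = (loss(output + eps) - loss(output - eps)) / (2 * eps)

print(analytic, numeric)  # the two values should agree to several decimal places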
Test Case: AND gate predictions
Description: Check that a trained model can output correct values for AND gate inputs after training.
Input: A small classification dataset (AND gate, batch size 1) trained over 100 epochs.
Expected Output: Approximately 0 for the first three inputs, and near 1 for [1, 1].
Observed Result: Final predictions after training: [0.03, 0.06, 0.07, 0.95].

Test Case: Gradient stability
Expected Output: Smooth, converging gradient trajectories; no abrupt spikes.
Observed Result: Plots of gradients confirmed a steady downward trend and numerical stability during updates.

Test Case: Classification accuracy
Expected Output: Training and test accuracy should be >85% with no overfitting or underfitting symptoms.

Test Case: Loss convergence
Description: Check that the loss value decreases consistently across epochs during training.
Input: Any dataset with a known ground truth; typically trained over 100+ epochs.
Expected Output: Smooth downward loss curve; convergence after sufficient iterations.
Observed Result: All training sessions produced typical exponential-decay loss patterns; no signs of divergence or instability.
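The first test case above can be automated with a simple assertion on the rounded predictions, as
in this sketch (the model's inference call is not shown in the Section 5 excerpts, so the observed
outputs from the table stand in for it):

import numpy as np
from nn.functions import round_off

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
expected = np.array([0, 0, 0, 1])

# 'predictions' stands for the trained model's outputs on X, here the values
# [0.03, 0.06, 0.07, 0.95] observed in the table above.
predictions = np.array([0.03, 0.06, 0.07, 0.95])
assert np.array_equal(round_off(predictions), expected), "AND gate check failed"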
7. SCREENSHOTS
This section presents a collection of visual outputs and screenshots that highlight key aspects of
the project. The majority of the images consist of graphs generated during model training, which
served as critical tools for observing and validating the behavior of the neural network.
Additionally, this section includes selected screenshots of the development environment (VS
Code and Jupyter notebooks), showcasing how different modules interact during training and
evaluation. Together, these visuals offer a comprehensive look into both the internal dynamics of
the neural network and the engineering effort behind the system.
7.1 AND gate dataset – Classification Task
7.2 Kaggle dataset – Classification Task
7.3 Kaggle dataset – Regression Task
7.4 Project Structure (VS Code)
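The screenshot itself is not reproduced in this document; based on the import statements and paths
used in Section 5, the package layout is approximately the following (file roles inferred from the
code excerpts):

project/
├── nn/
│   ├── dataset_utils.py    # mini-batching, standardization, train/test splits, PCA
│   ├── functions.py        # activation functions and loss classes (MSE, BinaryLoss)
│   ├── model_classes.py    # Layer, Weights, and Model classes
│   ├── optimizers.py       # SGD optimizer (gradient computation and resets)
│   ├── plotter.py          # Plotter for gradient, weight, score, and loss plots
│   └── trainer.py          # Trainer / RegressionTrainer and the Logger
├── datasets/               # CSV datasets (e.g. raisin_Sruthi.csv, expenses_Sruthi.csv)
├── logs/                   # training history files written by save_history
├── models/                 # saved weights written by save_weights
└── plots/                  # generated figures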
8. CONCLUSION
The primary goal of this project was to gain a foundational understanding of neural networks by
implementing one from scratch without relying on high-level machine learning libraries.
Through this process, we successfully constructed a fully functional feedforward neural network
using only NumPy and Python, carefully designing every component including the forward pass,
backpropagation, loss computation, gradient updates, and training loop. The network was
evaluated on both simple and moderately complex datasets, where it demonstrated strong
learning capabilities and competitive performance. Visual tools played a central role in our
workflow, providing crucial insights into the training dynamics and helping us verify the
correctness of the implementation.
This hands-on approach allowed us to go beyond the surface-level use of popular frameworks
and engage directly with the mathematical and algorithmic core of deep learning. We not only
implemented the foundational equations but also handled data preprocessing, training utilities,
modular design, and real-time monitoring. This comprehensive experience helped solidify our
understanding of key concepts such as weight initialization, gradient descent, overfitting, and
loss landscapes. Overall, the project was successful in achieving its pedagogical objectives while
also delivering a practical and testable neural network system.
Future Scope
While the current project successfully delivered a functional neural network system built from
first principles, there remains significant room for both horizontal and vertical expansion. Future
work can further improve the system’s learning capabilities, flexibility, and applicability across
different problem domains, all while remaining grounded in the context of fundamental feedforward
neural networks. Possible directions include the following:
● Adding support for early stopping, dynamic learning rate schedules, and other training
control mechanisms.
● Designing an online learning module that allows the model to train incrementally on
streaming data, enabling use in dynamic environments such as anomaly detection.
● Developing a use-case-specific application — for example, a cyberattack detection
system — where data is generated or labeled in real time, simulating a high-stakes
decision-making scenario.
● Applying the network to sensor-based domains such as predictive maintenance or activity
recognition, where the temporal structure of incoming data can still be explored with
feedforward models using engineered features.
● Building a simple graphical interface or API endpoint so that non-technical users can
input data and view model predictions interactively.
These extensions would not only make the system more robust and versatile but also highlight
the potential of even basic neural architectures when carefully engineered and applied to
domain-specific challenges.
REFERENCES
[1] Mellah, Hacene, Kamel Eddine Hemsas, and Rachid Taleb. "Cascade-Forward Neural
Network Based on Resilient Backpropagation for Simultaneous Parameters and State Space
Estimations of Brushed DC Machines." arXiv preprint arXiv:2104.04348 (2021).
[2] Bülte, Christopher, et al. "Graph Neural Networks for Enhancing Ensemble Forecasts of
Extreme Rainfall." arXiv preprint arXiv:2504.05471 (2025).
[3] Karabayir, Ibrahim, Oguz Akbilgic, and Nihat Tas. "A novel learning algorithm to optimize
deep neural networks: Evolved gradient direction optimizer (EVGO)." IEEE Transactions
on Neural Networks and Learning Systems 32.2 (2020): 685-694.
[4] Yang, Shangshang, et al. "A gradient-guided evolutionary approach to training deep neural
networks." IEEE Transactions on Neural Networks and Learning Systems 33.9 (2021):
4861-4875.
[5] Na, Weicong, et al. "Deep neural network with batch normalization for automated
modeling of microwave components." 2020 IEEE MTT-S International Conference on
Numerical Electromagnetic and Multiphysics Modeling and Optimization (NEMO). IEEE,
2020.
[6] Li, Yixing, and Fengbo Ren. "BNN pruning: Pruning binary neural network guided by
weight flipping frequency." 2020 21st International Symposium on Quality Electronic
Design (ISQED). IEEE, 2020.
Bibliography
● Documentation: NumPy – https://numpy.org/doc/2.2/
● Documentation: Matplotlib – https://matplotlib.org/stable/index.html
● Documentation: Pandas – https://pandas.pydata.org/docs/
● Documentation: Python 3.9 – https://docs.python.org/3.9/
● Article: “Neural Networks Representation” –
https://www.jeremyjordan.me/intro-to-neural-networks/
● Article: “Mechanics of a Simple Neural Network” –
https://shotlefttodatascience.com/2020/08/17/the-mechanics-of-a-simple-neural-network/
● Article: “Neural Networks Structure” –
https://python-course.eu/machine-learning/neural-networks-structure-weights-and-matrices.php
● Article: “Weight Initialization Techniques in Neural Networks” –
https://www.pinecone.io/learn/weight-initialization/
● Article: “How the Backpropagation Algorithm Works” –
http://neuralnetworksanddeeplearning.com/chap2.html
● Article: “Gradient Accumulation” –
https://medium.com/@harshit158/gradient-accumulation-307de7599e87
● Article: “Visualizing the Loss Landscape of a Neural Network” –
https://mathformachines.com/posts/visualizing-the-loss-landscape/#the-linear-case
● Article: “Principal Component Analysis” –
https://www.turing.com/kb/guide-to-principal-component-analysis#step-4:-feature-vector
● Kaggle Regression dataset: “Medical Insurance” –
https://www.kaggle.com/datasets/harshsingh2209/medical-insurance-payout
● Kaggle Classification dataset – “Raisin Binary Classification” –
https://www.kaggle.com/datasets/nimapourmoradi/raisin-binary-classification