BIRLA INSTITUTE OF TECHNOLOGY AND SCIENCE, PILANI

INSTRUCTION DIVISION
FIRST SEMESTER 2023-2024
BITS F464 – Machine Learning
Assignment #1
Weightage: 10% (20 Marks)
Due Date: 09/11/2023

Bayesian Machine Learning


Introduction*
Bayesian ML is a paradigm for constructing statistical models based on Bayes’
Theorem:
p(θ|x) = p(x|θ) p(θ) / p(x)
Generally speaking, the goal of Bayesian ML is to estimate the posterior distribution
p(θ|x) given the likelihood p(x|θ) and the prior distribution p(θ). The likelihood is
something that can be estimated from the training data.
In fact, that's exactly what we're doing when training a regular machine learning
model: we're performing Maximum Likelihood Estimation (MLE), a (typically iterative)
process that updates the model's parameters so as to maximize the probability of seeing
the training data x given the model parameters θ.
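To make this concrete, here is a tiny sketch of MLE for a Bernoulli (coin-flip) model; the data below are made up for illustration, and the grid search is just one simple way of finding the maximizer.

```python
import numpy as np

# Hypothetical coin-flip data (made up for this illustration): 1 = heads, 0 = tails.
x = np.array([1, 0, 1, 1, 0, 1, 1, 1, 0, 1])

def log_likelihood(theta, x):
    # log p(x | theta) under a Bernoulli model with parameter theta
    return np.sum(x * np.log(theta) + (1 - x) * np.log(1 - theta))

# Maximum Likelihood Estimation: pick the theta that makes the observed
# data most probable (here found by a simple grid search).
thetas = np.linspace(0.01, 0.99, 99)
theta_mle = thetas[np.argmax([log_likelihood(t, x) for t in thetas])]

print(theta_mle)   # ~0.70
print(x.mean())    # the closed-form Bernoulli MLE is just the sample mean
```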
So how does the Bayesian paradigm differ? Things get turned on their head: here we
instead seek to maximize the posterior distribution, which treats the training data as
fixed and asks how probable any parameter setting θ is given that data. We call this
process Maximum a Posteriori (MAP) estimation. It's easier, however, to think about it
in terms of the likelihood function. By Bayes' Theorem we can write the posterior as
p(θ|x) ∝ p(x|θ) p(θ)
Here we leave out the denominator p(x) because the maximization is taken with respect
to θ, on which p(x) does not depend, so it can be ignored in the maximization
procedure. The key piece of the puzzle that leads Bayesian models to differ from their
classical counterparts trained by MLE is the inclusion of the term p(θ). We call this
the prior distribution over θ.
Its purpose is to encode our beliefs about the model's parameters before we've even
seen the data. That is to say, we can often make reasonable assumptions about the
"suitability" of different parameter configurations based simply on what we know about
the problem domain and the laws of statistics. For example,
it’s pretty common to use a Gaussian prior over the model’s parameters. This means
we assume that they’re drawn from a normal distribution having some mean and
variance. This distribution’s classic bell-curved shape consolidates most of its mass
close to the mean while values towards its tails are rather rare.
By using such a prior, we’re effectively stating a belief that most of the model’s weights
will fall in some narrow range about a mean value with the exception of a few outliers,
and this is pretty reasonable given what we know about most real-world phenomena.
It turns out that using these prior distributions and performing MAP is equivalent to
performing MLE in the classical sense with the addition of regularization; in
particular, a zero-mean Gaussian prior corresponds to L2 (ridge) regularization.
There's a fairly easy mathematical proof of this fact that we won't go into here, but
the gist is that by constraining the acceptable model weights via the prior we're
effectively imposing a regularizer.
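As a rough sketch of that equivalence, assuming a linear model with Gaussian noise and a zero-mean Gaussian prior on the weights (the synthetic data and the variance values below are illustrative assumptions), the MAP estimate coincides with the ridge regression solution:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Synthetic data: y = X @ w_true + noise (purely illustrative).
X = rng.normal(size=(50, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=50)

sigma2 = 0.1 ** 2   # assumed Gaussian noise variance (likelihood)
tau2 = 1.0          # assumed Gaussian prior variance on the weights

# MAP estimate: maximize log p(y|X, w) + log p(w).  With a Gaussian likelihood
# and a N(0, tau2 I) prior, this reduces to minimizing
#   ||y - X w||^2 + (sigma2 / tau2) * ||w||^2,
# whose closed-form solution is (X^T X + lam I)^{-1} X^T y.
lam = sigma2 / tau2
w_map = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)

# The same objective is ordinary ridge (L2-regularized) regression.
w_ridge = Ridge(alpha=lam, fit_intercept=False).fit(X, y).coef_

print(np.allclose(w_map, w_ridge))  # True (up to numerical tolerance)
```

Notice that the regularization strength is the ratio of the noise variance to the prior variance: a tighter (smaller-variance) prior imposes stronger shrinkage on the weights.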
* Source: "Bayesian Machine Learning," DataRobot, September 3, 2020.
Implement the following algorithms (an illustrative sketch of the first one appears after this list):
1. Naive Bayes Classifier
2. Bayesian Belief Networks
3. Bayesian Linear Regression
4. Expectation Maximization Clustering
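For orientation, a Gaussian Naive Bayes classifier can be sketched roughly as follows; the tiny dataset and the class name below are made-up illustrations, and you are expected to build and test your own implementations on real data.

```python
import numpy as np

class GaussianNaiveBayes:
    """Minimal Gaussian Naive Bayes: features are assumed conditionally
    independent and Gaussian within each class."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.priors_ = np.array([np.mean(y == c) for c in self.classes_])
        self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        self.vars_ = np.array([X[y == c].var(axis=0) + 1e-9 for c in self.classes_])
        return self

    def predict(self, X):
        # For each class c: log p(c) + sum_j log N(x_j | mean_cj, var_cj)
        log_post = []
        for p, m, v in zip(self.priors_, self.means_, self.vars_):
            ll = -0.5 * np.sum(np.log(2 * np.pi * v) + (X - m) ** 2 / v, axis=1)
            log_post.append(np.log(p) + ll)
        return self.classes_[np.argmax(np.array(log_post), axis=0)]

# Tiny made-up dataset: two classes, two features.
X = np.array([[1.0, 2.0], [1.2, 1.8], [0.9, 2.2],
              [4.0, 5.0], [4.2, 4.8], [3.9, 5.1]])
y = np.array([0, 0, 0, 1, 1, 1])

model = GaussianNaiveBayes().fit(X, y)
print(model.predict(np.array([[1.1, 2.0], [4.1, 5.0]])))  # expect [0, 1]
```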

Group Information: You can work in groups of at most THREE (3)! Mark your group in the following
spreadsheet:

[Link]
6OHTxrBJlAXXU6PXQR56VQKWfnu_NP90wT6e6mY/edit?usp=sharing

What do you need to submit?


Submit a report in which you describe the above algorithms, along with details of the data
used and the results obtained. You also need to submit code files for each algorithm. Each group will
submit everything in a single ZIP file named GroupXX, where XX is your group number.

Grading: We will evaluate your submitted files and call you for a viva! You will be evaluated mainly
based on what you have understood, not on what you have submitted. Merely submitting the
assignment (and not appearing for the viva) does not entitle you to any marks. All members of the
group need to be present for the viva, and there will be differential marking.

Navneet Goyal
