
Unit - II

General Linear Model

Introduction

The generalized linear model (GLM) unifies linear and non-linear regression models and
also allows the incorporation of non-normal response distributions. In a GLM, the response
variable's distribution need only be a member of the exponential family, which includes the
normal, Poisson, binomial, exponential and gamma distributions as members. Furthermore,
the normal-error linear model is just a special case of the GLM, so in many ways the GLM can
be thought of as a unifying approach to many aspects of empirical modelling and data
analysis.

Components of General linear model

The General Linear Model (GLM) is a flexible framework used in statistics to model
relationships between dependent variables and one or more independent variables. Its
components typically include:

1. Dependent Variable (Y): This is the variable that you are trying to predict or explain.
In statistical notation, it is often denoted as Y.
2. Independent Variables (X): These are the explanatory variables that are
hypothesized to have an effect on the dependent variable. In statistical notation, these
are often denoted as X1, X2, ⋯, Xp, where p is the number of independent variables.
3. Linear Combination: The relationship between the dependent variable and the
independent variables is expressed as a linear combination, often written as:

Y = β0 + β1X1 + β2X2 + ⋯ + βpXp + ε

Where β0, β1, ⋯, βp are the coefficients (parameters) that represent the effect of each
independent variable, and ε is the error term representing the variability in Y
that is not explained by the model.

4. Error Term (ε): This accounts for the variability in the dependent variable that is not
explained by the independent variables included in the model. It is assumed to be
normally distributed with mean 0 and constant variance.
5. Assumptions: The GLM typically assumes that the errors (ε) are independent,
normally distributed with constant variance (homoscedasticity), and that the model is
correctly specified (no omitted-variable bias).
6. Estimation Method: The parameters (coefficients) in the GLM are often estimated
using least squares methods, maximum likelihood estimation, or other techniques
depending on the specific context and assumptions.
7. Hypothesis Testing: Inferences about the parameters (such as testing hypotheses
about their values) are often conducted using methods like t-tests, F-tests, or
likelihood ratio tests.
These components make up the basic structure of the General Linear Model, which is a
foundation for many statistical techniques, including linear regression, analysis of variance
(ANOVA), analysis of covariance (ANCOVA), and others.
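As an illustration of the structure above, the coefficients of such a model can be estimated by ordinary least squares. The sketch below uses NumPy with simulated data; the variable names and "true" coefficient values are made up purely for the example:

```python
import numpy as np

# Simulated data: n = 100 observations, p = 2 independent variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
beta_true = np.array([1.0, 2.0, -0.5])   # beta0 (intercept), beta1, beta2
y = beta_true[0] + X @ beta_true[1:] + rng.normal(scale=0.1, size=100)

# Least-squares estimation: prepend a column of ones so the first
# fitted coefficient plays the role of the intercept beta0.
X_design = np.column_stack([np.ones(len(X)), X])
beta_hat, *_ = np.linalg.lstsq(X_design, y, rcond=None)

# The residuals estimate the error term epsilon.
residuals = y - X_design @ beta_hat
```

With the small error variance simulated here, the fitted coefficients land close to the true ones, and the residuals have mean zero because an intercept is included.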

Applications:

The GLM is widely applicable across different fields and is foundational to various statistical
techniques, including:

 Linear Regression: When Y is continuous and normally distributed.
 Analysis of Variance (ANOVA): When Y is continuous and the predictors X are
categorical grouping variables.
 Analysis of Covariance (ANCOVA): Extends ANOVA by including continuous
covariates X.
 Multivariate Analysis of Variance (MANOVA): When there are multiple dependent
variables.
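To illustrate the ANOVA case, group membership can be encoded as indicator (dummy) variables, which turns the group-comparison problem into the same linear-model fit. A minimal sketch, with made-up measurements for three groups:

```python
import numpy as np

# Hypothetical measurements from three groups (three observations each).
y = np.array([2.1, 1.9, 2.0, 3.2, 3.0, 3.1, 4.8, 5.1, 5.0])
group = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])

# Dummy coding: the intercept estimates the mean of group 0, and the
# remaining coefficients estimate each group's offset from that mean.
X = np.column_stack([np.ones(len(y)), group == 1, group == 2]).astype(float)
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Here beta_hat[0] recovers the mean of group 0, and beta_hat[1], beta_hat[2] the offsets of groups 1 and 2 from it; an F-test on those offsets is the usual one-way ANOVA test.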

Binomial Logit Model:

The Binomial Logit Model is a specific type of Generalized Linear Model (GLM) used for
modelling binary outcomes, where the response variable Y takes on values of 0 or 1. It is
particularly useful when analysing situations where the outcome is binary, such as
success/failure, yes/no, or presence/absence.

Components of the Binomial Logit Model:

1. Dependent Variable (Y):
o Y represents the binary outcome variable. For example, Y=1 might denote a
success (e.g., a customer making a purchase), and Y=0 a failure (e.g., a
customer not making a purchase).
2. Independent Variables (X):
o These are predictor variables that are hypothesized to influence the probability
of the binary outcome Y. These predictors can be continuous, categorical, or a
mix of both.
3. Logit Link Function:
o In the Binomial Logit Model, the relationship between the probability of
success (or the expected value of Y) and the predictors is modelled using the
logit function:

logit(p) = log( p / (1 − p) ) = β0 + β1X1 + β2X2 + ⋯ + βpXp

Where p = P(Y = 1 | X1, X2, ⋯, Xp) is the probability of Y=1, and
β0, β1, ⋯, βp are the coefficients (log-odds or logits) that represent the effect of
each X variable on the log-odds of success.
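The logit and its inverse (the logistic, or sigmoid, function) can be sketched directly; the coefficient and predictor values below are hypothetical, chosen only for illustration:

```python
import math

def logit(p):
    # Log-odds of a probability p in (0, 1).
    return math.log(p / (1 - p))

def inv_logit(eta):
    # Logistic (sigmoid) function: maps log-odds back to a probability.
    return 1 / (1 + math.exp(-eta))

# Hypothetical fitted coefficients and a single predictor value.
beta0, beta1, x = -1.0, 0.8, 2.0
eta = beta0 + beta1 * x   # linear predictor on the log-odds scale
p = inv_logit(eta)        # estimated probability that Y = 1
```

Each unit increase in x shifts the log-odds by beta1 = 0.8, i.e. multiplies the odds p / (1 − p) by exp(0.8).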
Probit link function as a popular choice of inverse cumulative distribution function
The inverse of any continuous cumulative distribution function (CDF) can be used for
the link, since the range of a CDF is [0, 1], the range of the binomial mean. The normal CDF Φ is
a popular choice and yields the probit model. Its link is

g(p) = Φ⁻¹(p)

The probit model is motivated by the fact that a constant scaling of the input
variable to the normal CDF (which can be absorbed through equivalent scaling of all of the
parameters) yields a function that is practically identical to the logistic function, yet probit
models are more tractable than logit models in some situations.
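The near-equivalence claimed above can be checked numerically: with the input scaled by roughly 1/1.6 (a common rule-of-thumb constant, not a fitted value), the standard normal CDF stays within about 0.02 of the logistic function over the central range:

```python
import math

def normal_cdf(x):
    # Standard normal CDF, written via the error function.
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def logistic(x):
    return 1 / (1 + math.exp(-x))

# Compare Phi(x / 1.6) with the logistic function on a grid over [-4, 4].
max_gap = max(abs(normal_cdf((x / 10) / 1.6) - logistic(x / 10))
              for x in range(-40, 41))
```

The maximum gap on this grid is below 0.02, which is why probit and logit fits are usually hard to tell apart in practice.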
